For the Linux version of Superluminal (a CPU profiler) we make heavy use of eBPF to capture performance data. This is the story about how an innocent profiling session led to a change to the Linux kernel that makes eBPF map-in-map updates much faster.
What is eBPF
eBPF (originally “extended Berkeley Packet Filter”, though now used as a standalone term) is a powerful system in the Linux kernel that allows you to safely run custom programs directly inside the kernel. These programs can be attached to various hooks in the kernel called tracepoints, kprobes, or perf events. You can think of an eBPF program as C code that executes whenever a specific kernel event occurs. An example of this is the sched_switch tracepoint, which triggers on every thread context switch.
Superluminal uses eBPF to collect performance data such as context switches and sampling events.
eBPF maps
Data exchange between a kernelspace eBPF program and the userspace controlling program (in our case, Superluminal) goes through eBPF “maps”. An eBPF map is a shared memory structure that acts as a bridge between kernel and userspace. Each map represents an underlying data structure; examples of map types are arrays, hash maps, ring buffers, and many more.
eBPF programs running in kernelspace can update maps to send data back to userspace. For example, Superluminal’s eBPF backend uses the ring buffer map type to output performance events (such as context switches and samples) from the eBPF program to userspace. The controlling program can also update maps from userspace to make data available for use in the kernelspace eBPF program.
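To make this concrete, here is a minimal sketch (not Superluminal’s actual code; the event layout, map size and names are assumptions) of an eBPF program that hooks the sched_switch tracepoint and pushes events to userspace through a BPF_MAP_TYPE_RINGBUF map, using standard libbpf conventions:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct cswitch_event {
    __u32 prev_pid;
    __u32 next_pid;
    __u64 timestamp_ns;
};

// Ring buffer map used to stream events from kernelspace to userspace.
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024); // buffer size in bytes (illustrative)
} events SEC(".maps");

SEC("tracepoint/sched/sched_switch")
int handle_sched_switch(struct trace_event_raw_sched_switch *ctx)
{
    struct cswitch_event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e)
        return 0; // buffer full: drop the event

    e->prev_pid = ctx->prev_pid;
    e->next_pid = ctx->next_pid;
    e->timestamp_ns = bpf_ktime_get_ns();
    bpf_ringbuf_submit(e, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";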
As explained in a previous article, Superluminal makes use of .eh_frame data in a binary to retrieve stack backtraces when sampling. Since sampling happens in kernelspace through an eBPF program as described above, we need to upload the .eh_frame data from userspace for each relevant binary so that the eBPF program can make use of it.
The .eh_frame data is stored in an eBPF map of type BPF_MAP_TYPE_ARRAY_OF_MAPS, which essentially represents a 2D array. In C++, you could express this as a std::vector<std::vector<UnwindRow>>, where there is one entry in the outer vector per unique binary loaded in the profiled process(es) and the inner vector holds the actual unwind data for that binary.
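In libbpf-style eBPF C, such a map-in-map could be declared roughly as follows. This is a hypothetical sketch: the row layout, sizes and names (unwind_row, unwind_data) are illustrative, not Superluminal’s actual definitions.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

// Illustrative row layout; the real unwind row format is Superluminal-specific.
struct unwind_row {
    __u64 address;
    __s32 cfa_offset;
    __s32 ra_offset;
};

// Inner map: the unwind rows for one binary (the "inner vector").
struct inner_unwind_map {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 1 << 16);
    __type(key, __u32);
    __type(value, struct unwind_row);
};

// Outer map: one slot per loaded binary (the "outer vector").
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
    __uint(max_entries, 4096); // max number of tracked binaries (illustrative)
    __type(key, __u32);
    __array(values, struct inner_unwind_map);
} unwind_data SEC(".maps");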
The process to go from a binary to unwind data being available for use in eBPF is as follows:
The unwind data is extracted from the .eh_frame section. This is described in the linked article, and is already very efficient.
The unwind data is converted to our internal format that’s highly optimized for speed & memory efficiency.
The converted unwind data is uploaded to eBPF through the bpf_map_update_elem userspace function, which inserts the unwind data for each unique binary into the outer array.
From there on, the eBPF programs can make use of the unwind data.
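As a hedged sketch of what that upload could look like with libbpf’s userspace API (function and variable names are assumptions, and struct unwind_row is the illustrative type from the sketch above), the key detail is that the value written into an ARRAY_OF_MAPS slot is the file descriptor of the inner map:

#include <bpf/bpf.h>

static int upload_unwind_data(int outer_map_fd, __u32 binary_index,
                              const struct unwind_row *rows, __u32 row_count)
{
    // Create the inner array holding this binary's unwind rows. Depending on how
    // the outer map was declared, the inner map's parameters may need to match
    // the declared inner template.
    int inner_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, "unwind_rows",
                                  sizeof(__u32), sizeof(struct unwind_row),
                                  row_count, NULL);
    if (inner_fd < 0)
        return inner_fd;

    for (__u32 i = 0; i < row_count; i++)
        bpf_map_update_elem(inner_fd, &i, &rows[i], BPF_ANY);

    // Insert the inner map into the outer ARRAY_OF_MAPS slot for this binary.
    // This is the call that hits the map-in-map synchronization discussed below.
    return bpf_map_update_elem(outer_map_fd, &binary_index, &inner_fd, BPF_ANY);
}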
Performance problems are never where you think they are
It is important that the unwind data is made available to eBPF as soon as possible, since the eBPF code won’t be able to unwind callstacks before the unwind data has been uploaded. To lower this latency as far as possible, we use various mechanisms, one of which is precaching the unwind data before profiling starts. This is done by enumerating the needed binaries (i.e. the main executable, and shared libraries it depends on) for each relevant process and then extracting, converting and uploading the unwind data for each binary to eBPF.
We saw in the previous article that the extract step took much longer than expected, which caused this precaching step to take much longer than we wanted. After optimizing that part, the precache step was much faster, but still much slower than we’d expected it to be.
Fortunately, we happen to be developing a CPU profiler, and what’s the point of that if you’re not going to use it? So let’s profile the profiler to see what’s going on.
A profile of this part of the capturing process looks like this:
If you’re not familiar with Superluminal, this is showing the wall-clock timeline for each thread in the process. A green color means the thread is executing code at that point; any other color means it’s waiting on something (e.g. a lock, IO, network, etc).
In this test, there are about 1400 binaries that need to be precached, and the profile shows that this step takes ~830ms end-to-end. The actual work of precaching is spread over the available CPUs using our job scheduler: a job is started for each binary, where each job does the extract/convert/upload for that binary, and then inserts the uploaded data into the outer map.
I’m testing on a machine with 32 logical CPUs, so while 830ms may seem like it’s not worth worrying about, it actually represents ~25 seconds of work spread across those 31 cores (32 minus 1 for the thread that starts the jobs). That feels like it’s way too long for what this is doing, especially with the optimizations we previously made to the unwind data extraction.
We would expect most time to be taken up by the conversion process, since that does the actual work, whereas the upload should just be copying memory from user to kernelspace, and the insert into the outer map should be very fast. But looking at the timeline for the various JobScheduler threads we see surprisingly little actual work happening (i.e. green colors), some minor blips here and there, and a whole lot of waiting (i.e. red colors) instead.
Expanding one of the threads that’s spending all its time waiting and zooming in a bit, we can see what it’s doing in detail:
This is very unexpected.
Just at a glance you can see that all time is being taken up by bpf_map_update_elem, highlighted in white. This function is responsible for inserting the unwind data in the outer eBPF map as described above. While there might reasonably be some overhead involved with copying data across the user/kernel boundary, this is excessive.
The function statistics show that there’s a total of 25 seconds in this function alone across all job scheduler threads, with each call taking ~18ms on average:
We can also see that when the thread is executing this function, it is in a wait state: the thread overview at the top of the thread shows the red color. This means the function is not actually doing any work: it’s waiting on something. By clicking on the corresponding wait state (i.e. one of the red areas), we can see the callstack that caused that thread to block. In this case the stack that caused the wait looks like this, with the relevant frames highlighted:
So it looks like the bpf_map_update_elem userspace function results in a map_update_elem syscall in the kernel, which calls synchronize_rcu_normal, which is what eventually causes the thread to switch out. This is where you’d normally reach the limit of what you can do with regard to optimization, since this is all happening in kernelspace.
Linux, however, is open source, which means we can dig into the kernel source to better understand what’s going on here.
Down the rabbit hole
Let’s look at map_update_elem first. This is the implementation of the syscall that bpf_map_update_elem eventually results in. Most of the function is not that interesting, just sanity checking inputs. The actual work the function is doing looks like this:
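Roughly paraphrased (a hedged sketch, not the verbatim kernel source; details differ between kernel versions), the tail of map_update_elem() in kernel/bpf/syscall.c boils down to:

	/* ...fd lookup, permission checks and key/value copying elided... */
	err = bpf_map_update_value(map, f.file, key, value, attr->flags);
	if (!err)
		maybe_wait_bpf_programs(map);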
The bpf_map_update_value function being called here is a helper function that actually updates the value for the specified key. We can see that there is no direct call to the synchronize_rcu_normal function we’re looking for, but we do see a call to maybe_wait_bpf_programs when bpf_map_update_value succeeds.
Let’s look at the code for it:
static void maybe_wait_bpf_programs(struct bpf_map *map)
{
	/* Wait for any running non-sleepable BPF programs to complete so that
	 * userspace, when we return to it, knows that all non-sleepable
	 * programs that could be running use the new map value. For sleepable
	 * BPF programs, synchronize_rcu_tasks_trace() should be used to wait
	 * for the completions of these programs, but considering the waiting
	 * time can be very long and userspace may think it will hang forever,
	 * so don't handle sleepable BPF programs now.
	 */
	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
		synchronize_rcu();
}
So we found our call to synchronize_rcu. There are a few things of note here. First of all, this call only happens when the map being updated is of type BPF_MAP_TYPE_HASH_OF_MAPS or BPF_MAP_TYPE_ARRAY_OF_MAPS. These map types are also known as “map-in-map” types. And it so happens that we’re indeed updating a map of type BPF_MAP_TYPE_ARRAY_OF_MAPS as described earlier.
It is very interesting that the call to synchronize_rcu is conditional on the type of the map being updated. If the call were unconditional, it would probably be there for a very good reason. But the fact that it’s conditional means that there are code paths where this expensive call isn’t needed (i.e. for regular map types), and so that might be an indication we could do something about this.
There is also a comment that explains what this code aims to achieve, though it’s hard to understand the comment without more knowledge of how eBPF works, and in particular how synchronization between userspace & kernelspace works when it comes to data structures like eBPF maps.
So let’s unpack that first.
Synchronization without waiting
As we described earlier, eBPF maps are used for bi-directional data exchange between kernel & userspace. Let’s assume we have an eBPF program that looks like this (pseudocode-ish):
// Equivalent to std::vector<std::vector<UnwindRow>> as described earlier
BPF_MAP_TYPE_ARRAY_OF_MAPS unwindData;

void ContextSwitchHandler()
{
    int key = 10; // some key uniquely identifying a particular binary

    // find the inner array for the key; equivalent to std::vector<UnwindRow>
    void* binaryUnwindData = bpf_map_lookup_elem(&unwindData, &key);

    // do something with binaryUnwindData, for example, unwind the stack
}
The question is: what would you expect to happen when the value for a key in a map (in this case 10) is updated from userspace (via bpf_map_update_elem), while there are still eBPF programs running in kernelspace that are using the “previous” value for that key (in this case binaryUnwindData)?
This kind of concurrent access to a shared data structure (in this case the eBPF map) requires some kind of synchronization between the reader (the eBPF program) and the writer (the userspace program) to prevent the reader from getting its data pulled out from under it. Without synchronization, you have the problem that when the value is updated and the old value is deleted, any readers of that old value may be left with a dangling pointer.
The way the eBPF system (and indeed, the kernel in general) deals with these kinds of synchronization issues is quite elegant.
The key insight is that the synchronization problem here isn’t that the value is updated; the problem is that the old value is deleted. Taking the example of our eBPF program above, this program could continue working with binaryUnwindData just fine, even if the value for key 10 in the map is replaced with a new value, as long as it’s guaranteed that the memory containing binaryUnwindData is not freed until after the eBPF program finishes executing.
The way the kernel makes this guarantee is in essence quite simple. Instead of deleting the old value immediately after an update, the deletion of the old value is queued on a special kernel thread. This kernel thread, typically called rcu_sched or rcu_preempt, waits for the system to reach a state where it is guaranteed that no readers are still accessing any old data. This state is called the “quiescent state”, and the time it takes for the system to reach this state is called the “grace period”. Once the system reaches this state, the kernel thread deletes any queued old values via their associated callback.
The Linux kernel calls this system the Read-Copy-Update, or RCU, system. The reality of how this system works is of course much more complicated than this (extremely) simplified description. For example, the way the kernel determines that the system has reached the quiescent state is quite involved.
The full details on how this system works are outside the scope of this article, but if you’re curious, see the official RCU documentation or this excellent article.
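As a generic illustration of the pattern described above (a hedged sketch in kernel style, not eBPF code and not taken from the kernel sources), a reader/writer pair using RCU might look like this:

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct config {
	int value;
	struct rcu_head rcu;
};

static struct config __rcu *current_config; /* assumed initialized elsewhere */

/* Reader side: runs concurrently with updates and never blocks. */
int read_config_value(void)
{
	int v;

	rcu_read_lock();
	v = rcu_dereference(current_config)->value;
	rcu_read_unlock();
	return v;
}

static void free_old_config(struct rcu_head *head)
{
	kfree(container_of(head, struct config, rcu));
}

/* Writer side: publish the new value, then defer freeing the old one until
 * after the grace period, when no reader can still hold a reference to it.
 * Updates are assumed to be serialized by the caller. */
void update_config(struct config *new_cfg)
{
	struct config *old = rcu_dereference_protected(current_config, 1);

	rcu_assign_pointer(current_config, new_cfg);
	if (old)
		call_rcu(&old->rcu, free_old_config); /* non-blocking deferred free */
	/* synchronize_rcu() here instead would block until the grace period ends. */
}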
An important observation about this system is that it’s non-blocking: since the deletion is deferred, the writer doesn’t have to wait for the deletion to complete. In our case, the writer is map_update_elem (via bpf_map_update_elem) and for non-map-in-map types it returns immediately after updating the value, while the kernel handles freeing the old value at some later point in time.
Armed with this knowledge we can attempt to understand the comment in maybe_wait_bpf_programs again. The relevant part of the comment is this, stripped of the parts that aren’t relevant to understanding this issue:
Wait for any running BPF programs to complete so that userspace, when we return to it, knows that all programs that could be running use the new map value
So what this code is trying to achieve is in some ways the opposite of what bpf_map_update_elem does for non-map-in-map types.
As we just saw, for the regular case, any eBPF programs that are running concurrently with bpf_map_update_elem will continue running with whatever value they retrieved from the map, while bpf_map_update_elem immediately returns to the caller after updating the map. There is therefore no guarantee which “version” of the value for the updated key is in use at any given point in time: it could be the old value, the new value, or a mix of the two.
However, per the comment, for map-in-map types it is apparently important to guarantee that after bpf_map_update_elem returns, the old value is no longer in use: any running eBPF programs should be using the new value. But, since it is not possible to “update” (i.e. patch) already-running eBPF programs to use the new value, there is only one way for bpf_map_update_elem to achieve that guarantee, and that is by waiting for the system to reach the quiescent state we described in the previous section.
That’s exactly what synchronize_rcu does: it blocks until the system reaches that state, turning the normally asynchronous bpf_map_update_elem into a blocking operation. It is essentially a global synchronization point.
That also explains the performance issue we’re seeing. The blocking wait for the system to reach the quiescent state can take an indeterminate amount of time, and is dependent on the state of the system. This can potentially take many milliseconds (we’ve measured 8-20ms across different systems), and we’re calling it across 31 threads.
What’s happening is that we read and convert the unwind data across our job scheduler threads. This runs in parallel and takes very little time, due to previously made optimizations. All jobs then attempt to upload the unwind data they just converted at approximately the same time, and they all hit this blocking wait in bpf_map_update_elem simultaneously. The blocking waits via synchronize_rcu then finish in sequence, which serializes the upload, making the upload step effectively single threaded. After that’s done, the process repeats.
But why
So that’s the what of the performance issue we’re seeing: we’re hitting an expensive synchronization point on every update. But to determine what (if anything) we can do about this, we also need to understand the why:
Why is this guarantee about the new value of the map important?
Why is it apparently only important for these two types of maps, and not the many other map types?
To answer these questions, let’s look at the commit that introduced this code:
The map-in-map frequently serves as a mechanism for atomic
snapshotting of state that a BPF program might record. The current
implementation is dangerous to use in this way, however, since
userspace has no way of knowing when all programs that might have
retrieved the “old” value of the map may have completed.
This change ensures that map update operations on map-in-map map types
always wait for all references to the old map to drop before returning
to userspace.
…that didn’t really help. Fortunately, development on the Linux kernel happens mostly in the open, and each patch has a corresponding mailing list discussion associated with it. In this case, that discussion can be found here. You can read it if you’re interested, but the summary is that this code was added to support the following scenario.
Let’s say you have an eBPF program that looks something like this (pseudocode):
// The statistics we're interested in tracking
enum EStatistics
{
    EStatistics_Duration,
    // ...
}

// Record various EStatistics for context switches.
// Equivalent to std::unordered_map<EStatistics, std::vector<uint64>>
BPF_MAP_TYPE_HASH_OF_MAPS recordedCSwitchStatistics;

void ContextSwitchHandler()
{
    __u64 start = bpf_ktime_get_ns();

    // ... perform potentially expensive work here ...

    __u64 duration = bpf_ktime_get_ns() - start;

    // find the inner array for the key; equivalent to std::vector<uint64>
    int key = EStatistics_Duration;
    void* durationStatistics = bpf_map_lookup_elem(&recordedCSwitchStatistics, &key);

    // add the duration of this event to the array; equivalent to timestampStatistics.push_back(duration)
    bpf_map_update_elem(durationStatistics, nextIndex++, duration);
}
So this is an eBPF program that runs on every context switch. It does some work to handle the context switch, and it wants to report how long it took back to userspace. To do so, there is a BPF_MAP_TYPE_HASH_OF_MAPS containing statistics. In this case there’s just EStatistics_Duration, but there could be others.
On every run of this program, it records the start & end timestamps of the work it’s doing to calculate the duration. Then it adds that duration to the statistics map. The inner map in this case is a list of all individual durations.
Now, the goal here is for the userspace controlling program to periodically read out the statistics that have been logged so far. Again in pseudocode, this could look like this:
void readStatisticsFromEBPF()
{
    // get the current inner array with the statistics
    int key = EStatistics_Duration;
    void* currentDurationStatistics = bpf_map_lookup_elem(&recordedCSwitchStatistics, &key);

    // do something with the statistics
}
The problem is that there’s now unsynchronized concurrent access to currentDurationStatistics: while userspace is reading the values from the map, the eBPF program can still be writing statistics to it. For this inner map type (BPF_MAP_TYPE_ARRAY), concurrent reads and writes aren’t automatically synchronized: it’s essentially shared memory without built-in locking. This is a race because userspace could read a partially updated array or read while eBPF is writing to it, leading to inconsistent data.
We can attempt to solve this by having two arrays: one that userspace is reading from, and one that eBPF is writing to, essentially double buffering:
void readStatisticsFromEBPF()
{
    // get the current inner array with the statistics
    int key = EStatistics_Duration;
    void* oldDurationStatistics = bpf_map_lookup_elem(&recordedCSwitchStatistics, &key);

    // replace (swap) the array in the map with a new one so that eBPF starts writing to that one
    void* newDurationStatistics = create_array(1024);
    bpf_map_update_elem(&recordedCSwitchStatistics, &key, newDurationStatistics);

    // do something with the statistics
}
This almost works, but the problem is that bpf_map_update_elem is not atomic: as we saw before, it updates the value for the key (in this case EStatistics_Duration) and then returns before all readers have finished. This means that after it returns, there may still be eBPF programs running that are making use of oldDurationStatistics.
So this is still a race, and it is this race that the commit fixes: with the added synchronize_rcu call, bpf_map_update_elem is atomic for map-in-map types. After it returns, it is guaranteed that the old value of the key (in this case oldDurationStatistics) is no longer in use by any eBPF programs, and it is thus safe to do whatever you want with it.
Reading the discussion, before ending up at the final commit, the patch went through several iterations.
It started out as a new BPF_SYNCHRONIZE_MAP_TO_MAP_REFERENCES command (syscall) in eBPF that could be issued from userspace as an explicit synchronization point where needed. The maintainers felt that this was exposing too many eBPF implementation details to userspace, and that it would be hard for users to understand exactly what the new command does and when it should be used.
Instead, they suggested just always doing this sync in bpf_map_update_elem for map-in-map types:
I believe the only issue being discussed is user space doesn’t know
when it’s ok to start draining the inner map when it was replaced
by bpf_map_update syscall command with another map, right?
If we agree on that, should bpf_map_update handle it then?
Wouldn’t it be much easier to understand and use from user pov?
The original submitter responded that it didn’t seem right to force this synchronization on all users, given the relatively niche use case:
Maybe with a new BPF_SYNCHRONIZE flag for BPF_MAP_UPDATE_ELEM and
BPF_MAP_DELETE_ELEM. Otherwise, it seems wrong to make every user of
these commands pay for synchronization that only a few will need.
The maintainers still felt that it would be a good idea, as the cost of this was anticipated to be small:
I don’t think extra flag is needed. Extra sync_rcu() for map-in-map
is useful for all users. I would consider it a bugfix,
since users that examine deleted map have this race today
and removing the race is always a good thing especially since the cost
is small.
As we’ve seen, however, the cost of this is far from small, but that’s hindsight for you.
Optimizing it
Now that we thoroughly understand the code and problem, we can start thinking about ways to resolve it. Let’s consider our options, starting from the most direct approach.
The most obvious fix would be to remove this sync point from bpf_map_update_elem for map-in-map types and to change it to be an optional sync via an opt-in flag instead, as originally suggested on the mailing list. Unfortunately, this behavior has been in the kernel since 2018. That makes it impossible to change, since any modifications might break existing programs that (perhaps unknowingly) depend on this behavior [1], and as we all know “WE DO NOT BREAK USERSPACE” [2]. So that’s not a real option.
The next most obvious fix would be to make use of batched eBPF map updates. Right now, the problem is that we’re uploading the unwind data for each binary individually using separate bpf_map_update_elem calls, which means we’re hitting this sync point for each upload. The eBPF API also has a bpf_map_update_batch function (available since kernel 5.6), which can update multiple elements at once. Using this function would mean the sync point is hit only once per batch.
For the precache step this would be a perfect fit. We know up front how many binaries we need to upload, so we can relatively simply divide them into batches, which are then all uploaded at the same time. This might still hit the sync point across multiple threads as before, but due to the batching, the number of sync points is much lower. For example, with a batch size of 100, we would only hit the sync point 14 times instead of once per job. That would be a massive improvement.
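A hedged sketch of what such a batched update could look like with libbpf (the helper and names are illustrative, not our actual code; for a map-in-map, the values are inner map file descriptors):

#include <bpf/bpf.h>

static int upload_batch(int outer_map_fd, const __u32 *keys,
                        const int *inner_map_fds, __u32 count)
{
    LIBBPF_OPTS(bpf_map_batch_opts, opts); // default elem_flags / flags

    // A single syscall updates 'count' elements, so the map-in-map sync point
    // is hit once per batch instead of once per element.
    return bpf_map_update_batch(outer_map_fd, keys, inner_map_fds, &count, &opts);
}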
That being said, the precache step is not the only time where we upload unwind data to eBPF. When a program is running, it might load in (many) additional shared libraries. For example, some applications we’ve tested against dynamically load hundreds of shared libraries at startup. When a shared library is loaded, we also need to upload the corresponding unwind data.
In that case we don’t want to batch uploads, because that increases the latency between the time a library is loaded and the time its unwind data is made available to eBPF for unwinding. This means that when the rate of shared library loading is high, you would still run into this perf issue. We needed a more general solution, so let’s see what other options there are.
Opting out
As we saw, in the original discussion on the mailing list, it was suggested that this explicit sync point should be a flag instead of the default behavior. The patch went the other way, but now that it’s the default, we can also consider adding an opt-out flag to the eBPF API to disable this behavior for cases (like ours) where you know that this is not the behavior you want.
Adding such an opt-out flag is exactly what we suggested on the eBPF kernel mailing list. The discussion around this was productive, initially leaning towards acceptance. But then somebody asked whether modifying the kernel to use synchronize_rcu_expedited instead of synchronize_rcu in this case made any difference to performance.
We weren’t aware of that function beforehand, but reading up on it, synchronize_rcu_expedited is a version of synchronize_rcu that’s supposed to reach the quiescent state of the system much faster. It was a good suggestion to at least try out, since it would be a less invasive change than adding an entirely new userspace flag. If it worked, it would mean the performance of bpf_map_update_elem would just transparently improve for all users, without anyone needing to be aware of a new flag.
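Conceptually, the tested change amounts to something like the following (a hedged sketch, not the verbatim patch):

static void maybe_wait_bpf_programs(struct bpf_map *map)
{
	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
		synchronize_rcu_expedited(); /* was: synchronize_rcu() */
}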
This required compiling our own kernel, which took some doing, but we were able to test this change when we got that running. Did it make a difference? See for yourself, and note that this screenshot is taken at the same zoom level as the original:
It makes a huge difference. The precache step now takes a total of ~26ms instead of the ~830ms it previously did, or 31x faster. Looking at the function statistics for bpf_map_update_elem shows that the average time in this function is now 59 microseconds instead of the 18ms it was before, or 305x faster, for a total time of 80ms across the same 31 threads. That is much more reasonable compared to where we started.
While adding an opt-out flag would get this down even further, at this point we felt it was not worth adding that flag anymore, given the other concerns around exposing a new userspace flag.
Why wasn’t this found before
It’s interesting to think about why this bottleneck wasn’t found before, given that this code was introduced in 2018.
When you read articles about profiling on Linux, you’ll often encounter the terms “on-cpu” vs “off-cpu” profiling. On-cpu analysis involves figuring out what code that’s actually running is doing, and is typically what a sampling profiler does. Off-cpu analysis, in contrast, is about figuring out what threads that aren’t currently running are doing, i.e. investigating what they’re waiting on (a lock, network, etc).

These two kinds of analyses are often described as things you look at separately, with “on-cpu” being seen as the thing you look at primarily, and “off-cpu” as something you look at occasionally when you need to. This is reflected in the defaults of tools such as perf: when you record using a default command line such as perf record -o ./perf.data --call-graph dwarf --all-cpus, only sampling data (i.e. “on-cpu”) will be recorded. It is possible to perform off-cpu analysis with perf, but it requires being aware of the difference, and of the specific command line arguments that are needed to enable it.
In contrast, in Superluminal we take the view that the distinction between the two is irrelevant: when profiling, you’re always interested in where your time is going. It doesn’t matter whether your program is spending its time actively executing code (on-cpu) or waiting for something (off-cpu). Both contribute to the total time taken by your program, and in today’s multi-core world, off-cpu analysis is as important as on-cpu analysis for understanding the performance of software. We therefore always collect both on-cpu and off-cpu data by default to give you the complete picture.
This article hopefully demonstrates why: the bottleneck we found went undetected for 8 years because most performance analysis on Linux is done using purely sampling profilers. In a sampling profiler this bottleneck is invisible, because the root problem is that bpf_map_update_elem enters a wait state via synchronize_rcu and is not executing any code. As a test, now that we know what the issue is, we tried using perf in sampling-only mode to find the same bottleneck, and as expected, perf reported bpf_map_update_elem as taking almost no time at all.
An instrumenting profiler would have done slightly better: even if you’d thought to mark up bpf_map_update_elem, which you most likely wouldn’t have, with instrumentation you’d at least be able to see that the function had high wall-clock time. But it wouldn’t be able to give you any information about why the function takes a long time, since you can only instrument your own code, and not the kernel itself.
Because Superluminal shows both sampling and wait information on a wall-clock timeline with full kernel visibility, however, the problem was immediately obvious, which allowed us to find and fix it.
Wrapping up
What started out as a regular profiling session of our own code ended up as a trip down the kernel rabbit hole, where we discovered and fixed an 8-year-old bottleneck affecting all eBPF map-in-map users. bpf_map_update_elem is now much faster for these map types, resulting in a 31x speedup of capture startup time on our end.
We submitted a patch with this change, which was accepted and will be shipped in the Linux 6.19 kernel update. If you’re using BPF_MAP_TYPE_ARRAY_OF_MAPS or BPF_MAP_TYPE_HASH_OF_MAPS in eBPF, your program will transparently get much faster from 6.19.
So! I guess we’re kernel contributors now.
foreshadowing
[1] Hyrum’s law: with a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.
[2] This is from the kernel’s point of view. On Linux, the job of breaking userspace is left to glibc instead, which is more than happy to do so. But that’s another story.
Some surprising things about DuckDuckGo you probably don't know
We have hundreds of easter-egg logos (featuring our friendly mascot Dax Brown) that surface when you make certain queries on our search engine.
Our subreddit is trying to catch ‘em all. They’ve certainly caught a lot, currently 504, but we keep adding more so it’s a moving target. The total as of this post is 594. I’m the one personally adding them in my spare time just for fun and I recently did a Duck Tales episode (our new podcast) with more details on the process. This incarnation of specialty logos is relatively new, so if you are a long-term user and haven’t noticed them, that’s probably why (aside from of course that you’d have to search one of these queries and notice the subtle change in logo). And, no promises, but I am taking requests.
There is a rumor continuously circulating that we’re owned by Google, which of course couldn’t be farther from the truth. I was actually a witness in the U.S. v. Google trial for the DOJ. I think this rumor started because Google used to own the domain duck.com and was pointing it at Google search for several years. After my public and private complaining for those same years, in 2018 we finally convinced Google to give us the duck.com domain, which we now use for our email protection service, but the rumor still persists.
We’ve been an independent company since our founding in 2008 and been working on our own search indexes for as many years. For over fifteen years now (that whole time) we’ve been doing our own knowledge graph index (like answers from Wikipedia), over ten years for local and other instant-answer indexes (like businesses), and in the past few years we’ve been ramping up our wider web index to support our Search Assist and Duck.ai features. DuckDuckGo began with me crawling the web in my basement, and in the early days, the FBI actually showed up at my front door since I had crawled one of their honeypots.
The plurality of our search traffic now comes from our own browsers. Yes, we have our own browsers with our search engine built in along with a ton of other protections. How do they compare to other popular browsers and extensions, you ask? We made a comparison page so you can see the differences. Our mobile browsers on iOS & Android launched back in 2018 (wow, that’s seven years ago), and our desktop browsers on Mac and Windows in 2022/23. Our iOS browser market share continues to climb and we’re now #3 in the U.S. (behind Safari and Chrome) and #4 on Android (behind Chrome, Samsung, and Firefox). People appreciate all the protections and the front-and-center (now customizable) fire button that quickly clears tabs and data in an (also customizable) animation of fire.
About 13% of U.S. adults self-report as a “current user” of DuckDuckGo. That’s way more than most people think. Our search market share is lower since all of those users don’t use us on all of their devices, especially on Android where Google makes it especially hard. Once you realize that, it is less surprising that we have the highest search market share on Mac at about 4% in the U.S., followed by iOS at about 3%. I’m talking about the U.S. here since about 44% of our searches are from the U.S., and no other country is double digits, but rounding out the top ten countries are Germany, the United Kingdom, France, Canada, India, the Netherlands, Indonesia, Australia, and Japan.
Our approach to AI differs from most other companies trying to shove it down your throat in that we are dedicated to making all AI features private, useful, and optional. If you like AI, we offer private AI search answers at duckduckgo.com and private chat at duck.ai, which are built into our browsers. If you don’t like or don’t want AI, that’s cool with us too. You can easily turn all of these features off. In fact, we made a noai.duckduckgo.com search domain that automatically sets those settings for you, including a recent setting we added that allows you to hide many AI-generated images within image search. Another related thing you might find surprising is that search traffic has continued to grow steadily even since the rise of ChatGPT (with Duck.ai traffic growing even faster).
Speaking of lots of countries, our team has been completely distributed from the beginning, now at over 300 people across about 30 countries, with less than half in the U.S. And we’re still hiring. We have a unique work culture that, among other things, avoids standing meetings on Wednesdays and Thursdays. We get the whole company together for a week once a year.
We played a critical role in the Global Privacy Control standard and the creation of search preference menus. I have a graduate degree in Technology and Public Policy and so we’ve done more of this kind of thing than one might expect, even going so far as to draft our own Do Not Track legislation before we got GPC going. We also donate yearly to like-minded organizations (here’s our 2025 announcement), with our cumulative donations now at over $8 million. Check our donations page for details going back to 2011. We can do this since we’ve been profitable for about that long, and more recently have even started investing in related startups as well.
If this hodge-podge of stuff makes you think of anything, please let me know. I’m not only taking requests for easter-egg logo ideas, but also for stuff to write about.
Loved reading through GReg TeChnoLogY’s “Anthony Bourdain’s Lost Li.st’s”, and seeing the list of lost Anthony Bourdain li.st’s made me wonder whether we could recover at least some of them.
Having worked in the security and crawling space for the majority of my career—though I have neither access to nor permission to use any proprietary storage—I thought we might be able to find something in publicly available crawl archives.
Common Crawl
If the Internet Archive had the partial list that Greg published, what about Common Crawl? Reading through their documentation, it seems straightforward enough to query the index for the URL prefix of Tony’s lists and grep for any sub-paths.
Putting something together with the help of Claude to prove my theory, we have commoncrawl_search.py, which makes a single index request to a specific dataset and, if any hits are discovered, retrieves them from the public S3 bucket. Since they are small, straight-up HTML documents, this seemed even more feasible than I had initially thought.
Simply have a Python version around 3.14.2 and install the dependencies from requirements.txt. Run the command below and we are in business. Below, you’ll find the command I ran and then some manual archaeological effort to prettify the findings.
NOTE: Images have been lost. Other avenues struck no luck. I’ll try again later.
Any and all emphasis, missing punctuation, and cool grammar are by Anthony Bourdain. The only modifications I have made are to the layout, to represent li.st as closely as possible with no changes to the content.
NOTE: If you see these blocks, that’s me commenting on whether pictures have been lost.
Recovering what we lost
From Greg’s page, let’s go and try each entry one by one. I’ll put up a table of what I wasn’t able to find in Common Crawl but would assume exists elsewhere—I’d be happy to take another look. And no, none of the above has been written by AI, only the code, since I don’t really care about warcio encoding or writing the same Python requests method for the Nth time. Enjoy!
Things I No Longer Have Time or Patience For
Cocaine
True Detective
Scripps Howard
Dinners where it takes the waiter longer to describe my food than it takes me to eat it.
Beer nerds
Nice Views
I admit it: my life doesn’t suck. Some recent views I’ve enjoyed
Montana at sunset : There’s pheasant cooking behind the camera somewhere. To the best of my recollection some very nice bourbon. And it IS a big sky .
Puerto Rico: Thank you Jose Andres for inviting me to this beautiful beach!
Naxos: drinking ouzo and looking at this. Not a bad day at the office .
LA: My chosen final resting place . Exact coordinates .
Istanbul: raki and grilled lamb and this ..
Borneo: The air is thick with hints of durian, sambal, coconut..
Chicago: up early to go train #Redzovic
If I Were Trapped on a Desert Island With Only Three Tv Series
The Wire
Tinker, Tailor, Soldier, Spy (and its sequel : Smiley’s People)
Edge of Darkness (with Bob Peck and Joe Don Baker )
The Film Nobody Ever Made
Dreamcasting across time with the living and the dead, this untitled, yet to be written masterwork of cinema, shot, no doubt, by Christopher Doyle, lives only in my imagination.
This guy
And this guy
All great films need:
The Oscar goes to..
And
NOTE: Sorry, each item had a picture attached, they’re gone.
I Want Them Back
If you bought these vinyls from an emaciated looking dude with an eager, somewhat distracted expression on his face somewhere on upper Broadway sometime in the mid 80’s, that was me . I’d like them back. In a sentimental mood.
NOTE: There were 11 images here.
Objects of Desire
material things I feel a strange, possibly unnatural attraction to and will buy (if I can) if I stumble across them in my travels. I am not a paid spokesperson for any of this stuff .
Vintage Persol sunglasses : This is pretty obvious. I wear them a lot. I collect them when I can. Even my production team have taken to wearing them.
19th century trepanning instruments: I don’t know what explains my fascination with these devices, designed to drill drain-sized holes into the skull often for purposes of relieving "pressure" or "bad humours". But I can’t get enough of them. Tip: don’t get a prolonged headache around me and ask if I have anything for it. I do.
Montagnard bracelets: I only have one of these but the few that find their way onto the market have so much history. Often given to the indigenous mountain people ’s Special Forces advisors during the very early days of America’s involvement in Vietnam .
Jiu Jitsi Gi’s: Yeah. When it comes to high end BJJ wear, I am a total whore. You know those people who collect limited edition Nikes ? I’m like that but with Shoyoroll . In my defense, I don’t keep them in plastic bags in a display case. I wear that shit.
Voiture: You know those old school, silver plated (or solid silver) blimp like carts they roll out into the dining room to carve and serve your roast? No. Probably not. So few places do that anymore. House of Prime Rib does it. Danny Bowein does it at Mission Chinese. I don’t have one of these. And I likely never will. But I can dream.
Kramer knives: I don’t own one. I can’t afford one . And I’d likely have to wait for years even if I could afford one. There’s a long waiting list for these individually hand crafted beauties. But I want one. Badly. http://www.kramerknives.com/gallery/
R. CRUMB : All of it. The collected works. These Taschen volumes to start. I wanted to draw brilliant, beautiful, filthy comix like Crumb until I was 13 or 14 and it became clear that I just didn’t have that kind of talent. As a responsible father of an 8 year old girl, I just can’t have this stuff in the house. Too dark, hateful, twisted. Sigh...
THE MAGNIFICENT AMBERSONS : THE UNCUT, ORIGINAL ORSON WELLES VERSION: It doesn’t exist. Which is why I want it. The Holy Grail for film nerds, Welles’ follow up to CITIZEN KANE shoulda, coulda been an even greater masterpiece . But the studio butchered it and re-shot a bullshit ending. I want the original. I also want a magical pony.
NOTE: Each bulleted point had an image too.
Four Spy Novels by Real Spies and One Not by a Spy
I like good spy novels. I prefer them to be realistic . I prefer them to be written by real spies. If the main character carries a gun, I’m already losing interest. Spy novels should be about betrayal.
Ashenden–Somerset Maugham
Somerset wrote this bleak, darkly funny, deeply cynical novel in the early part of the 20th century. It was apparently close enough to the reality of his espionage career that MI6 insisted on major excisions. Remarkably ahead of its time in its atmosphere of futility and betrayal.
The Man Who Lost the War–WT Tyler
WT Tyler is a pseudonym for a former "foreign service" officer who could really really write. This one takes place in post-war Berlin and elsewhere and was, in my opinion, wildly under appreciated. See also his Ants of God.
The Human Factor–Graham Greene
Was Greene thinking of his old colleague Kim Philby when he wrote this? Maybe. Probably. See also Our Man In Havana.
The Tears of Autumn -Charles McCarry
A clever take on the JFK assassination with a Vietnamese angle. See also The Miernik Dossier and The Last Supper
Agents of Innocence–David Ignatius
Ignatius is a journalist not a spook, but this one, set in Beirut, hewed all too closely to still not officially acknowledged events. Great stuff.
Hotel Slut (That’s Me)
I wake up in a lot of hotels, so I am fiercely loyal to the ones I love. A hotel where I know immediately wher I am when I open my eyes in the morning is a rare joy. Here are some of my favorites
CHATEAU MARMONT ( LA) : if I have to die in a hotel room, let it be here. I will work in LA just to stay at the Chateau.
CHILTERN FIREHOUSE (London): Same owner as the Chateau. An amazing Victorian firehouse turned hotel. Pretty much perfection
THE RALEIGH (Miami): The pool. The pool!
LE CONTINENTAL (Saigon): For the history.
HOTEL OLOFSSON (Port au Prince): Sagging, creaky and leaky but awesome .
PARK HYATT (Tokyo): Because I’m a film geek.
EDGEWATER INN (Seattle): kind of a lumber theme going on...ships slide right by your window. And the Led Zep "Mudshark incident".
THE METROPOLE (Hanoi): there’s a theme developing: if Graham Greene stayed at a hotel, chances are I will too.
GRAND HOTEL D'ANGKOR (Siem Reap): I’m a sucker for grand, colonial era hotels in Asia.
THE MURRAY (Livingston,Montana): You want the Peckinpah suite
Steaming Hot Porn
from my phone
Bun Bo Hue
Kuching Laksa
Pot au Feu
Jamon
Linguine
Meat
Dessert
Light Lunch
Meat on a Stick
Oily Little Fish
Snack
Soup
Homage
NOTE: Pictures in each have not been recovered.
5 Photos on My Phone, Chosen at Random
Not TOO random
Madeline
Beirut
Musubi
BudaeJiggae
Dinner
NOTE: Shame, indeed, no pictures, there was one for each.
People I’d Like to Be for a Day
Bootsy Collins
Bill Murray
I’m Hungry and Would Be Very Happy to Eat Any of This Right Now
Spaghetti a la bottarga . I would really, really like some of this. Al dente, lots of chili flakes
A big, greasy double cheeseburger. No lettuce. No tomato. Potato bun.
A street fair sausage and pepper hero would be nice. Though shitting like a mink is an inevitable and near immediate outcome
Some uni. Fuck it. I’ll smear it on an English muffin at this point.
I wonder if that cheese is still good?
Observations From a Beach
In which my Greek idyll is Suddenly invaded by professional nudists
Endemic FUPA. Apparently a prerequisite for joining this outfit.
Pistachio dick
70’s bush
T-shirt and no pants. Leading one to the obvious question : why bother?
Guilty Pleasures
Popeye’s Mac and Cheese
The cheesy crust on the side of the bowl of Onion Soup Gratinee
Macaroons . Not macarons . Macaroons
Captain Crunch
Double Double Animal Style
Spam Musubi
Aerosmith
Some New York Sandwiches
Before he died, Warren Zevon dropped this wisdom bomb: "Enjoy every sandwich". These are a few locals I’ve particularly enjoyed:
PASTRAMI QUEEN: (1125 Lexington Ave. ) Pastrami Sandwich. Also the turkey with Russian dressing is not bad. Also the brisket.
EISENBERG'S SANDWICH SHOP: ( 174 5th Ave.) Tuna salad on white with lettuce. I’d suggest drinking a lime Rickey or an Arnold Palmer with that.
THE JOHN DORY OYSTER BAR: (1196 Broadway) the Carta di Musica with Bottarga and Chili is amazing. Is it a sandwich? Yes. Yes it is.
RANDOM STREET FAIRS: (Anywhere tube socks and stale spices are sold. ) New York street fairs suck. The same dreary vendors, same bad food. But those nasty sausage and pepper hero sandwiches are a siren song, luring me, always towards the rocks. Shitting like a mink almost immediately after is guaranteed but who cares?
BARNEY GREENGRASS : ( 541 Amsterdam Ave.) Chopped Liver on rye. The best chopped liver in NYC.
Great Dead Bars of New York
A work in progress
SIBERIA in any of its iterations. The one on the subway being the best
LADY ANNES FULL MOON SALOON a bar so nasty I’d bring out of town visitors there just to scare them
THE LION'S HEAD old school newspaper hang out
KELLY'S on 43rd and Lex. Notable for 25 cent drafts and regularly and reliably serving me when I was 15
THE TERMINAL BAR legendary dive across from port authority
BILLY'S TOPLESS (later, Billy’s Stopless) an atmospheric, working class place, perfect for late afternoon drinking where nobody hustled you for money and everybody knew everybody. Great all-hair metal jukebox . Naked breasts were not really the point.
THE BAR AT HAWAII KAI. tucked away in a giant tiki themed nightclub in Times Square with a midget doorman and a floor show. Best place to drop acid EVER.
THE NURSERY after hours bar decorated like a pediatrician’s office. Only the nursery rhyme characters were punk rockers of the day.
Lost page
It was surprising to see that only one page was not recoverable from Common Crawl.
What’s next?
I’ve enjoyed this little archaeology project tremendously. Can we declare victory on at least this endeavor? Hopefully we’ll be able to find the images too, but that’s a little tougher, since that era’s CloudFront is fully gone.
What else can we work on restoring, and can we set up some sort of public archive to store it? I made this a git repository for the sole purpose that anyone interested can contribute their interest and passion for these kinds of projects.
Thank you and until next time!
◼︎
Workday project at Washington University hits $266M
The total cost of a Workday implementation project at Washington University in St. Louis is set to hit almost $266 million, it was revealed after the project was the subject of protests from students.
In late October, students demonstrated outside the Faculty Senate demanding the University’s leadership reveal more details about its finances, including its spending on Workday, amid concerns about job losses at the institution.
In an email to Student Life, the institution’s independent student newspaper, David Gray, executive vice chancellor for finance and chief financial officer (CFO), said the total cost of the project was set to reach upwards of $265 million over at least seven years, roughly $16,000 per student.
The student newspaper said the Workday project was broken down into $81 million for financial and human resources services (HCM), $98.9 million for the student application called Sunrise, and $56.5 million for planning, data integration, and financial aid. Meanwhile $23.8 million in the 2026 financial year is for support and $5.7 million for annual licensing.
The project started with HCM in 2018, which went live in 2021. The student application started planning in 2020 and went live in 2024 and 2025.
“The legacy student information system was in its last phase of life. It was a 1990s era set of fragile, homegrown applications including WebSTAC, WebFAC, SIS Admin and other platforms. With the transition, the University replaced nearly 80 separate student systems with Workday,” Gray told the newspaper.
We contacted both the University and Workday for comment and will update this article if we hear back.
Washington University in St. Louis is a private research university in Missouri. It is not to be confused with the University of Washington, a public university in Washington State.
Coincidentally, the latter has also implemented Workday in a project which similarly attracted criticism. In March last year, hundreds of research grants were stuck in processing limbo as the institution grappled with the $340 million implementation.
The US West Coast university spent more than five years shifting to a centralized cloud-based SaaS finance and HR system. At the time, it said it had made significant progress with its workstreams, but there was still more to do.
In late 2024, Workday CEO Carl Eschenbach told The Register that more than 90 percent of the SaaS HR and finance application vendor's rollouts were a success, putting aside the company's high-profile difficulties in Maine and Iowa state-level projects. ®
Purdue University Approves New AI Requirement for All Undergrads
As part of its larger AI strategy, Purdue University will require all undergraduates to demonstrate basic AI competency, beginning next year.
Purdue University will begin requiring that all of its undergraduate students demonstrate basic competency in artificial intelligence starting with freshmen who enter the university in 2026.
The new “AI working competency” graduation requirement was approved by the university’s Board of Trustees at its meeting on December 12. It’s part of a broader AI@Purdue strategy that spans five areas: Learning with AI, Learning about AI, Research AI, Using AI and Partnering in AI.
“The reach and pace of AI’s impact to society, including many dimensions of higher education, means that we at Purdue must lean in and lean forward and do so across different functions at the university,” said Purdue President Mung Chiang in a news release. “AI@Purdue strategic actions are part of the Purdue Computes strategic initiative, and will continue to be refreshed to advance the missions and impact of our university.”
The requirement will be embedded into every undergraduate program at Purdue, but it won’t be done in a “one-size-fits-all” manner. Instead, the Board is delegating authority to the provost, who will work with the deans of all the academic colleges to develop discipline-specific criteria and proficiency standards for the new campus-wide requirement. Chiang said students will have to demonstrate a working competence through projects that are tailored to the goals of individual programs. The intent is to not require students to take more credit hours, but to integrate the new AI expectation into existing academic requirements.
Although the requirement doesn’t officially kick in until next fall, some of the underlying educational resources and innovations will be made available to currently enrolled students as soon as the spring semester.
While the news release claimed that Purdue may be the first school to establish such a requirement, at least one other university has introduced its own institution-wide expectation that all its graduates acquire basic AI skills. Earlier this year, The Ohio State University launched an AI Fluency initiative, infusing basic AI education into core undergraduate requirements and majors, with the goal of helping students understand and use AI tools, no matter their major.
Purdue wants its new initiative to help graduates:
Understand and use the latest AI tools effectively in their chosen fields, including being able to identify the key strengths and limits of AI technologies;
Recognize and communicate clearly about AI, including developing and defending decisions informed by AI, as well as recognizing the influence and consequences of AI in decision-making;
Adapt to and work with future AI developments effectively.
Purdue Provost Patrick Wolfe said that it was “absolutely imperative that a requirement like this is well informed by continual input from industry partners and employers more broadly,” and therefore he has “asked that each of our academic colleges establishes a standing industry advisory board focusing on employers’ AI competency needs and that these boards are used to help ensure a continual, annual refresh of our AI curriculum and requirements to ensure that we keep our discipline-specific criteria continually current.”
Purdue already has BA and BS degree programs in AI, and it offers a Masters of Science in Artificial Intelligence as well. Recently, it has taken major steps to develop its AI research capacity in areas such as agriculture and food systems, manufacturing, transportation and logistics, and health sciences, and it has equipped faculty and staff with additional AI resources like Microsoft 365 Copilot.
In November, Purdue and Google announced plans to strengthen their educational and research partnership, and the university has collaborated with Apple to launch a Spatial Computing Hub on campus. You can learn more about Purdue’s overall AI resources and strategy here.
As nearly every business sector adopts artificial intelligence into its core operations, creating a growing demand for workers with basic AI skills, look for more colleges and universities to place a new emphasis on how best to educate students about artificial intelligence tools. New AI majors and minors are being introduced, interdisciplinary AI centers are being formed, and faculty and students are using AI tools to advance research in a wide range of fields.
Not too long ago, colleges’ main concern about AI was how to prevent students from using it to cheat on assignments, short-changing their learning in the process. Now, that apprehension is being replaced by a new priority — preparing students for the demands of a workforce rapidly being transformed by artificial intelligence technologies.
Lucas de Groot, Designer of Calibri, on the State Department’s Switch Back to Times New Roman
From the LucasFonts account, in a comment on Hacker News:
Our studio, LucasFonts, designed Calibri. Here are our CEO Luc(as) de Groot’s thoughts on the matter:
The decision to abandon Calibri on the grounds of it being a so-called “wasteful diversity font” is both amusing and regrettable. Calibri was specifically designed to enhance readability on modern computer screens and was selected by Microsoft in 2007 to replace Times New Roman as the default font in the Office suite. There were sound reasons for moving away from Times: Calibri performs exceptionally well at small sizes and on standard office monitors, whereas serif fonts like Times New Roman tend to appear more distorted. While serif fonts are well-suited to high-resolution displays, such as those found on modern smartphones, on typical office screens the serifs introduce unnecessary visual noise and can be particularly problematic for users with impaired vision, such as older adults.
Professional typography can be achieved with both serif and sans-serif fonts. However, Times New Roman—a typeface older than the current president—presents unique challenges. Originally crafted in Great Britain for newspaper printing, Times was optimised for paper, with each letterform meticulously cut and tested for specific sizes. In the digital era, larger size drawings were repurposed as models, resulting in a typeface that appears too thin and sharp when printed at high quality.
Serif fonts are often perceived as more traditional, but they are also more demanding to use effectively. While a skilled typographer can, in theory, produce excellent results with Times, using it in its default digital form is not considered professional practice.
Calibri, by contrast, incorporates extensive spacing adjustments and language-specific refinements. The digital version of Times New Roman, developed in the early days of computing, offers only minimal kerning and letter-pair adjustments. This is especially evident in words set in all capitals—such as “CHICAGO”—where the spacing is inconsistent: the letters “HIC” are tightly packed, while “CAG” are spaced too far apart. Microsoft cannot rectify these issues without altering the appearance of existing documents.
There comes a time in every software engineer’s life when they come up with a
new binary-to-decimal floating-point conversion method. I guess my time has
come. I just wrote one, mostly over a weekend:
https://github.com/vitaut/zmij
.
It incorporates lessons learned from implementing
Dragon4
,
Grisu
and
Schubfach
along with a few new ideas
from myself and others. The main guiding principle is Alexandrescu’s
“no work is less work than some work” so a number of improvements come from
removing things from Schubfach (conditional branches, computations and even
candidate numbers).
~59 times (not percent!) faster than
sprintf
on macOS (Dragon4?)
Converting a single double takes about 10 to 20 ns on Apple M1.
What are the improvements?
Here is a list of improvements compared to Schubfach:
Selection from 1-3 candidates instead of 2-4
Fewer integer multiplications in the shorter case
Faster logarithm approximations
Faster division and modulo
Fewer conditional branches
More efficient significand and exponent output
Let’s take a look at some of them.
The first small improvement is having a single branch to quickly check for
special cases: NaN, infinity, zero or subnormals. There are still additional
checks within that path but the common case is more streamlined.
Another improvement is using faster fixed-point logarithm approximations.
Schubfach does the following:
// log10_2_sig = round(log10(2) * 2**log10_2_exp)
constexpr int64_t log10_2_sig = 661'971'961'083;
constexpr int log10_2_exp = 41;

// Computes floor(log10(pow(2, e))) for e <= 5456721.
auto floor_log10_pow2(int e) noexcept -> int {
  return e * log10_2_sig >> log10_2_exp;
}
Dragonbox also uses 32-bit approximations with slightly different constants.
Similarly, we can replace some integer divisions with integer multiplications.
Compilers already know how to do this, but we can do better when we know that
the range of inputs is small:
// Returns {value / 100, value % 100} correct for values of up to 4 digits.
inline auto divmod100(uint32_t value) noexcept -> divmod_result {
  assert(value < 10'000);
  constexpr int exp = 19;  // 19 is faster or equal to 12 even for 3 digits.
  constexpr int sig = (1 << exp) / 100 + 1;
  uint32_t div = (value * sig) >> exp;  // value / 100
  return {div, value - div * 100};
}
Another optimization and simplification is branchless handling of
irregular rounding intervals. I wrote about rounding intervals in my
earlier
blog post
, but for the purposes of this post it is sufficient
to know that a rounding interval for a floating-point number is an interval
that contains all real numbers that round back to that number. Normally the
intervals are symmetric, except when there is a jump in the exponent
(the irregular case).
Most algorithms handle irregular intervals via a completely separate path or at
least some branching. This is not terrible, because irregular cases are rare for
random floating-point numbers. However, it is possible to handle it cheaply and
branchlessly, avoiding extra complexity, which is what I did.
A more interesting improvement comes from a talk by Cassio Neri
Fast Conversion From Floating Point Numbers
. In Schubfach, we look at four
candidate numbers. The first two, of which at most one is in the rounding
interval, correspond to a larger decimal exponent. The other two, of which at
least one is in the rounding interval, correspond to the smaller exponent.
Cassio’s insight is that we can directly construct a single candidate
from the upper bound in the first case.
This improvement has a nice effect: it allows us to avoid scaling the value
itself by a power of 10, because we only need the lower and upper bounds.
This saves two 64-bit integer multiplications in the shorter case.
Unfortunately, this does not help in the longer case, but there are improvements
to be made there as well. Classic Schubfach first checks whether there is only
one candidate from the second set in the rounding interval and returns early in
that case. We can combine this check with the closedness check. This seems
counterintuitive, because we do more work (sorry, Andrei), but it eliminates a
poorly predicted conditional branch and also simplifies the code.
So we go from this:
uint64_t dec_sig_over = dec_sig_under + 1;
bool under_in = lower + bin_sig_lsb <= (dec_sig_under << 2);
bool over_in = (dec_sig_over << 2) + bin_sig_lsb <= upper;
if (under_in != over_in) {
  // Only one of dec_sig_under or dec_sig_over is in the rounding interval.
  return write(buffer, under_in ? dec_sig_under : dec_sig_over, dec_exp);
}
// Both dec_sig_under and dec_sig_over are in the interval - pick the closest.
int cmp = scaled_sig - ((dec_sig_under + dec_sig_over) << 1);
bool under_closer = cmp < 0 || cmp == 0 && (dec_sig_under & 1) == 0;
return write(buffer, under_closer ? dec_sig_under : dec_sig_over, dec_exp);
to this:
// Pick the closest of dec_sig_under and dec_sig_over and check if it's in
// the rounding interval.
int64_t cmp = int64_t(scaled_sig - ((dec_sig_under + dec_sig_over) << 1));
bool under_closer = cmp < 0 || (cmp == 0 && (dec_sig_under & 1) == 0);
bool under_in = (dec_sig_under << 2) >= lower;
write(buffer, (under_closer & under_in) ? dec_sig_under : dec_sig_over, dec_exp);
There are also many improvements in significand and exponent output. The
simplest one, which has been used for many years in
{fmt}
and which I learned from Alexandrescu’s talk
“Three Optimization Tips for C++”, is using a lookup table to output pairs of
decimal digits. This alone halves the number of integer multiplications and is
particularly important here, because the significand is often 16–17 digits long.
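To make the digit-pair idea concrete, here is a sketch in TypeScript rather than the C++ used in zmij and {fmt}: a table of precomputed two-character strings lets each division by 100 emit two digits at once. The names and structure here are illustrative, not the library's actual code.

// Precompute "00".."99" so two digits can be emitted per division.
const DIGIT_PAIRS: string[] = Array.from({ length: 100 }, (_, i) =>
  i.toString().padStart(2, "0")
);

// Format a non-negative integer using the digit-pair table (illustrative only).
function formatUint(value: number): string {
  let out = "";
  while (value >= 100) {
    out = DIGIT_PAIRS[value % 100] + out;
    value = Math.floor(value / 100);
  }
  // One or two leading digits remain.
  return (value < 10 ? String(value) : DIGIT_PAIRS[value]) + out;
}

// Example: formatUint(1234567890) === "1234567890"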
Another trick is branchless removal of trailing zeros using another small
lookup table, which I believe comes from the excellent
Drachennest
project by Alexander Bolz. There are
ideas for improving this further and potentially getting rid of the lookup
table entirely.
Is this a new algorithm?
Does it deserve to be called a new algorithm, or is it just an optimization of
Schubfach?
This method, or at least some elements of it, will be used in {fmt},
and it is also a good candidate for JSON serialization in
Thrift
and elsewhere. If you have other
applications that could benefit from faster floating-point formatting,
feel free to check it out now, or wait until it is integrated into {fmt}.
Thanks to my ISO C++ paper
P2587
“to_string or not to_string”,
std::to_string
will also be able to use this
or a similar method. This will make this standard API both performant and
actually useful.
Current limitations
Despite the name, the implementation is not fully
polished
yet. In particular, it currently
supports only exponential, also known as scientific, format, although adding
fixed format should be straightforward.
“Fun” fact
My former colleague David Gay wrote an early
dtoa
implementation back at
Bell Labs, and it was widely used for many years.
We thought Go would give us a single, portable agent binary for every Linux distro. Turns out… not exactly. But also, kind of yes.
This post kicks off a series about the traps we fell into while building a cross-platform server monitoring agent.
First, some
theory
.
simob
is our open source server monitoring agent that powers the Simple Observability platform. We like to think of it as a passive sensor rather than a long-running program or daemon, because in the real world a passive sensor does not come with a long list of requirements: it's small, self-contained, and fits into the existing system. That is the same goal we have for simob: a lightweight standalone binary with no prerequisites or external dependencies.
The same idea also applies to how we wanted to ship it. We wanted a project that you can compile from source on your development machine and run anywhere across your infrastructure. No complicated pipelines. No third party build services. Just a simple build that produces a portable binary.
Why we chose Go
In the observability world, if you're building an agent for metrics and logs, you're probably writing it in Go.
Promtail
,
Telegraf
,
Grafana Alloy
and many others are all written in Go.
And there are good reasons for that. First it’s compiled. A whole class of runtime errors gets caught before you even run the binary.
Then there is the garbage collector. For something that’s constantly ingesting and forwarding data, not having to manage memory is a massive advantage.
Goroutines are also an excellent abstraction. We knew our agent would need to manage a lot of parallel tasks: tailing log files, reading from input plugins, and sending data upstream. We could write clear, sequential-looking code for each task and let the runtime handle the concurrency.
And of course, because we thought we could compile it for any platform. "Just set
GOOS
and
GOARCH
at compile time and you're done"
The simple stuff
Most of the early work was simple. The Go ecosystem is more than a decade old and very rich. For core metrics collection we relied on
gopsutil,
a Go port of Python’s
psutil
. It gives you CPU, memory, network and disk metrics with a pretty clean API. It supports a wide range of operating systems and CPU architectures, removing the need for system specific code that we would otherwise have to write ourselves.
When it starts getting hard: the case of the journal collector
Things became more complex once users asked for systemd journal log support. Journal logs are not stored in plain text. They use a binary format and live in
/var/log/journal
or
/run/log/journal
(depending on whether persistent logging is enabled). The format is structured, indexed and can include inline compression.
We had two options. The first was to write our own parser. The file format is
documented
and the systemd source is
available.
Tools like
Kaitai Struct
could help us generate the parser code. It was not impossible. But it required time and careful reading of both the spec and the real implementation.
"Note that the actual implementation in the systemd codebase is the only ultimately authoritative description of the format, so if this document and the code disagree, the code is right"
— A comforting note from the systemd journal documentation. Nothing says "stable, well-documented binary format" like the docs telling you they might be wrong.
Our real concern was compatibility. We wanted a binary that works everywhere. That means support for past, current, and future versions of the journal format. We did not want to spend time maintaining a backward compatible parser or doing code archaeology. So this option was discarded.
The second option was to use the
C API
provided by systemd for reading the journal. A
Go wrapper
already exists. It exposes the journald C API directly. On paper this looked like the right solution, so this is what we chose.
Once we started using it, Go added some constraints. Because the wrapper calls the C API directly, the systemd library is dynamically linked. It must be present on the target machine at runtime. That part is fine. A machine without systemd has no journal logs to collect anyway. It does, however, introduce new build problems.
The first problem is that the build breaks on non-systemd systems such as macOS. Since libsystemd is not available there, you can neither build the project locally nor cross-compile it to Linux; you must build on a Linux system.
This affects both release builds and development builds. You cannot even use go run locally on a non-systemd machine because the compiler cannot find the systemd library. Thankfully, Go has build tags to tell the compiler what to include on each platform.
A //go:build linux line instructs the Go compiler to build a file only on Linux systems.
It does add some code bloat, since a stub file is required for other systems so the package still compiles.
// myfunc_linux.go
//go:build linux

package mypkg

// MyFunc holds the real Linux implementation (e.g. code that links against libsystemd).
func MyFunc() string {
	return "real Linux implementation"
}

// myfunc_stub.go
//go:build !linux

package mypkg

// MyFunc is a stub so the package still compiles on non-Linux systems.
func MyFunc() string {
	return "stub for other systems"
}
Separate files with build tags let you provide a real implementation for Linux while keeping a stub so the package still compiles elsewhere.
The second problem is that libsystemd differs between architectures. You need an amd64 version to build an amd64 binary and an arm64 version to build an arm64 binary. You cannot simply set
GOARCH
to produce every target from one worker. Each architecture build must run on a worker that has the matching libsystemd.
The glibc problem
There is another issue that shows up and is much harder to spot at first.
Go has a build flag called
CGO_ENABLED
. When it is enabled, the Go compiler links any C dependencies dynamically. This includes explicit C wrappers, like the
sdjournal
package, but also indirect calls inside the Go standard library. A common example is DNS resolution, which relies on glibc on Linux systems. With
CGO_ENABLED
set to 1, the final binary links to libc at runtime.
The default value depends on the environment. It is enabled by default when building natively on a system that supports cgo. It is disabled when cross compiling or when the C compiler is not available on the
PATH
. These defaults usually make sense. You generally do not want to enable cgo for cross compilation or for targets where glibc does not exist, such as Windows.
The problem is that a dynamically linked libc does not work on all Linux systems. Some Linux distributions do not use glibc; most notably, Alpine Linux uses musl. This means a binary built for Linux with CGO_ENABLED set to 1 will work on Ubuntu or Debian but will fail at runtime on Alpine.
/bin/sh: ./simob: Permission denied
Don't get fooled by the "Permission denied". On Alpine and other musl systems, this error, when permissions are clearly set, almost always means the kernel can't find the required glibc dynamic linker.
This forces you to build a separate version of the agent for non glibc systems.
So, is Go the problem?
Not really. Go behaved exactly as documented. We were the ones assuming that "portable" meant "effortless". Once we pulled in low-level C libraries and started targeting a mix of glibc and non-glibc systems, the simple story fell apart. None of it is dramatic, just a set of constraints you only notice once you trip over them.
Our initial idea of building everything on a laptop and shipping the same binary everywhere did not survive for long. We now rely on GitHub Actions with the right runners for each architecture. It is more moving parts than we wanted, but it works and it stays out of the critical path.
Local builds are still possible with containers or emulation, although a bit more clunky than we hoped.
In the end the build pipeline is more complicated than we imagined, but the binaries we ship remain small and self-contained. That was the original goal, and we managed to keep that part intact.
Andrew Nesbitt recently wrote a post titled
What is a Package Manager
? This post attempts to do the same for build systems.
big picture
At a high level,
build systems
are tools or libraries that provide a way to
define
and
execute
a series of transformations from
input
data to
output
data that are
memoized
by
caching
them in an
object store
.
Transformations are called
steps
or
rules
1
and define how to execute a
task
that generates zero or more outputs from zero or more inputs.
A rule is usually the
unit of caching
; i.e. the
cache points
are the outputs of a rule, and
cache invalidations
must happen on the inputs of a rule.
Rules can have
dependencies
on previous outputs, forming a directed graph called a
dependency graph
.
Dependencies that form a cyclic graph are called
circular dependencies
and are usually banned.
2
Outputs that are only used by other rules, but not “interesting” to the end-user, are called
intermediate outputs
.
An output is
outdated
,
dirty
, or
stale
if one of its dependencies is modified, or,
transitively
, if one of its dependencies is outdated.
Stale outputs invalidate the cache and require the outputs to be
rebuilt
.
An output that is cached and not dirty is
up-to-date
.
Rules are outdated if any of their outputs are outdated.
If a rule has no outputs, it is always outdated.
Each invocation of the build tool is called a
build
.
A
full build
or
clean build
occurs when the cache is empty and all transformations are executed as a
batch job
.
A cache is
full
if all its rules are up-to-date.
An
incremental build
occurs when the cache is partially full but some outputs are outdated and need to be rebuilt.
Deleting the cache is called
cleaning
.
A build is
correct
or
sound
if all possible incremental builds have the same result as a full build.
3
A build is
minimal
(occasionally
optimal
) if rules are rerun at most once per build, and only run if necessary for soundness (
Build Systems à la Carte
,
Pluto
).
In order for a build to be sound, all possible cache invalidations must be
tracked
as dependencies.
A build system without caching is called a
task runner
or
batch compiler
.
Note that task runners still often support dependencies even if they don't support caching.
Build systems with caching can emulate a task runner by only defining tasks with zero outputs, but they are usually not designed for this use case.
4
Some examples of build systems:
make
,
docker build
, rustc.
Some examples of task runners:
just
, shell scripts,
gcc
.
specifying dependencies
A build can be either
inter-process
, in which case the task is usually a single
process execution
and its input and output files, or
intra-process
, in which case a task is usually a single function call and its arguments and return values.
In order to track dependencies, either all inputs and outputs must be
declared
in source code ahead of time, or it must be possible to
infer
them from the execution of a task.
Build systems that track changes to a rule definition are called
self-tracking
. Past versions of the rule are called its
history
(
Build Systems à la Carte
).
The act of inferring dependencies from runtime behavior is called
tracing
.
If a traced rule depends on a dependency that hasn’t been built yet, the build system may either error,
suspend
the task and
resume
it later once the dependency is built, or
abort
the task and
restart
it later once the dependency is built (
Build Systems à la Carte
).
Inter-process builds often declare their inputs and outputs, and intra-process builds often infer them, but this is not inherent to the definition.
5
Some examples of intra-process builds include spreadsheets, the
wild
linker, and memoization libraries such as python’s
functools.cache
.
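As a toy illustration of an intra-process build, here is a minimal memoization sketch in TypeScript, analogous in spirit to functools.cache but not any particular library's API: the cache acts as the object store and the function arguments are the rule's inputs.

// A tiny in-memory object store keyed by the serialized inputs of a "rule".
function memoize<A extends unknown[], R>(task: (...args: A) => R): (...args: A) => R {
  const cache = new Map<string, R>();
  return (...args: A): R => {
    const key = JSON.stringify(args);           // cache point: the rule's inputs
    if (cache.has(key)) return cache.get(key)!; // up-to-date output, no rebuild
    const output = task(...args);               // run the task
    cache.set(key, output);                     // memoize the output
    return output;
  };
}

// "Rules" composed as ordinary function calls form the dependency graph implicitly.
const wordCount = memoize((source: string) => source.split(/\s+/).length);
const report = memoize((source: string) => `words: ${wordCount(source)}`);

report("incremental builds are just memoization"); // full build
report("incremental builds are just memoization"); // served from the cache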
applicative and monadic structure
A build graph is
applicative
if all inputs, outputs, and rules are declared ahead of time.
We say in this case the graph is
statically known
.
Very few build systems are purely applicative, almost all have an escape hatch.
The graph is
monadic
if not all outputs are known ahead of time, or if rules can generate other rules dynamically at runtime. Inputs that aren’t known ahead of time are called
dynamic dependencies
. Dynamic dependencies are weaker than a fully monadic build system, in the sense that they can
express
fewer build graphs.
6
Build systems that do not require declaring build rules are always monadic.
Some examples of monadic build systems include
Shake
, ninja
dyndeps
, and Cargo build scripts.
Some examples of applicative build systems include
make
(with
recursive make
disallowed), Bazel (excluding native rules), and map/reduce libraries with memoization, such as
this unison program
.
early cutoff
If a dirty rule R has an outdated output, reruns, and creates a new output that matches the old one, the build system has an opportunity to avoid running later rules that depend on R.
Taking advantage of that opportunity is called
early cutoff
.
In unsound build systems, it’s possible that the build system does not accurately
detect
that it needs to rebuild.
Such systems sometimes offer a way to
force-rerun
a target: keeping the existing cache, but rerunning a single rule.
For inter-process build systems, this often involves
touch
ing a file to set its modification date to the current time.
the executor
A
build executor
runs tasks and is responsible for
scheduling
tasks in an order that respects all dependencies, often using heuristics such as dependency depth or the time taken by the task on the last run.
They also detect whether rule inputs have been modified, making the rule outdated; this is called
rebuild detection
.
The build executor is responsible for restarting or suspending tasks in build systems that support it.
Executors often provide
progress reporting
, and sometimes allow
querying
the dependency graph.
Occasionally they trace the inputs used by the task to enforce they match the declared dependencies, or to automatically add them to an internal dependency graph.
inter-process builds
In the context of inter-process builds, an
artifact
is an output file generated by a rule.
8
A
source file
is an input file that is specific to the current
project
9
(sometimes
repository
or
workspace
) as opposed to a
system dependency
that is reused across multiple projects.
A project is loosely defined but generally refers to the set of all input and output files that the build system knows about, usually contained in a single directory.
Source files can be
generated
, which means they are an output of a previous rule.
Build files
contain rule definitions, including (but not limited to) task definitions, input and output declarations, and metadata such as a human-readable description of the rule.
Inputs are usually split into
explicit inputs
passed to the spawned process,
implicit inputs
that are tracked by the build system but not used in the task definition, and
order-only inputs
that must exist before the rule can execute, but do not invalidate the cache when modified.
Process executions have more inputs than just files, such as the rule itself, environment variables, the current time, the current working directory, and occasionally network services or local daemons
10
.
The set of all inputs that are not source files or command line arguments is called the
environment
.
Processes can be
sandboxed
to prevent them from depending on the network, a daemon, or occasionally system dependencies; this is sometimes called a
sandboxed environment
or
isolated environment
.
System dependencies are more expansive than I think they are often understood to be.
They include compilers, linkers, programming language libraries
11
, and
static and dynamically linked object files
, but also the dynamic loader, language runtime, and various system configuration files.
The subset of these dependencies needed for building a minimal program in a given language, along with various tools for inspecting and modifying the outputs at runtime, are called a
toolchain
.
Toolchains are inherently specific to a given language, but sometimes (e.g. in GCC) a single compiler will support multiple languages as inputs.
A build is
hermetic
(rarely,
self-contained
or
isolated
12
) if it uses no system dependencies and instead defines all its dependencies in the project (
Bazel
).
Sandboxing and hermeticity are orthogonal axes; neither one implies the other.
For example, docker builds are sandboxed but not hermetic, and nix shells are hermetic but not sandboxed.
Compilers or linkers sometimes have their own
incremental caches
.
Reusing the cache requires you to
trust
the compiler to be sound when incrementally rebuilding.
This is usually implicit, but hermetic or sandboxed builds require an opt-in to reuse the cache.
Bazel calls this kind of reuse a
persistent worker
.
determinism
A build is
deterministic
if it creates the same output every time in some specific environment.
A build is
reproducible
if it is deterministic and also has the same output in
any environment
, as long as the system dependencies remain the same.
remote caching
Caching can be remote or local.
Remote caching
is almost always unsound unless the build is both hermetic and reproducible (i.e. its only environment dependencies are controlled by the build system).
Downloading files from the remote cache is called
materializing
them.
Most build systems with remote caching
defer materialization
as long as possible, since in large build graphs the cache is often too large to fit on disk.
Builds where the cache is never fully materialized are called
shallow builds
(
Build Systems à la Carte
).
Remote caching usually, but not necessarily, uses
content addressed hashing
in a
key-value store
to identify which artifact to download.
Some example build systems that use remote caching: Bazel, Buck2, nix,
docker build
.
interface
Build systems usually have a way to run a subset of the build.
The identifier used to specify which part of the build you want to run is called a
target
.
13
Targets are usually the filenames of an artifact, but can also be abstract names of one or more rules.
Bazel-descended build systems call these names
labels
.
Make-descended build systems call these
phony targets
.
Some build systems, such as cargo, do not use target identifiers but instead only have subcommands with arguments; the combination of arguments together specifies a set of targets.
Some example targets:
make all
cargo build --test http_integration
buck2 build :main
meta-build systems
Inter-process build systems are often divided into a
configuration
step and a
build
step.
A build system that only runs the configuration step, and requires another tool for the build step, is called a
meta-build system
.
Usually this meta-build system
discovers
the rules that need to be executed (often through file globbing or some other programmatic way to describe dependencies), then
serializes
these rules into an
action graph
, which can be stored either in-memory or on-disk. On-disk serialized action graphs are usually themselves build files, in the sense that you can write them by hand but you wouldn't want to.
Configuration steps usually allow the developer to choose a set of
configuration flags
(occasionally,
build flags
) that affect the generated rules.
Some build systems also integrate directly with the
package manager
, but this is uncommon, and usually the build system expects all packages to be pre-downloaded into a known location.
Some examples of meta-build systems are CMake, meson, and autotools.
VFS
Advanced build systems can integrate with a
virtual file system
(VFS) to check-out source control files on-demand, rather than eagerly (
EdenFS
).
intra-process builds
The equivalent of system dependencies within a process is
non-local state
, including environment variables, globals, thread-locals, and class member fields (for languages where
this
is passed implicitly).
Especially tricky are function calls that do
inter-process communication
(IPC), which are basically never sound to cache.
Tracing intra-process builds is very very hard since it’s easy to call a function that depends on global state without you knowing.
14
In this intra-process context, most object stores are
in-memory caches
. A build system that supports saving (
persisting
) the cache to disk is said to have
persistence
. The system for persisting the cache is sometimes called a
database
, even if it is not a general-purpose database in the sense the term is normally used (
Salsa
).
tracing
Tracing intra-process build systems are sometimes called a
query system
.
15
They work similarly to their inter-process equivalents: the interface looks like normal function calls, and the build system tracks which functions call which other functions, so it knows which to rerun later.
Intra-process build systems that allow you to explicitly declare dependencies usually come from the background of
functional reactive programming
(FRP). FRP is most often used in UI and frontend design, but many of the ideas are the same as the build systems used for compiling programs.
Unlike any of the build systems we've talked about so far, FRP libraries let you look at
past versions
of your outputs, which is sometimes called
remembering
state (
React
). To make this easier to reason about, rules can be written as
event handlers
.
Some examples of libraries with dependency declarations:
React
.
so, what counts as a build system?
A build system is pretty much anything that lets you specify dependencies on a previous artifact 😄 Some more weird examples of build systems:
Github Actions (jobs and workflows)
Static site generators
Docker-compose files
Systemd unit files
Excel
Hopefully this post has given you both a vocabulary to talk about build systems and a context to compare them!
Nearly all build systems are inconsistent about whether a rule refers to an
abstract description
of how to build an output (i.e., can be reused for multiple sets of inputs and outputs), or a
concrete instantiation
of that description for a specific set of inputs and outputs. We have to live with the ambiguity, unfortunately.
↩
Weird things can happen here though; for example early cutoff can allow circular dependencies. This sometimes comes up for generated build.ninja files.
↩
The
pluto paper
defines this as “after a build, generated files consistently reflect the latest source files”. Neither my definition nor pluto's definition are particularly well-defined if the build is non-deterministic. Defining this formally would probably require constructing an isomorphism between all programs with the same runtime behavior; but “runtime behavior” is not well-defined for a general-purpose build system that can output artifacts that are not programs.
↩
As we'll see later, the reverse is also true: a common design for build systems is to automatically inject cache points into an existing task runner, or to design the rule file to look as similar to a shell script or function call as possible.
↩
In particular, nearly all modern inter-process build systems have a limited form of tracing where they ask the compiler to generate "dep-info" files
16
that show which files were used (usually through imports) by a given source file. Note that this dep-info is not available until after the first time a build has run, and that this only works if the compiler supports it.
↩
Note that the dev-guide assumes that tasks are expensive relative to the cost of constructing the graph. This is true in the context of rustc, where LLVM codegen
17
normally dominates compilation time, but it isn't true for e.g.
spreadsheets
.
↩
It's possible for tasks to create files that aren't tracked by the build system, but these aren't called artifacts. I don't know a good word for these; "byproducts" is the closest but some build systems use that to mean
any
intermediate artifacts.
↩
I'm not super happy with this definition because it conflicts with how compilers use the term, but I do think it describes how most build systems think about files.
↩
Poorly written rules can also depend on which other rules are executing at the same time, which is called a
race condition
. Note this does not require the rule to be unsound, only for it to use intermediate files the build system doesn’t know about.
↩
for C, header files; for other languages, usually source files or intermediate representations.
↩
Yes, this overlaps with the term for sandboxing. Try to avoid the word "isolated" if possible.
↩
This has no relation to a
target platform
, which is related to cross-compiling. I wish we had better names for these things.
↩
This actually has very strong analogies to the way "query" is used in a database context: just like a tracing query system, a database has to be able to restart a query's transaction if the data it's trying to access has been modified.
↩
Or, more rarely, type-checking, borrow-checking, or coherence checking.
↩
Standards for a Responsible AI Future: Reflections on the Seoul Statement
Internet Exchange
internet.exchangepoint.tech
2025-12-11 18:02:32
The statement comes at a time when principles of justice, dignity, and human rights are increasingly politicized, questioned, or treated as negotiable....
The statement comes at a time when principles of justice, dignity, and human rights are increasingly politicized, questioned, or treated as negotiable.
By Jacobo Castellanos, Coordinator of the Technology, Threats, and Opportunities team at WITNESS
On December 2, the ITU, ISO and IEC issued their
Seoul Statement
: a vision for AI standards that account for global contexts, rights impacts, and real-world harms.
The Seoul Statement includes four core commitments:
Integrating socio-technical perspectives into standards:
ensuring AI standards address not just algorithms and data, but real-world impacts on people, societies, and the environment.
Embedding human rights and universal values:
protecting dignity, privacy, fairness, and non-discrimination throughout AI design and governance.
Building an inclusive, multi-stakeholder community:
enabling governments, industry, researchers, and civil society to shape global AI norms together.
Strengthening public–private collaboration and capacity-building:
reducing global inequalities so all countries and communities can meaningfully benefit from AI.
This vision is not only welcome; it is a meaningful signal of hope.
It comes at a time when principles of justice, dignity, and human rights—once a shared reference point for international cooperation and for civil society’s engagement with governments and companies—are increasingly politicized, questioned, or treated as negotiable.
Why this matters
Standards, like regulation, form the structural base of the AI stack. By committing to explicitly build human rights and real-world impact considerations into standards development, the ITU, ISO, and IEC can help steer AI's impact toward protecting human rights, strengthening the information ecosystem, and fostering responsible innovation.
Human rights and civil society groups have been calling for this shift for years (see for example
OHCHR’s latest report
). Standards alone won’t solve every AI concern, but they can create a pathway, together with regulation and tooling, that will lead towards rights protections and limiting misuse. At WITNESS, we work at the intersection of technology and human rights, and we have seen this firsthand in
our work
with the
Coalition for Content Provenance and Authenticity (C2PA
), where a harm assessment continues to shape both the design of the standard and the ecosystem around it. By developing Content Credentials, a form of tamper-evident metadata that travels with an image, video, or audio file to show when, where, and how it was created or modified, C2PA offers a practical example of how standards can embed rights considerations from the ground up.
From Promise to Practice
While this vision is promising, a pressing question remains:
How will these commitments be translated into action?
The Seoul Statement was presented during a two-day summit held in Seoul, but concrete plans for its implementation were not shared. Representatives from the ITU, ISO, and IEC did not publicly outline how this vision would be realized, and no details were provided regarding budgets, mechanisms, timelines, or accountability measures.
Standards work is inherently slow and resource-intensive. Incorporating socio-technical and human rights considerations adds another layer of complexity that requires significant investment in expertise, time and financial support. Without such investment, the Seoul Statement risks becoming a symbolic gesture rather than a meaningful turning point.
A notable concern was the limited presence of civil society at the Seoul summit. Multistakeholder participation was frequently mentioned, yet only a few human rights groups attended. Government and industry voices were far more visible, which is too narrow a basis for defining future AI norms. For the SDOs’ vision to carry real weight, civil society must be involved consistently, adequately resourced, and included from the beginning, not added in as an afterthought.
A Call to Stay Engaged
Still, there is reason for cautious optimism. The Seoul Statement represents an important first step, formally issued by institutions that will play a fundamental role in shaping the future of AI. By acknowledging that AI standards cannot be “just technical” and must be grounded in human rights and societal wellbeing, it creates a platform to push for meaningful change.
At WITNESS, we will continue to be actively involved in the C2PA, where we co-chair its
Threats and Harms Task Force
, and we will engage with the
World Standards Cooperation’s
AI and Multimedia Authenticity Standards Collaboration (ITU, IEC and ISO) as it positions AI standards as a powerful tool for regulation development and enforcement.
We call on civil society, researchers, regulators and funders to remain engaged, not only when milestones are announced, but through the long, technical, often opaque process of drafting, reviewing and implementing standards. We must also hold the ITU, ISO, and IEC accountable to their own vision, while working to extend this commitment to other national and international SDOs, and to the remaining building blocks that sit atop the foundations of regulation and standards in the AI ecosystem.
You can support independent bookstores and get great deals without lining the pockets of billionaires. Help support Internet Exchange by shopping our bookshop
The Stack
on Bookshop.org. The Stack curates books that connect the dots between code and culture, protocol and protest, theory and practice.
Support the Internet Exchange
If you find our emails useful, consider becoming a paid subscriber! You'll get access to our members-only Signal community where we share ideas, discuss upcoming topics, and exchange links. Paid subscribers can also leave comments on posts and enjoy a warm, fuzzy feeling.
Not ready for a long-term commitment? You can always
leave us a tip
.
A year into integrating Germ with Bluesky’s AT Protocol, Tessa Brown describes how an encrypted messenger evolved into part of a growing open-protocol social ecosystem.
https://rhosf.leaflet.pub/3m7kwii6nws2c
Hannah Aubry argues that Elon Musk’s decision to block the European Commission from advertising on X highlights why governments and public institutions must move away from corporate-controlled platforms and toward open, decentralised social networks like Mastodon.
https://blog.joinmastodon.org/2025/12/the-world-needs-social-sovereignty
At the November 2025 TPAC event, Tara Whalen reported on the W3C/IAB workshop on Age-Based Restrictions on Content, highlighting challenges from diverse regulations, technical approaches, and enforcement roles.
https://www.youtube.com/watch?v=0lwsLg1Aa9g
In this recorded talk, Laura Lazaro Cabrera of the Center for Democracy and Technology Europe and Daniel Leufer of Access Now explain why the European Commission’s Digital Omnibus is not a simple technical fix and weakens key EU digital laws.
https://www.youtube.com/watch?app=desktop&v=ChUl6SN84qY&t=42s
Two ARMOR side meetings in Montreal explored the scale of network interference and the practical work needed to improve end-to-end resilience. Interested in getting involved in censorship resilience work in the IETF? Open tasks listed at the end.
https://mailarchive.ietf.org/arch/msg/armor/Ra3ogsNuAnykRW1axWB2UynRj-E/
John Minnich argues that US China rivalry is reshaping how power dynamics influence the supply of technology to less developed countries, potentially boosting access to advanced tech in the Global South unless a future grand bargain cools competition and reduces incentives to share it.
https://link.springer.com/article/10.1007/s43508-025-00125-9
This position paper argues that the distribution of nonconsensually collected nude images by researchers perpetuates image-based sexual abuse and that the machine learning community should stop the nonconsensual use of nude images in research.
https://openreview.net/pdf?id=Ev5xwr3vWh
Tanzania’s post-election tensions have taken a new turn after Instagram removed or restricted the accounts of two prominent activists, Mange Kimambi and Maria Sarungi-Tsehai, following their posts showing alleged killings and disappearances in the wake of the disputed 29 October poll.
https://www.youtube.com/watch?v=Q39_BThXvvY
Major social platforms like X, Meta, YouTube, Google Search and Substack function as a coordinated media ecosystem advancing far-right interests, argues Robin Berjon.
https://berjon.com/fascintern-media
Instacart is running large-scale pricing experiments that mean different customers pay different prices for the exact same groceries, at the same time, from the same store.
https://groundworkcollaborative.org/work/instacart
New research finds that when you account for the pressure and downsides people feel from not using popular apps, TikTok and Instagram shift from appearing beneficial to actually reducing overall wellbeing—even for the people who use them most.
https://www.aeaweb.org/articles?id=10.1257/aer.20231468
Are you doing amazing work related to data privacy or public policy? The Privacy and Public Policy Conference is accepting abstract submissions for oral, lightning, and poster presentations, and the deadline is
December 15
.
https://privacypublicpolicy-conference.github.io/website
Comments on the ITU SG’s Fourth Draft Report for WTPF-26, published in the lead-up to the January Council Working Group sessions in Geneva, are due
December 18
.
https://www.itu.int/md/S24-WTPF26PREP-R-0004/en
SG21 created the AHG-EAI to examine future directions for Embodied AI (EAI) standardization, following discussions at the ITU workshop. For those interested in Embodied AI, and in particular the ITU's Ad Hoc Group EAI in SG21, please note that the next (virtual and possibly final) meeting is scheduled for
December 18, 6am EST.
https://www.itu.int/en/ITU-T/studygroups/2025-2028/21/Documents/AHGs/2025-10_TOR-AHG-EAI.pdf
Pondering Middle East Petrostates as American Media Owners
Daring Fireball
www.businessinsider.com
2025-12-13 17:16:28
Peter Kafka, writing at Business Insider:
And last: It’s possible that Middle Eastern countries are
investing in an American media conglomerate solely for a financial
return, and would have zero interest in the content that
conglomerate makes and distributes. But that’s an assertion that
many fo...
Donald Trump welcomed Saudi Crown Prince Mohammed bin Salman at the White House in November. Now, bin Salman's country is reportedly backing Larry and David Ellison's bid for Warner Bros. Discovery
Win McNamee/Getty Images
If that story sounds familiar, there's a good reason: In November, Variety reported more or less the same thing — which prompted Paramount to call the
story
"categorically inaccurate".
Now Variety is doubling down on its initial report.
Bloomberg
also reports that "Middle East funds" are involved in
the Ellisons' bid
. A Paramount rep declined to comment. I also reached out to the sovereign wealth funds. The Abu Dhabi Investment Authority declined to comment. I haven't heard back from the others.
But as I noted before, the fact that it's even
possible
that Middle Eastern petrostates could have ownership stakes in a giant American media conglomerate — one that would control major movie studios, streaming networks, and news outlets — tells us a lot about 2025. A few years ago, this would have seemed like a non-starter; now it seems quite close to happening.
David and Larry Ellison are charging ahead in their bid to buy Warner Bros Discovery.
Eric Charbonneau/Getty Images for The Hollywood Reporter
That's because Paramount, which is competing with Netflix and Comcast in the WBD bidding, still seems like the most likely WBD owner when all of this is done. That's partially because Paramount is offering to buy all of WBD, while Comcast and Netflix only want part of it. And partially because
the Ellisons — Larry Ellison in particular — are close to Donald Trump
, and we live in a world where people close to Donald Trump often get what they want.
In the absence of anyone involved in the deal talking to me on the record, I can imagine some arguments why a petrostate-backed mega-media conglomerate makes sense:
The funds would presumably have minority stakes in a
combined Paramount/WBD
, and it would presumably remain controlled by Americans.
Foreign investors have frequently owned some or all of big, American-based media companies: See, for instance, Japan's Sony, which owns a major movie studio and music label. And Saudi investor Prince Alwaleed bin Talal was a longtime minority investor in Rupert Murdoch's Fox empire; now he has a stake in the company formerly known as Twitter.
A Middle East-financed deal for WBD could raise some eyebrows
All true! But I still think that there are differences that will certainly raise eyebrows, and maybe more forceful pushback, if a combined Ellison/Middle East deal goes forward.
One obvious point: It's one thing to have a private company or investor from another country taking a stake in an American media giant; it's another to have one that's directly controlled by a foreign government.
Another one: As media companies continue to consolidate, the power of the remaining ones gets amplified. On their own for instance,
CBS News and CNN
have dwindling influence and financial power; a company that combines the two, though, might have more meaningful sway. You can argue that the Saudis owning one of the world's biggest video game companies is also meaningful, but the video game industry never gets the attention it deserves, and that seems likely to continue in this case.
And last: It's possible that Middle Eastern countries are investing in an American media conglomerate solely for a financial return, and would have zero interest in the content that conglomerate makes and distributes. But that's an assertion that many folks would have a hard time taking at face value. And while lots of American companies have sought Middle Eastern funding for years, there was a pause after 2018, following the murder and dismemberment of Washington Post contributor Jamal Khashoggi — a shocking act
the CIA concluded was ordered by Saudi Arabia's Crown Prince Mohammed bin Salman himself
. (He has denied involvement.)
Now bin Salman might end up owning a piece of major American news outlets and other media arms. How's that going to go over?
Microservices is a service-oriented software architecture in which server-side applications are constructed by combining many single-purpose, low-footprint network services. The touted benefits are improved modularity, reduced testing burden, better functional composition, environmental isolation, and development team autonomy. The opposite is a Monolithic architecture, where a large amount of functionality lives in a single service which is tested, deployed, and scaled as a single unit.
Twilio Segment adopted this as a best practice early-on, which served us well in some cases, and, as you’ll soon learn, not so well in others.
In the early days of Twilio Segment, we reached a tipping point with a core piece of
Twilio Segment’s product
. It seemed as if we were falling from the microservices tree, hitting every branch on the way down. Instead of enabling us to move faster, the small team found themselves mired in exploding complexity. Essential benefits of this architecture became burdens. As our velocity plummeted, our defect rate exploded.
Eventually, the team found themselves unable to make headway, with 3 full-time engineers spending most of their time just keeping the system alive. Something had to change. This post is the story of how we took a step back and embraced an approach that aligned well with our product requirements and needs of the team.
Twilio Segment’s customer data infrastructure ingests hundreds of thousands of events per second and forwards them to partner APIs, what we refer to as
server-side destinations
. There are
over one hundred types of these destinations
, such as Google Analytics, Optimizely, or a custom webhook.
Years back, when the product initially launched, the architecture was simple. There was an API that ingested events and forwarded them to a distributed message queue. An event, in this case, is a JSON object generated by a web or mobile app containing information about users and their actions. A sample payload looks like the following:
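An illustrative payload (the field names follow Segment's documented track-call shape; the values here are made up):

{
  "type": "track",
  "event": "Signed Up",
  "userId": "user_123",
  "timestamp": "2024-01-01T00:00:00.000Z",
  "properties": {
    "plan": "Enterprise"
  }
}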
As events were consumed from the queue, customer-managed settings were checked to decide which destinations should receive the event. The event was then sent to each destination’s API, one after another, which was useful because developers only need to send their event to a single endpoint, Twilio Segment’s API, instead of building potentially dozens of integrations. Twilio Segment handles making the request to every destination endpoint.
If one of the requests to a destination fails, sometimes we’ll try sending that event again at a later time. Some failures are safe to retry while others are not. Retry-able errors are those that could potentially be accepted by the destination with no changes. For example, HTTP 500s, rate limits, and timeouts. Non-retry-able errors are requests that we can be sure will never be accepted by the destination. For example, requests which have invalid credentials or are missing required fields.
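A rough sketch of that classification in TypeScript (the status-code buckets are illustrative, not Segment's exact rules):

// Decide whether a failed delivery attempt should be retried later.
// Retry-able: the destination might accept the same event unchanged (5xx, rate limits, timeouts).
// Non-retry-able: the destination will never accept it as-is (bad credentials, missing fields).
function isRetryable(statusCode: number | null): boolean {
  if (statusCode === null) return true;  // network timeout or connection error
  if (statusCode === 429) return true;   // rate limited
  if (statusCode >= 500) return true;    // server-side error at the destination
  return false;                          // other 4xx: rejected permanently
}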
At this point, a single queue contained both the newest events as well as those which may have had several retry attempts, across all destinations, which resulted in
head-of-line blocking
. In this particular case, that meant if one destination slowed or went down, retries would flood the queue, resulting in delays across
all
our destinations.
Imagine destination X is experiencing a temporary issue and every request errors with a timeout. Now, not only does this create a large backlog of requests which have yet to reach destination X, but also every failed event is put back to retry in the queue. While our systems would automatically scale in response to increased load, the sudden increase in queue depth would outpace our ability to scale up, resulting in delays for the newest events. Delivery times for all destinations would increase because destination X had a momentary outage. Customers rely on the timeliness of this delivery, so we can’t afford increases in wait times anywhere in our pipeline.
To solve the head-of-line blocking problem, the team created a separate service and queue for each destination. This new architecture consisted of an additional router process that receives the inbound events and distributes a copy of the event to each selected destination. Now if one destination experienced problems, only its queue would back up and no other destinations would be impacted. This microservice-style architecture isolated the destinations from one another, which was crucial when one destination experienced issues as they often do.
Each destination API uses a different request format, requiring custom code to translate the event to match this format. A basic example is destination X requires sending birthday as traits.dob in the payload whereas our API accepts it as traits.birthday. The transformation code in destination X would look something like this:
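A minimal TypeScript sketch of such a transform (the type and function names are invented for illustration; real destination code handles far more fields):

type SegmentEvent = { traits?: Record<string, unknown> } & Record<string, unknown>;

// Destination X wants traits.dob, while our API accepts traits.birthday.
function transformForDestinationX(event: SegmentEvent): SegmentEvent {
  const traits = { ...(event.traits ?? {}) };
  if ("birthday" in traits) {
    traits.dob = traits.birthday;  // rename to the destination's expected key
    delete traits.birthday;
  }
  return { ...event, traits };
}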
Many modern destination endpoints have adopted Twilio Segment’s request format making some transforms relatively simple. However, these transforms can be very complex depending on the structure of the destination’s API. For example, for some of the older and most sprawling destinations, we find ourselves shoving values into hand-crafted XML payloads.
Initially, when the destinations were divided into separate services, all of the code lived in one repo. A huge point of frustration was that a single broken test caused tests to fail across all destinations. When we wanted to deploy a change, we had to spend time fixing the broken test even if the changes had nothing to do with the initial change. In response to this problem, it was decided to break out the code for each destination into their own repos. All the destinations were already broken out into their own service, so the transition was natural.
The split to separate repos allowed us to isolate the destination test suites easily. This isolation allowed the development team to move quickly when maintaining destinations.
As time went on, we added over 50 new destinations, and that meant 50 new repos. To ease the burden of developing and maintaining these codebases, we created shared libraries to make common transforms and functionality, such as HTTP request handling, across our destinations easier and more uniform.
For example, if we want the name of a user from an event, event.name() can be called in any destination’s code. The shared library checks the event for the property key name and Name. If those don’t exist, it checks for a first name, checking the properties firstName, first_name, and FirstName. It does the same for the last name, checking the cases and combining the two to form the full name.
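A simplified sketch of that kind of helper in TypeScript (illustrative only; the real shared library covers many more spellings and edge cases):

// Resolve a user's full name from event properties, falling back through common spellings.
function eventName(properties: Record<string, unknown>): string | undefined {
  const pick = (...keys: string[]): string | undefined =>
    keys.map((k) => properties[k]).find((v): v is string => typeof v === "string");

  const name = pick("name", "Name");
  if (name) return name;

  const first = pick("firstName", "first_name", "FirstName");
  const last = pick("lastName", "last_name", "LastName");
  if (first && last) return `${first} ${last}`;
  return first ?? last;
}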
The shared libraries made building new destinations quick. The familiarity brought by a uniform set of shared functionality made maintenance less of a headache.
However, a new problem began to arise. Testing and deploying changes to these shared libraries impacted all of our destinations. It began to require considerable time and effort to maintain. Making changes to improve our libraries, knowing we’d have to test and deploy dozens of services, was a risky proposition. When pressed for time, engineers would only include the updated versions of these libraries on a single destination’s codebase.
Over time, the versions of these shared libraries began to diverge across the different destination codebases. The great benefit we once had of reduced customization between each destination codebase started to reverse. Eventually, all of them were using different versions of these shared libraries. We could’ve built tools to automate rolling out changes, but at this point, not only was developer productivity suffering but we began to encounter other issues with the microservice architecture.
Another problem was that each service had a distinct load pattern. Some services would handle a handful of events per day while others handled thousands of events per second. For destinations that handled a small number of events, an operator would have to manually scale the service up to meet demand whenever there was an unexpected spike in load.
While we did have auto-scaling implemented, each service had a distinct blend of required CPU and memory resources, which made tuning the auto-scaling configuration more art than science.
The number of destinations continued to grow rapidly, with the team adding three destinations per month on average, which meant more repos, more queues, and more services. With our microservice architecture, our operational overhead increased linearly with each added destination. Therefore, we decided to take a step back and rethink the entire pipeline.
The first item on the list was to consolidate the now over 140 services into a single service. The overhead from managing all of these services was a huge tax on our team. We were literally losing sleep over it since it was common for the on-call engineer to get paged to deal with load spikes.
However, the architecture at the time would have made moving to a single service challenging. With a separate queue per destination, each worker would have to check every queue for work, which would have added a layer of complexity to the destination service with which we weren’t comfortable. This was the main inspiration for Centrifuge. Centrifuge would replace all our individual queues and be responsible for sending events to the single monolithic service. (Note that Centrifuge became the back-end infrastructure for Connections.)
Given that there would only be one service, it made sense to move all the destination code into one repo, which meant merging all the different dependencies and tests into a single repo. We knew this was going to be messy.
For each of the 120 unique dependencies, we committed to having one version for all our destinations. As we moved destinations over, we’d check the dependencies it was using and update them to the latest versions. We fixed anything in the destinations that broke with the newer versions.
With this transition, we no longer needed to keep track of the differences between dependency versions. All our destinations were using the same version, which significantly reduced the complexity across the codebase. Maintaining destinations now became less time consuming and less risky.
We also wanted a test suite that allowed us to quickly and easily run all our destination tests. Running all the tests was one of the main blockers when making updates to the shared libraries we discussed earlier.
Fortunately, the destination tests all had a similar structure. They had basic unit tests to verify our custom transform logic was correct and would execute HTTP requests to the partner’s endpoint to verify that events showed up in the destination as expected.
Recall that the original motivation for separating each destination codebase into its own repo was to isolate test failures. However, it turned out this was a false advantage. Tests that made HTTP requests were still failing with some frequency. With destinations separated into their own repos, there was little motivation to clean up failing tests. This poor hygiene led to a constant source of frustrating technical debt. Often a small change that should have only taken an hour or two would end up requiring a couple of days to a week to complete.
The outbound HTTP requests to destination endpoints during the test run were the primary cause of failing tests. Unrelated issues like expired credentials shouldn’t fail tests. We also knew from experience that some destination endpoints were much slower than others. Some destinations took up to 5 minutes to run their tests. With over 140 destinations, our test suite could take up to an hour to run.
To solve for both of these, we created Traffic Recorder. Traffic Recorder is built on top of yakbak, and is responsible for recording and saving destinations’ test traffic. Whenever a test runs for the first time, any requests and their corresponding responses are recorded to a file. On subsequent test runs, the request and response in the file are played back instead of requesting the destination’s endpoint. These files are checked into the repo so that the tests are consistent across every change. Now that the test suite was no longer dependent on HTTP requests over the internet, our tests became significantly more resilient, a must-have for the migration to a single repo.
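Conceptually, the record-and-replay behavior boils down to keying each outbound request, saving the response the first time it is made, and serving the saved copy ever after. A rough sketch of that idea (illustrative Python rather than yakbak itself, which is a Node.js library; the tape directory and hashing scheme here are assumptions for the example):

# Rough sketch of "tape" record-and-replay; not yakbak's actual implementation.
import hashlib, json, os
import urllib.request

TAPE_DIR = "tapes"  # hypothetical directory of recorded responses, checked into the repo

def tape_path(method, url, body):
    key = hashlib.sha256(f"{method} {url} {body or ''}".encode()).hexdigest()
    return os.path.join(TAPE_DIR, key + ".json")

def request(method, url, body=None):
    path = tape_path(method, url, body)
    if os.path.exists(path):                   # replay: no traffic leaves the machine
        with open(path) as f:
            return json.load(f)
    data = body.encode() if body else None
    req = urllib.request.Request(url, data=data, method=method)
    with urllib.request.urlopen(req) as resp:  # record: hit the real endpoint once
        recorded = {"status": resp.status, "body": resp.read().decode()}
    os.makedirs(TAPE_DIR, exist_ok=True)
    with open(path, "w") as f:
        json.dump(recorded, f)
    return recorded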
After we integrated Traffic Recorder, running the tests for all 140+ of our destinations took milliseconds. In the past, tests for just one destination could have taken a couple of minutes to complete.
It felt like magic.
Once the code for all destinations lived in a single repo, they could be merged into a single service. With every destination living in one service, our developer productivity substantially improved. We no longer had to deploy 140+ services for a change to one of the shared libraries. One engineer can deploy the service in a matter of minutes.
The proof was in the improved velocity. When our microservice architecture was still in place, we made 32 improvements to our shared libraries. One year later, we’ve made 46 improvements.
The change also benefited our operational story. With every destination living in one service, we had a good mix of CPU and memory-intense destinations, which made scaling the service to meet demand significantly easier. The large worker pool can absorb spikes in load, so we no longer get paged for destinations that process small amounts of load.
Moving from our microservice architecture to a monolith was overall a huge improvement. However, there are trade-offs:
Fault isolation is difficult.
With everything running in a monolith, if a bug is introduced in one destination that causes the service to crash, the service will crash for all destinations. We have comprehensive automated testing in place, but tests can only get you so far. We are currently working on a much more robust way to prevent one destination from taking down the entire service while still keeping all the destinations in a monolith.
In-memory caching is less effective.
Previously, with one service per destination, our low traffic destinations only had a handful of processes, which meant their in-memory caches of control plane data would stay hot. Now that cache is spread thinly across 3000+ processes so it’s much less likely to be hit. We could use something like Redis to solve for this, but then that’s another point of scaling for which we’d have to account. In the end, we accepted this loss of efficiency given the substantial operational benefits.
Updating the version of a dependency may break multiple destinations.
While moving everything to one repo solved the previous dependency mess we were in, it means that if we want to use the newest version of a library, we’ll potentially have to update other destinations to work with the newer version. In our opinion though, the simplicity of this approach is worth the trade-off. And with our comprehensive automated test suite, we can quickly see what breaks with a newer dependency version.
Our initial microservice architecture worked for a time, solving the immediate performance issues in our pipeline by isolating the destinations from each other. However, we weren’t set up to scale. We lacked the proper tooling for testing and deploying the microservices when bulk updates were needed. As a result, our developer productivity quickly declined.
Moving to a monolith allowed us to rid our pipeline of operational issues while significantly increasing developer productivity. We didn’t make this transition lightly though and knew there were things we had to consider if it was going to work.
We needed a rock solid testing suite to put everything into one repo. Without this, we would have been in the same situation as when we originally decided to break them apart. Constant failing tests hurt our productivity in the past, and we didn’t want that happening again.
We accepted the trade-offs inherent in a monolithic architecture and made sure we had a good story around each. We had to be comfortable with some of the sacrifices that came with this change.
When deciding between microservices or a monolith, there are different factors to consider with each. In some parts of our infrastructure, microservices work well but our server-side destinations were a perfect example of how this popular trend can actually hurt productivity and performance. It turns out, the solution for us was a monolith.
Special thanks to Rick Branson for helping review and edit this post at every stage.
Yesterday I shared a little program called the Mark V. Shaney
Junior at github.com/susam/mvs. It is a minimal implementation of
a Markov text generator inspired by the legendary Mark V. Shaney
program from the 1980s. If you don't know about Mark V. Shaney,
read more about it on the Wikipedia article Mark V. Shaney.
It is a very small program that favours simplicity over efficiency.
As a hobby, I often engage in exploratory programming where I write
computer programs not to solve a specific problem but simply to
explore a particular idea or topic for the sole purpose of
recreation. I must have written small programs to explore Markov
chains for various kinds of state spaces over a dozen times by now.
Every time, I just pick my last experimental code and edit it to
encode the new state space I am exploring. That's usually my
general approach to such one-off programs. I have hundreds of tiny
little experimental programs lying on my disk at any given time.
Once in a while, I get the itch to take one of those exploratory
programs, give it some finishing touches, wrap it up in a nice Git
repo along with a README.md, CHANGES.md and the whole shebang and
share it on github.com/susam and codeberg.org/susam. The Mark V.
Shaney Jnr program that I shared yesterday happened to be one such
exercise.
If you scroll down the README of this project, you'll find some
nice examples of the gibberish produced by this program. The first
few examples there are the result of training the model on A
Christmas Carol by Charles Dickens, one of my favourite authors.
It is often said that Dickens never used fewer words when more
would suffice. So I thought there couldn't be a better piece of
text when it comes to testing out my tiny Markov model. I'll not
reproduce the generated text examples here for the sake of
brevity. If you are interested to take a look, just head over to
the Gibberish Examples section of the README.
Soon after sharing the project, I wondered what kind of gibberish it
would produce if I fed all 24 years of my blog posts and pages
into the program.
Well, here's one of the results:
$ python3 mvs.py < susam.txt
while a query replace operation is approved by the user. The above
variable defines the build job. It can be incredibly useful while
working on assembly language and machine code. In fact, all internal
resources like the result to refine the search prompt changes from
bck-i-search: to fwd-i-search:. Now type C-SPC (i.e. ctrl+space) to
set a mark causes Emacs to use 32-bit registers like EBP, ESP,
etc. Thus the behaviour is undefined. Such code may behave differently
when compiled with the readily available GNU tools like the shape
of 8. Flipping "P" horizontally makes it a proper quine: cat $0
This is the text that comes out after the program consumes over 200
posts consisting of about 200,000 words. Here is another example:
$ python3 mvs.py < susam.txt
enjoy asking "what happens if" and then type M-x zap-up-to-char RET
b. The buffer for this specific video, the actual fare for 8.3 km and
11 are all written from scratch. No prior knowledge is expected to
slow down in future. For now, I will add a statement like x =
0.999... , the inner corner square as discussed in the code segment
into the REPL window. Unlike Slimv, Vlime can work with and debug
executable files, it can be more convenient. M-x: Execute Extended
Command The key sequence M-q invokes the command cat and type TAB to
indent the
Here is a particularly incoherent but amusing one:
$ python3 mvs.py < susam.txt
Then open a new Lisp source file and the exact answer could harm
students' self-esteem. Scientists have arbitrarily assumed that an
integral domain. However, the string and comment text. To demonstrate
how a build job can trigger itself, pass input to standard output or
standard error), Eshell automatically runs the following command in
Vim and Emacs will copy the message length limit of 512 characters,
etc. For example, while learning to play the game between normal mode
to move the point is on an old dictionary lying around our house and
that is moving to the small and supportive community
No, I have never written anywhere that opening a Lisp source file
could harm anyone's self-esteem. The text generator has picked up
the 'Lisp source file' phrase from my Lisp in Vim post and the
'self-esteem' bit from the From Perl to Pi post.
By default, this program looks at trigrams (all sequences of three
adjacent words) and creates a map where the first two words of the
trigram are inserted as the key and the third word is appended to
its list value. This map is the model. In this way, the model
captures each pair of adjacent words along with the words that
immediately follow each pair. The text generator then chooses a key
(a pair of words) at random and looks for a word which follows. If
there are multiple followers, it picks one at random. That is
pretty much the whole algorithm. There isn't much more to it. It
is as simple as it gets. For that reason, I often describe a simple
Markov model like this as the 'hello, world' of language modelling.
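A minimal sketch of the idea looks something like this (an
illustrative paraphrase, not the actual mvs.py; susam.txt is the
input file used in the examples above):

# Sketch of a simple Markov text generator; order 2 means pairs of words as keys.
import random
from collections import defaultdict

def train(words, order=2):
    model = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])      # adjacent words forming the key
        model[key].append(words[i + order])  # the word that follows them
    return model

def generate(model, count=50):
    key = random.choice(list(model))          # start from a random key
    out = list(key)
    for _ in range(count):
        followers = model.get(key)
        if not followers:
            break
        out.append(random.choice(followers))  # pick one follower at random
        key = tuple(out[-len(key):])          # slide the window forward
    return " ".join(out)

words = open("susam.txt").read().split()
print(generate(train(words)))

The order argument in this sketch corresponds to the key length
discussed in the next paragraph.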
The number of words in the key of the map can be set via command
line arguments. By default, it is 2 as described above. This value
is also known as the order of the model. So by default the order is
2. If we increase it to, say, 3 or 4, the generated text becomes a
little more coherent. Here is one such example:
$ python3 mvs.py 4 < susam.txt
It is also possible to search for channels by channel names. For
example, on Libera Chat, to search for all channels with "python" in
its name, enter the IRC command: /msg alis list python. Although I
have used Libera Chat in the examples above, there are plenty of
infinite fields, so they must all be integral domains too. Consider
the field of rational numbers Q . Another quick way to arrive at this
fact is to observe that when one knight is placed on a type D square,
only two positions for the second knight such that the two knights
Except for a couple of abrupt transitions, the text is mostly
coherent. We need to be careful about not increasing the order too
much. In fact, if we increase the order of the model to 5, the
generated text becomes very dry and factual because it begins to
quote large portions of the blog posts verbatim. Not much fun can
be had like that.
Before I end this post, let me present one final example where I ask
it to generate text from an initial prompt:
$ python3 mvs.py 2 100 'Finally we'
Finally we divide this number by a feed aggregrator for Emacs-related
blogs. The following complete key sequences describe the effects of
previous evaluations shall have taken a simple and small to contain
bad content. This provides an interactive byte-compiled Lisp function
in MATLAB and GNU bash 5.1.4 on Debian is easily reproducible in
Windows XP. Older versions might be able to run that server for me it
played a significant burden on me as soon as possible. C-u F: Visit
the marked files or directories in the sense that it was already
initiated and we were to complete the proof.
Apparently, this is how I would sound if I ever took up speaking
gibberish!
Author’s note: I originally intended for this post to cover adventure games, computer role-playing games, wargames and other simulations, a brief look at the home video game market, and finally the rise of hybrids that fused home video game systems with personal computers. In the grand scheme of the story about personal computers that I am trying to tell, it probably does not make sense to lavish nearly 7,000 words on early adventure games alone, but it’s a topic of personal interest to me and the tale grew in the telling.
Play was central to the formation of personal computer culture. For the early hobbyists who were fascinated by the guts of the machine, the computer was a plaything in and of itself. Many of those who joined the hobby in 1975 or 1976 did so because of games: they had experience with the extensive BASIC game culture that circulated in the time-sharing systems of universities, high schools, and even corporations, and wanted to keep playing at home.
Even after the rise of commercial personal computer software, when the first truly useful applications began appearing, games remained by far the most popular software category (counting by number of titles produced and number of units sold, although not by dollar value). One 1980 catalog of Apple II software, for example, lists 265 titles, of which roughly two-thirds are games, from
Ack-Ack
(an anti-aircraft target shooter) to
Wipe Off
(a
Breakout
clone). The rest of the catalog comprises demos, educational programs, and a smattering of business software. Whatever they might say about the practical value of the personal computer, buyers had an evident hunger for games.
[1]
The Early Games and Their Market
Computer owners got their hands on games in one of three ways. The most common means in the early years was simply copying a paper tape or cassette from a friend or colleague, whether with the permission of the original author or not. Most hobbyists treated game software as a commons to be freely shared, just as it had been in the time-sharing culture through cooperatives like DECUS. This peer-to-peer copying would never entirely go away, despite the commercialization of game software and various schemes by publishers to try to prevent it.
Many magazines and books also published “type-ins,” complete computer programs (almost always written in BASIC) intended to be manually entered at the keyboard (and then saved to tape or disk), and these, too, were most often games. Dave Ahl’s
BASIC Computer Games
(first published in 1973 by Digital Equipment Corporation), a collection of over 100 type-ins, reputedly sold one million copies by 1979. Though type-in publication continued through the 1980s, the inherent limits on the length of such programs (only the most dedicated would tackle a type-in that was more than a few hundred lines long) and their reliance on the universality of BASIC (rather than more performant compiled languages) meant that their significance waned as the sophistication of the game market increased. They could serve as fun demos or educational tools for learning to code, but could not compare to similar games available commercially.
[2]
Selection from a state capital guessing game type-in, from the first issue of
Softside
(October 1978). The median type-in was a simplistic game or graphical demo like this.
A selection from a type-in for a much more complex adventure game, published in the September 1980
Softside
. This goes on for two-and-a-half more pages, and is about the limit of what was feasible in the type-in format for all but the most steadfast of typers.
Finally, of course, there were the commercial titles offered by software publishers. The game business began in the same way as the personal computer hardware business: with hobby-entrepreneurs selling their creations to fellow hobbyists. In July 1976, for example, D.E. Hipps of Miami, Florida offered a Star Trek game written for MicroSoft’s Altair BASIC for $10 (no one at this stage of the industry paid any attention to niceties such as licensing agreements for the use of the Star Trek name). No common data storage standard existed; hobbyists employed a mix of paper teletype tapes, cassette storage, and (for the most extravagant) floppy disks. So Hipps opted to distribute his game as printed source code: a type-in! SCELBI (creators of one of the early, pre-Altair hobby computers) offered another Star Trek variant called
Galaxy
in the same form. By the late 1970s, the convergence of the industry on a small number of popular storage standards (with CP/M dominant) resolved this problem, and most games were distributed in plastic baggies containing instructions and a cassette or floppy disk.
[3]
Contents of a typical computer game package circa 1980. The instructions, command reference, special instruction sheet, and cassette would have come together in a plastic baggie.[Ernst Krogtof, retro365.blog]
It didn’t take long for other entrepreneurs to see a business opportunity in making it easier for software authors to publish their games. It took some time, though, for clear business models and market verticals to emerge. No categorical distinction existed between publishers of games and publishers of utility and business software prior to 1980: Personal Software’s first big hit was
MicroChess
, followed by
VisiCalc
, followed by (as we’ll soon see)
Zork
. Programma International’s founder began as a hoarder of Apple II software, much of it acquired from copies unauthorized by the original author, then turned legitimate to sell those authors’ software instead. Softape tried selling bundles of software by subscription, and then started its own newsletter for subscribers,
Softalk
.
Some magazines went the other way around:
Softside
magazine (located the next town over from
BYTE’s
Peterborough, New Hampshire headquarters) created The Software Exchange (TSE), while Dave Ahl’s
Creative Computing
set up a label called Sensational Software. Type-ins printed in the magazines became a gateway drug to more convenient (and often more complex and interesting) software available for sale on cassette or diskette.
[4]
Figure 21: Creative Computing heavily advertised the Sensational Software brand in the pages of the magazine, as in this July 1980 example describing some of their most popular hits and offering a free catalog of their full offering of 400 titles.
The early personal computer game culture imitated what came before it. The boundary between mini- and microcomputer culture was porous: thousands used time-sharing systems at work or school and then went home to a hobby computer. Prior to 1977, a game written for a personal computer was almost invariably based on a game drawn from the other side of that boundary.
Barring a few exceptions (such as the PLATO system available at some universities), users interacted with such computer systems through teletypes or video teletypes that alternated sending and receiving text. So, the resulting games were turn-based, purely textual, and relied on strategy and calculation (or pure luck) to win, not timing and reaction speed. These textual games suited the early hobbyists perfectly, since almost all of their computers also had text-only interfaces, whether mechanical teletypes or video displays like the TV Typewriter.
Other than simple quizzes, demos, and guessing games, popular titles included simulations such as
Hammurabi
,
Civil War
and
Lunar Lander
; statistical recreations of sports contests (baseball, basketball, golf, etc.); or classic games or puzzles from the physical world, like checkers, Yahtzee, and the towers of Hanoi. By far the most popular, however, judging by the number of variations published and references in hobby magazines, were descendants of Mike Mayfield’s 1971
Star Trek
, a strategic game of galactic war against the Klingons.
[5]
Figure 22: An example of a user interaction in a Mike Mayfield-style Star Trek game. Each command entered by the user produces a textual response from the computer. There is no continuous display; the SRS command must be used each time the player wants to see their situation. [The Wargaming Scribe, https://zeitgame.net/archives/1770]
Some early personal computers, however, had video controllers with built-in memory, which allowed for more sophisticated interfaces than the simple back-and-forth exchanges of a teletype. Processor Technology, whose VDM-1 display interface could paint characters at arbitrary points on the screen, sold real-time textual games by Steve Dompier like
Trek-80
(released in 1976, despite the name). Its interface (including a galactic sector map made of text characters and readouts of the
Enterprise
’s weapon and shield status) updated in real-time in response to player and (simulated) enemy actions, rather than scrolling by one turn at a time. Cromemco, maker of the Dazzler, an Altair-compatible graphics board, offered the only personal computer games to use pixel graphics prior to the Apple II, starting with a version of the seminal
Spacewar
in early 1977. They followed with a suite of similar games such as
Tankwar
and
Dogfight
.
[6]
Steve Dompier’s Trek-80 was similar in concept to earlier Trek games, but with alternating commands and responses replaced by a continuously displayed textual interface. [
Creative Computing
(July/August 1977), 44]
Spacewar on the Dazzler couldn’t match the PDP-1 original, but nothing else at the time sported pixel graphics. [
Creative Computing
(July/August 1977), 43]
After 1977, when computers with graphical displays became more widely available (especially the full-color Apple II), computer games tapped a new vein of inspiration (and imitation): arcade games. Originally commercialized by Atari and its imitators as standalone arcade cabinets in the early 1970s, then moving into homes by the mid-1970s, these games were typically real-time and focused on action. Relatively cheap and easy-to-make, and relatively disposable to the user (few took more than a few minutes to play a complete game), computer action games proliferated by the hundreds and thousands, many of them direct or near clones of pre-existing arcade or home video games.
By 1980, however, there were major innovations that set personal computer games apart from other game media. In-depth simulations, expansive adventures that took hours to solve, and dungeon crawls teeming with a variety of monsters, treasures, and traps provided immersive experiences that the action-oriented video game consoles did not, and (given their limited memory and storage capacity) could not, provide. Once combined with full-color, bitmapped graphics, these games also surpassed anything previously available on their time-sharing predecessors. The era of imitation was definitively over.
Adventure
For several years of my childhood, for reasons that I no longer recall, our family’s Apple IIe computer, equipped with a green-and-black monochrome monitor, resided in my bedroom. Though much of my autobiographical memory is quite hazy, I can clearly remember each of the Apple II games we owned, each with its own 5 ¼-inch-square floppy disk:
Syzygy
(a space shooter in the vein of
Asteroids
),
One on One: Dr. J vs. Larry Bird
,
Winter Games
, and
Arcticfox
(a sci-fi tank simulator with wireframe graphics).
But the game that truly captured my imagination, the game whose opening sequence and imagery remain etched (monochromatically) in my mind, was
King’s Quest II: Romancing the Throne
, a 1985 title by Sierra On-Line. The forty-nine-screen, hand-drawn fairy tale kingdom that you explore in the game (via your avatar, King Graham) felt like a vast world of endless possibility compared to the cramped half-court of
One on One
, the endlessly repeating monotony of a biathlon course in
Winter Games
, or the sterile polygonal landscape of
Arcticfox
’s Antarctica. That open-ended feeling was enhanced by the lure of hidden secrets just out of reach, and a freeform text interface that accepted English commands like “THROW APPLE” (though only a tiny subset of the commands you could imagine would actually work). Despite its limitations and many, many frustrations (at age seven or eight, with no hint book and no Internet walkthroughs, I certainly never came close to completing it), it made me feel that I was truly experiencing an adventure.
One of the colorful environments of King’s Quest II. Limited at the time to a monochrome monitor, I never saw it like this.
The adventure game genre originated in a freely shared, text-driven game created in the time-sharing world. The game, which I will call
Adventure
(it is variously called
Colossal Cave Adventure
,
Colossal Cave
,
Adventure
, or simply ADVENT, after the game’s PDP-10 file name) challenged players to find five treasures within a cave complex by navigating a maze, solving puzzles, and defeating a band of axe-wielding dwarves. Its author was Will Crowther, a programmer at Bolt, Beranek and Newman (BBN), where he had written core infrastructural software for ARPANET, the first nationwide computer network.
The BBN team that worked on the Interface Message Processors (IMPs), minicomputers that routed messages across the ARPANET network. This photo was probably taken circa 1969-1970, when the first IMPs were delivered. Crowther is second from the right.
In 1975, Crowther went through a painful divorce. He had always enjoyed playing games with his school-age daughters, so he began crafting a game on the company’s DEC PDP-10 to help him stay connected with them. Crowther copied the physical structure of
Adventure
’s cave directly from a portion of the Mammoth complex in Kentucky. (He had met his wife through caving, and they had explored Mammoth together, so the game was also, in a sense, a means of staying connected to his former, married life.) It is probable (though not certain) that Crowther also drew some inspiration from a popular 1973 time-sharing game called
Hunt the Wumpus
, which required users to use textual clues to find and kill a Wumpus hidden in a system of caves without falling into a pit. But the conceptual structure of
Adventure
(delving into the earth to find treasure and magical artifacts in the face of devious obstacles and armed foes) came from a new game of pencil, paper, and imagination that Crowther was playing with some of his BBN friends, called
Dungeons and Dragons
.
[7]
In Crowther’s words:
…the caving had stopped, because that had become awkward, so I decided I would fool around and write a program that was a re-creation in fantasy of my caving, and also would be a game for the kids, and perhaps had some aspects of the
Dungeons and Dragons
that I had been playing.
[8]
Just as in the later
King’s Quest II
, the player used simple verb-noun commands (such as “TAKE LAMP”) to interact with the world, but lacking a graphical screen with a visible avatar, he or she also used text commands to move about the world, from one room of the cave to the next (e.g., “SOUTH” or “EAST”). Crowther showed the game off to his D&D buddies and his daughters, then took a new job in California, and forgot about it.
[9]
Time-sharing games had once propagated gradually from computer to computer via collectives like the Digital Equipment Computer Users’ Society or colleagues and friends mailing paper tapes to one another. But BBN was on the ARPANET, and Crowther had put his game on a public directory in the BBN computer. From there, someone copied it across the network to a computer in a Stanford lab, where a graduate student, Don Woods, found it in early 1977.
Crowther-Woods
Adventure
as someone might have experienced it in the late 1970s, on a DEC video terminal. [Autopilot, CC BY-SA 3.0]
Fascinated by Crowther’s game, Woods contacted him for the FORTRAN source code, and set about expanding it. He increased the scope by adding more rooms, more puzzles, more foes, and more ways to interact with the world; but he also added the ability to save your progress and a point-tracking system with a final objective: to find fifteen treasures and return them to the starting location. Woods’ larger, more polished version of
Adventure
spread rapidly across the time-sharing world, and became an obsession for some, keeping them at the office past midnight in search of that last treasure. (One of Woods’ additions was a setting to allow admins to disable the game during working hours.)
[10]
Adventureland
A frizzy-haired Florida man, Scott Adams, was the first to commercialize a version of
Adventure
for the personal computer. He had first fallen in love with computers on a time-sharing terminal at his Miami high school in the late 1960s. He went on to earn a computer science degree and by the late 1970s, was working as a programmer at telecom manufacturer Stromberg-Carlson. On the side he had become an avid home computer hobbyist, purchasing a Sphere computer in 1975 and then a TRS-80 in 1977. Shortly thereafter he discovered
Adventure
on the company time-sharing system and, like many before and after, could not quit playing until he had beaten it.
Adams decided that it would be an interesting challenge to build something similar for the TRS-80. It would have to be much smaller to fit in the sixteen kilobytes of memory he had available. The Crowther-Woods
Adventure
contained 140 distinct locations and ran to eighty kilobytes of (uncompiled) FORTRAN and fifty-four kilobytes of data for the game text. Adams’
Adventureland
was considerably smaller, with fewer than thirty-five locations—not necessarily to the detriment of gameplay; for example, the cutting lopped off most of
Adventure
’s huge and torturous mazes.
[11]
Adams’ local TRS-80 buddies were impressed enough with his game that he decided to sell it through both The TRS-80 Software Exchange and Creative Computing, who offered it on cassette for $24.95 and $14.95, respectively, in their January 1979 magazine issues. He followed up with a whole series of games, starting with
Pirate Adventure
, and ported the games from the TRS-80 to other popular computer platforms. His wife Alexis joined the venture as a business manager and game designer, co-authoring
Mystery Fun House
and
Voodoo Castle
.
[12]
Scott Adams surrounded by his Adventure series. [
Adventure International Microcomputer Software Catalog 2, 5 (1981), 4]
The adventure game genre is often criticized for absurd and unfair puzzles, which can be guessed at only through trial-and-error, and tedious mazes or other navigational obfuscations. These early games from circa 1980 are among the worst offenders. In
Adventureland
, for example, a “very thin” black bear blocks your way, and the only way to get past it is to “yell” at it. Feeding this apparently hungry bear honey will prevent you from completing the game, because the honey is one of the treasures you must collect. You could easily get into a state in these games where you had lost without knowing it.
[13]
Yelling at a bear in Adventureland. [https://www.filfre.net/2011/06/adventureland-part-1]
But these criticisms are retrospective: the contemporary press and the buying public lapped up the Adams’ adventures and all of their imitators. We have to remember that the
appeal
of this genre lay in getting immersed (one might say “lost”) in the game for hours every evening, clawing your way forward towards ultimate triumph for weeks, or even months, on end. In a market full of arcade-like games that offered the convenient but shallow fun of a bag of potato chips, adventure games provided a rich and fulfilling meal for the imagination. As one lover of the genre put it:
Adventure is the product of imagination appealing to imagination. It is not just the puzzle, or the theme, or the nonplayer characters and their personalities. It is a verbal tapestry of interwoven phrases that whisk you away to magical kingdoms of the mind. The computer becomes a tool of reaching that conveys you where it will. You go along eagerly, breathlessly, awaiting what comes next.
[14]
The catch was that this delicacy was consumable only once: a solved adventure game was no more interesting to revisit than a solved crossword puzzle. So, they
had
to provide a challenge: no one wanted to pay $24.95 for a game on the way home from work and then breeze through it before bedtime. A game that was too fair risked being seen as a waste of money. Despite improvements in design in future years that would banish some of the worst practices of the genre, adventure games remained trapped on the horns of this dilemma.
[15]
Zork
The Adams’ “Adventure” line made them wealthy enough to build a faux-castle outside Orlando, and kicked off one of the most popular computer game genres of the 1980s. By late 1980, half-a-dozen other companies were putting out personal computer adventure games, from The Programmer’s Guild to Mad Hatter Software, as well as a version of Crowther-Woods
Adventure
put out by Microsoft. But they are overshadowed in the historical record by a competitor that subsequently dominated both sales of and critical attention to text adventure games. It began at MIT. In the spring of 1977, the Crowther-Woods
Adventure
arrived over the ARPANET at the PDP-10 at the Laboratory for Computer Science (LCS), and sank its claws into its employees. Impressed by what Crowther and Woods had done, but convinced that it could be made even better, a group of LCS staff set out in May 1977 to one-up
Adventure
.
[16]
Dave Lebling, who had already worked on several games (including
Maze
, the first first-person shooter game), kicked off the project. Lebling played
Dungeons and Dragons
in the same Cambridge D&D group as Crowther had (though not at the same time), and based the game’s combat system on the tabletop game. Then Marc Blank, Tim Anderson, and Bruce Daniels filled in most of the core structure of the program. They gave it the place-holder name of
Zork
(a term used as an inside-joke expletive at LCS, as in “why won’t this zorking thing work”), which ended up sticking permanently. The game reached its completed state in early 1979, by which point it greatly exceeded the original
Adventure
in scale, with 191 rooms and 211 items, occupying a full megabyte of memory.
[17]
Coded in a LISP-descendant called MUDDLE or MDL,
Zork
had an elegant design that encapsulated all the information about the possible interactions of each room and item in a single block of code and data, making it much easier to extend than
Adventure
. It also had a much richer text interface: both
Adventure
and
AdventureLand
accepted only “verb noun” commands, but
Zork
also allowed for conjunctions and prepositions (for example, “TAKE SWORD AND LANTERN FROM SACK”). Though aping the basic tropes of
Adventure
(a small overland area leading to an underground treasure-hunt), its more complex architecture allowed for a richer and more clever set of puzzles.
[18]
In the spring of 1979, several key staff members of LCS were poised to leave MIT. Their supervisor, Al Vezza, proposed to keep the band together by forming a company. Incorporated in June as Infocom, its new employees and shareholders included Lebling, Blank, and Anderson.
While the various partners mulled what exactly to do with their new business, Blank and a fellow LCS alum, Joel Berez, figured out how to cram
Zork
onto a microcomputer: they cut the number of rooms and items in the game in half and removed all the features of MDL not needed for the game, creating an interpreter for a simpler language they called
Zork
Implementation Language (ZIL). The resulting program occupied just seventy-seven kilobytes. To get this to fit into a microcomputer memory half that size, they had one last trick: a virtual memory system built into the interpreter, to swap chunks of the program on and off the disk as needed (typical floppy disk capacities at the time were over 100 kilobytes, and continued to grow). This meant that
Zork
could
only
run off of a floppy drive (whose rapidly spinning disk could sync to a new data location in a fraction of a second and supply data at fifteen kilobytes per second), never a cassette (which took a minute or more to fully unwind or rewind and supplied data at 300 bits per second). Or, to put it another way, the growing market prevalence of affordable floppy drives made larger personal computer adventure games feasible: it took about twenty minutes to load a Scott Adams adventure game from tape.
[19]
In late 1979, Blank and Berez convinced a reluctant Vezza (who wanted to get into business software) to make a microcomputer
Zork
Infocom’s first product. They initially published through Personal Software, co-owned by MIT’s own Dan Fylstra, which had just recently released
VisiCalc.
But after
VisiCalc’s
smash success, Fylstra no longer wanted to deal in mere games, so Infocom became its own publisher for subsequent games—including
Zork II
and
III
, built from the remaining unused material from the original PDP-10
Zork
.
Zork
became available in December 1980 and sold 10,000 units in 1981, mostly on the Apple II, despite an eye-watering price of $39.95, at a time when most games cost fifteen to twenty-five dollars. Then, astonishingly, in an industry typically characterized by ephemerality and obsolescence, sales continued to grow, year after year. They peaked in 1984 with over 150,000 copies sold. No doubt Zork’s self-referential humor, its restrained but clever marketing, and the high quality of the game itself (certainly the most well-crafted adventure game to date) all helped to sell the game.
[20]
The evolution of Zork’s marketing strategy, from the underground-zine feel of this 1981 ad under Personal Software… [
SoftSide
(April 1981)]
To this more austere and elegant pitch under Infocom in 1982. [Softline (September 1982)]
But many sales also must have arisen from the startling impression given by sitting down in a store (or at a friend’s house) to interact with this remarkable piece of software. Bob Liddil, reviewing
Zork
for
BYTE
magazine, pointed to the fluency of the parser as the element that first pulled him in:
I was eager to test Zork’s biggest selling point, intelligent input (ie: its ability to accept free-form instructions). I typed “OPEN THE BAG AND GET THE LUNCH,” in reference to a brown paper sack inside the house. The computer complied. There was water and food, so I typed “EAT THE LUNCH AND DRINK THE WATER,” to which the computer responded with gratitude for satisfying its hunger and thirst. I was hooked.
[21]
The game seemed to
understand
the user and to have an appropriate answer (or a witty retort) ready for everything they might try, from expletives (“FUCK > SUCH LANGUAGE IN A HIGH-CLASS ESTABLISHMENT LIKE THIS!”) to attempts to outwit the command system (“FIND HANDS > WITHIN SIX FEET OF YOUR HEAD, ASSUMING YOU HAVEN’T LEFT THAT SOMEWHERE.”), to questions about the imaginary world in which the game is played (“WHAT IS A ZORKMID? > THE ZORKMID IS THE UNIT OF CURRENCY OF THE GREAT UNDERGROUND EMPIRE.”) Along with
VisiCalc
and
WordStar
,
Zork
functioned not just as a piece of software that did something, but also as an existence proof (for the owner and for skeptical friends and family) that the microcomputer could be more than merely a toy version of a real computer.
[22]
Zork sales finally fell off in the mid-1980s, not because new text adventure games had surpassed it (Infocom continued to rule that particular roost, and
Zork
remained their flagship), but because of the steady improvement in personal computer graphics and the corresponding ascendancy of graphical games over textual ones.
Mystery House
The first graphical adventure game actually appeared several months before
Zork
: On-Line Systems’
Mystery House,
created by Ken and Roberta Williams. Unlike Scott Adams and most of the early personal computer hobbyists, Ken Williams got into computers for money, not love. Raised in greater Los Angeles in an unhappy home, he was a driven and impatient young man, and graduated high school at just sixteen. Roberta Heuer, a dreamy young woman whom Williams met through a double date, was impressed enough by his intelligence and ambition to give in to his insistence that they marry in 1972, while they were both still teenagers.
With the expectation of children to come, Ken abandoned his physics program at Cal Poly Pomona for a more immediately lucrative career in data processing. His father-in-law helped him get a loan to attend Control Data Corporation’s training school (the Control Data Institute), and from there he went on to a series of positions working on “big iron” batch-processing systems, constantly bouncing from job to job and home to home in search of better opportunities and a fatter pay check. He and Roberta wanted a bigger house and more creature comforts, but most of all they dreamed of an early retirement to a life out-of-doors, far from the city.
[23]
Ken and Roberta as newlyweds in 1972.
The Williamses took no notice of microcomputers until Ken and one of his co-workers, Robert Leff, concocted a way to make money off of them: selling fellow programmers a microcomputer implementation of FORTRAN, one of the most popular data processing languages. Not only could this venture make him and Roberta still richer (always a key consideration), it could free them to finally move away from the traffic and grind of Los Angeles and to live out their dream of rural life. Initially Ken planned to write FORTRAN for the TRS-80, but he redirected his energies to the more capable Apple II after he and Roberta got themselves one for a mutual Christmas present.
Meanwhile, Roberta had gotten hooked on adventure games. Ken had an electromechanical teletype terminal in their home for one of his consulting jobs, and connected it to a computer with the Crowther-Woods
Adventure
available to play. He showed the game off to Roberta. For Ken it was a curiosity, but for Roberta it became an obsession: she would not quit until she had beaten the game, weeks later. Ken brought home a borrowed TRS-80 and cassette tapes for the Scott Adams adventure series, and she flew through those, too. Soon she had an idea for a game of her own: instead of a treasure hunt, it would be a murder mystery; a mix of
Clue
and
Ten Little Indians
set in a creepy old Victorian house.
She insisted that Ken help her create it, and, after putting her off several times, he finally relented. Roberta wanted to add pictures of each room as a way to make this new game better than what came before, taking advantage of the Apple II’s 280×192 pixel high-resolution graphics mode. Because storing dozens of bitmapped images on a floppy disk would be impossible, Ken bought a VersaWriter accessory, a tablet with a movable arm that let Roberta capture the (x, y) position of each line endpoint in her pictures and store them into the computer. He wrote code to re-create the pictures from these coordinates by drawing the lines at runtime.
[24]
A circa 1983 ad for the Atari version of the VersaWriter.
Like Crowther and Adams, Ken split the data tables apart from the code that interpreted them. This allowed Roberta to work out all of the information about the rooms in the game, the items they contain, and the actions the player can perform, without needing to write any code. This division of labor between programming and design, quite novel to computer game software, came about from the accident of Roberta’s limited technical skills (she had worked briefly as a COBOL programmer, at Ken’s insistence) and Ken’s lack of interest in the game: he was still focused on launching Apple FORTRAN.
[25]
Then, while visiting local computer stores to pitch his computer language, Ken demoed an early version of Roberta’s game and everyone in the store gathered around to see it. The owners asked when they could have copies to sell. Ken realized he was backing the wrong horse: it was Roberta’s side project that would make them rich, not FORTRAN. Moreover, rather than give up a cut to a publisher like Programma International, they would take all the revenue for themselves, by publishing the game through the company name he had already registered for his never-to-be-released FORTRAN, On-Line Systems. On top of that, they could make even more money by distributing games into the stores they were already visiting on behalf of other software authors, like Scott Adams’ Florida-based Adventure International. Eventually unable to manage both publishing and distribution, he convinced his former colleague and erstwhile FORTRAN partner, Robert Leff, to buy out the distribution business, which grew into the industry behemoth Softsel.
[26]
After a month of development on nights and weekends (Ken’s pace was manic: in his memoir he writes that he always strove to be a “Triple-A” player, and his brother called him a “chronic workaholic”), the Williamses started selling
Mystery House
in May 1980. It required forty-eight kilobytes of memory, but with chip prices falling continuously, this was not so stringent a requirement as it had been even a year before.
The game’s simplistic “mystery” ends with the player gunning down the
de facto
murderer: the only living character to be found in a houseful of victims. The puzzles are among the more poorly clued and arbitrary to be found in a genre full of such frustrations. But for adventure-starved gamers of the time it was enchanting: not only could they witness the virtual world which they were navigating, it actually changed in response to their actions (picking up an object, like the taunting note that kicked off the murders, would remove it from the scene). Roberta’s drawings, crude and child-like as they certainly are, gave the game a visual appeal that drew in new buyers, and more than justified its price of $24.95.
[27]
The entry room to Mystery House. The green and purple colors are artifacts of how the Apple II high-res graphics mode works. [https://www.filfre.net/2011/10/mystery-house-part-2]
That summer Ken and Roberta were pulling in $30,000 a month and shopping for a house far from Los Angeles, in Coarsegold, California, nestled in the foothills of the Sierra Nevada near Yosemite National Park. On-Line Systems became Sierra On-Line. A few months later a second “High-Res Adventure” followed,
The Wizard and the Princess,
which added visually-stunning color to
Mystery House’s
line drawings: Ken used dithering techniques to make the six colors available in high-res mode appear like twenty-one. Roberta’s
King’s Quest
series, which I encountered on my Apple II, did not begin until 1984. It became Sierra’s best seller: by 1987, the first three installments of the series had sold a combined 500,000 copies, at least according to Sierra’s own marketing.
[28]
A scene from The Wizard and the Princess. Nothing like this had been seen on microcomputers before. The dark blue of the man’s shirt is composed of dithered blue and black, and the light blue of his pants from blue and white, etc. The odd colors at the borders between different-colored regions were once again artifacts of the Apple II high-res color system.
It stands out, in a story populated almost entirely with male characters, that two of the earliest adventure game designers (Alexis Adams and Roberta Williams) were women. The scope of Alexis’ contributions isn’t entirely clear, but Roberta was arguably the most successful adventure game designer of all time. There was an appeal in the adventure game genre, which had more in common with a mystery novel or a logic puzzle than an arcade game and typically eschewed violence (the summary execution of
Mystery House
’s killer notwithstanding), that attracted some women to an otherwise almost entirely masculine industry.
[29]
Ken and Roberta in 1989 after winning a Software Publisher’s Association Award for Best Role-Playing or Adventure Game for
King’s Quest IV.
[
Sierra News Magazine
(Summer 1990) 8]
In a world where multiple discovery and parallel invention are the norm, it is also remarkable that all of the games we have discussed (and indeed all the computer adventure games ever made) can trace their ancestry to the Crowther-Woods
Adventure
. In the meantime, though, many other computer game authors had drawn inspiration from
Dungeons and Dragons
, spawning an entirely different genre of computer games, more in tune with
D&D
’s wargaming roots.
[4]
Alexander Smith,
They Create Worlds: The Story of the People and Companies That Shaped the Video Game Industry, Vol. I: 1971-1982
(Boca Raton: CRC Press, 2020), 366-368; Jimmy Maher “Adventureland, Part 2,”
The Digital Antiquarian
(June 24, 2011) (
https://www.filfre.net/2011/06/adventureland-part-2
); David H. Ahl, “The First Decade of Personal Computing,”
Creative Computing
(November 1984), 30.
[5]
Smith,
They Create Worlds
, 266-267; David H. Ahl, ed.,
101 BASIC Computer Games: Microcomputer Edition
(New York: Workman Publishing, 1977).
[6]
The Wargaming Scribe, “The beginning of home computer gaming: the VDM-1 and the SOL-20” (August 16, 2023) (
https://zeitgame.net/archives/10450
); “Cromemco Dazzler Games” (Mountain View: Cromemco, 1977); Steve North, “Two Space Games (With Graphics!) For Your Home Computer,”
Creative Computing
(July/August 1977) 43-44; “Spacewar Available for the Cromemco Dazzler,”
Cromemco News
(January 1977).
[10]
Smith,
They Create Worlds
, 384-385; Jerz, “Somewhere Nearby is Colossal Cave,” 13; Jimmy Maher, “The Completed Adventure, Part 1”
The Digital Antiquarian
(June 2, 2011) (
https://www.filfre.net/2011/06/the-completed-adventure-part-1/
); Tracy Kidder,
The Soul of A New Machine
(New York: Little, Brown, 2000 [1981]), 86-89.
[12]
Robert Levering, Michael Katz, and Milton Moskowitz,
The Computer Entrepreneurs: Who’s Making It Big and How in America’s Upstart Industry
(New York: NAL Books, 1984), 114-118; Smith,
They Create Worlds
, 388.
[14]
Bob Liddil, “On the Road to Adventure,”
BYTE
(December 1980), 170.
[15]
The 1990 LucasArts adventure,
Loom
, for example, though it is an artistic masterpiece, was criticized by reviewers for being too short and too easy. Scorpia, “Scorpion’s View: ‘Conquests of Camelot’ and ‘Loom’,”
Computer Gaming World
(July-August 1990), 51, 63. Simply making the games larger, with more puzzles, was technically infeasible in the early years (we have already seen that
Adventureland
had to be much smaller than
Adventure
to fit on a microcomputer); later, as the costs of game production went up, it became financially infeasible. There is an expert dissection of the sins of one early adventure game in Jimmy Maher, “The Wizard and the Princess, Part 2,”
The Digital Antiquarian
(October 21, 2011) (
https://www.filfre.net/2011/10/the-wizard-and-the-princess-part-2
).
[16]
Bob Liddil, “On the Road to Adventure,”
BYTE
(December 1980), 162.
[17]
Jimmy Maher, “The Roots of Infocom,”
Digital Antiquarian
(January 1, 2012) (
https://www.filfre.net/2012/01/the-roots-of-infocom
); Jimmy Maher, “Zork on the PDP-10,”
Digital Antiquarian
(January 3, 2012) (
https://www.filfre.net/2012/01/zork-on-the-pdp-10
); Stephen Granade and Philip Jong, “David Lebling Interview,”
Brass Lantern
(undated, ca. 2000) (
http://brasslantern.org/community/interviews/lebling.html
); Nick Montfort,
Twisty Little Passages: An Approach to Interactive Fiction
(Cambridge: MIT Press, 2003), 86. Eric Roberts, Crowther and Lebling’s dungeon master, ran a variant of D&D he called
Mirkwood Tales
. Jon Peterson,
Playing at the World
(San Diego: Unreason Press, 2012), 617-618, 622.
[18]
P. David Lebling, “
Zork
and the Future of Computerized Fantasy Simulations,”
BYTE
(December 1980), 172-182.
[23]
Levy,
Hackers
, 293-297, 302-303; Ken Williams,
Not All Fairy Tales Have Happy Endings: The Rise and Fall of Sierra On-Line
(Ken Williams, 2020), 12-24, 22-24; Jimmy Maher, “Ken and Roberta,”
The Digital Antiquarian
(October 2, 2011) (
https://www.filfre.net/2011/10/ken-and-roberta
).
[24]
Williams,
Not All Fairy Tales
, 55-56, 66-68, 88; Levy,
Hackers
, 303-304; Ken Williams, “Introduction to The Roberta Williams Anthology” (1996) (
https://wiki.sierrahelp.com/index.php/Introduction_to_The_Roberta_Williams_Anthology
). The account in the previous paragraphs is interpolated from the above sources, which are partially contradictory. All differ about who got the Apple II and why. Levy never mentions the TRS-80 or any adventure games besides
Adventure
, and has Roberta finishing that game after the time the Apple II was purchased, implying she never played any other adventure games before deciding to write
Mystery House
: the timeline would simply be too tight. I believe this is wrong, and either an intentional elision or a false interpolation by Levy. It is unlikely that the Williamses would later entirely hallucinate having brought home and played the whole series of Scott Adams games. The accounts also differ on whose idea it was to add pictures to the game. I’m inclined to believe it was Roberta, to whom the game idea and all the passion for it belonged.
[26]
Levy,
Hackers
, 308-310; Williams,
Not All Fairy Tales
, 73; Ken Williams, “A Message From the President,” Sierra News Magazine (Summer 1990), 35.
[29]
In later years, Sierra On-Line would employ several more women as designers—Lori Cole (the
Quest for Glory
series), Christy Marx (
Conquests of Camelot
and
Conquests of the Longbow
), and Jane Jensen (the
Gabriel Knight
series), while Amy Briggs created
Plundered Hearts
at Infocom. It is hard to get any reliable numbers on the audience for adventure games: in 1989, Sierra estimated that 35-40% of the players of
King’s Quest IV
were women, which surely was well above average for a computer game. Patricia Cignarella, “Girls Just Want To Have Fun,”
Sierra News Magazine
(Autumn 1989), 25.
Ask HN: How do you handle release notes for multiple audiences?
For those of you who ship often, when you release updates, do you typically write one set of release notes, or do you end up rewriting them for different audiences?
For example:
• technical version for developers
• simplified version for end users
• something more high-level for stakeholders etc…
In my current position I’ve seen a plethora of different ways that teams, and even the company I work for, go about this.
What I’ve seen:
1. paste raw GitHub changelogs into customer emails (I’d highly recommend against this if you’re currently doing it)
2. manually rewrite the same update multiple times for each audience
3. skip release notes entirely because it’s too much work
So I guess my question is: How do you or your company currently go about handling more than one set of release notes, and do you feel like more than one set is needed?
Would love to hear what’s working (or not working) for you, and if you found any tools that help mitigate this issue.
VPN location claims don't match real traffic exits
In a large-scale analysis of 20 popular VPNs, IPinfo found that 17 of those VPNs exit traffic from
different countries than they claim
. Some claim 100+ countries, but many of them point to the same handful of physical data centers in the US or Europe.
That means the majority of VPN providers we analyzed don’t route your traffic via the countries they claim to, and they claim many more countries than they actually support.
Analyzing over 150,000 exit IPs across 137 possible exit countries, and comparing what providers claim to what IPinfo measures, shows that:
- 17 in 20 providers had traffic exiting in a different country.
- 38 countries were “virtual-only” in our dataset (claimed by at least one provider, but never observed as the actual traffic exit country for any provider we tested).
- We were only able to verify all provider-announced locations for 3 of the 20 providers.
Across ~150,000 VPN exit IPs tested, ProbeNet, our internet measurement platform, detected roughly 8,000 cases where widely used IP datasets placed the server in the wrong country — sometimes thousands of kilometers off.
This report walks through what we saw across VPN and IP data providers, provides a closer look at two particularly interesting countries, explores why measurement-based IP data matters if you care where your traffic really goes, and shares how we ran the investigation.
Which VPNs Matched Reality (And Which Didn’t)
Here is the overlap between the number of listed countries each VPN provider claims to offer versus the countries with real VPN traffic that we measured — lower percentages indicate providers whose claimed lists best match our data:
| Provider | Claimed Countries | % Virtual or Unmeasurable |
|---|---|---|
| IPVanish | 108 | 61 |
| CyberGhost | 100 | 57 |
| ExpressVPN | 105 | 57 |
| NordVPN | 126 | 53 |
| Private Internet Access | 91 | 52 |
| ProtonVPN | 110 | 51 |
| FastVPN | 112 | 49 |
| X-VPN | 89 | 43 |
| Surfshark | 100 | 41 |
| BelkaVPN | 63 | 41 |
| ZoogVPN | 76 | 34 |
| VyprVPN | 63 | 27 |
| FastestVPN | 47 | 26 |
| TrustZone | 39 | 18 |
| PrivateVPN | 62 | 13 |
| TunnelBear | 47 | 9 |
| VeePN | 84 | 6 |
| IVPN | 41 | 0 |
| Mullvad | 50 | 0 |
| Windscribe | 70 | 0 |
Note that we used only the most common and widely supported technologies in this research, to keep the comparison between providers as fair as possible while still giving us significant data to analyze; as a result, this is not the full coverage for each provider.
These are some of the most visible names in the market. They also tend to have very long country lists on their websites. Notably, three well-known providers had zero mismatches across all the countries we tested: Mullvad, IVPN, and Windscribe.
Country mismatches don’t automatically mean some providers offer “bad VPNs,” but they do mean that if you’re choosing a VPN because it claims “100+ countries,” you should know that a significant share of those flags may be labels, or virtual locations.
What “Virtual Locations” Really Mean
When a VPN lets you connect to, for example, “Bahamas” or “Somalia,” that doesn’t always mean traffic routes through there. In many cases, it’s somewhere entirely different, like Miami or London, but presented as if traffic is in the country you picked.
This setup is known as a virtual location:
The VPN app shows “Country X” (e.g. Bahamas).
The IP registry data also says “Country X” — because the provider self-declared it that way.
But the network measurements (latency and routing) show the traffic actually exits in “Country Y” — often thousands of kilometers away.
The problem? Without active network measurement, most IP datasets will rely on what the IP’s owner told the internet registry or published in WHOIS/geofeeds: a self-reported country tag. If that record is wrong or outdated, the mistake spreads everywhere. That’s where IPinfo’s ProbeNet comes in: by running live RTT tests from 1,200+ points of presence worldwide, we anchor each IP to its real-world location, not just its declared one.
Across the dataset, we found 97 countries where at least one VPN brand only ever appeared as virtual or unmeasurable in our data. In other words, for a noticeable slice of the world map, some “locations” in VPNs never show up as true exits in our measurements.
We also found 38 countries where every mention behaved this way: at least one VPN claimed them, but none ever produced a stable, measurable exit in that country in our sample.
You can think of these 38 as the “unmeasurable” countries in this study – places that exist in server lists, config files, and IP geofeeds, but never once appeared as the actual exit country in our measurements. They’re not randomly scattered – they cluster in specific parts of the map.
This doesn’t prove there is zero VPN infrastructure in those countries globally. It does show that, across the providers and locations we measured, the dominant pattern is to serve those locations from elsewhere. Here are two of the most interesting examples of how this looks at the IP level.
Case Studies: Two Countries That Only Exist on the Map
To make this concrete, let’s look at two countries where every provider in our dataset turned out to be virtual: Bahamas and Somalia.
Bahamas: All-Inclusive, Hosted in the US
In our measurements, five providers offered locations labeled as “Bahamas”: NordVPN, ExpressVPN, Private Internet Access, FastVPN, and IPVanish.
For all of them, measured traffic was in the United States, usually with sub-millisecond RTT to US probes.
| Provider | Claimed as | Measured exit country | RTT to nearest ProbeNet vantage point (evidence) | Example exit IP |
|---|---|---|---|---|
| NordVPN | 🇧🇸 Bahamas | 🇺🇸 United States | 0.27 ms from Miami, United States | 45.95.160.61 |
| ExpressVPN | 🇧🇸 Bahamas | 🇺🇸 United States | 0.15 ms from Miami, United States | 64.64.117.18 |
| Private Internet Access | 🇧🇸 Bahamas | 🇺🇸 United States | 0.42 ms from New York, United States | 95.181.238.101 |
| FastVPN | 🇧🇸 Bahamas | 🇺🇸 United States | 0.42 ms from Miami, United States | 108.171.106.198 |
| IPVanish | 🇧🇸 Bahamas | 🇺🇸 United States | 0.37 ms from Miami, United States | 108.171.106.207 |
Somalia: Mogadishu, via France and the UK
Somalia appears in our sample for only two providers: NordVPN and ProtonVPN.
Both go out of their way to label Mogadishu explicitly in their naming (e.g. “SO, Mogadishu”), but the measured RTTs below are exactly what you’d expect for traffic in Western Europe and completely inconsistent with traffic in East Africa: the actual traffic exits in Nice and London, not Somalia.
| Provider | Claimed as | Measured exit country | RTT to nearest probe (evidence) | Example exit IP |
|---|---|---|---|---|
| NordVPN | 🇸🇴 Somalia | 🇫🇷 France | 0.33 ms from Nice, France | 212.32.91.11 |
| ProtonVPN | 🇸🇴 Somalia | 🇬🇧 United Kingdom | 0.37 ms from London, UK | 74.118.126.204 |
When Legacy IP Providers Agree With the Wrong VPN Locations
So far, we’ve talked about VPN claims versus our measurements. But other IP data providers don’t run active RTT tests. They rely on self-declared IP data sources, and often assume that if an IP is tagged as “Country X,” it must actually be there.
In these cases, the legacy IP datasets typically “follow” the VPN provider’s story: if the VPN markets the endpoint as Country X, the legacy IP dataset also places it in Country X.
To quantify that, we looked at 736 VPN exits where ProbeNet’s measured country disagreed with one or more widely used legacy IP datasets.
We then compared the country IPinfo's ProbeNet measured (backed by RTT and routing) with the country reported by these other IP datasets and computed the distance between them. The gaps are large:
How Far Off Were the Other IP Datasets?
| Distance between legacy IP databases and IPinfo country | Share of disagreement cases |
|---|---|
| > 1,000 km | 83% |
| > 2,000 km | 63% |
| > 5,000 km | 28% |
| > 8,000 km | 12% |
The median error between ProbeNet and the legacy datasets was roughly 3,100 km.
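As a rough illustration of that distance comparison (this is only a sketch, not IPinfo’s actual pipeline), a great-circle calculation between the two reported locations is one simple way to get such figures; the coordinates below are approximate and the class and method names are just for this example.

```java
public class GeoDistanceSketch {
    // Great-circle (haversine) distance in kilometers between two lat/lon points.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double earthRadiusKm = 6371.0;
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * earthRadiusKm * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Roughly London vs Mauritius, in the spirit of the ProtonVPN example later in the post.
        System.out.printf("%.0f km%n", haversineKm(51.5, -0.13, -20.3, 57.6));
    }
}
```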
On the ProbeNet side, we have strong latency evidence that our measured country is the right one:
- The median minimum RTT to a probe in the measured country was 0.27 ms.
- About 90% of these locations had a sub-millisecond RTT from at least one probe.
That’s what you expect when traffic is genuinely in that country, not thousands of kilometers away.
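A back-of-the-envelope calculation shows why sub-millisecond RTTs are such strong evidence: assuming light in fiber covers roughly 200 km per millisecond (a common rule of thumb, not a number from this report), the round-trip time puts a hard upper bound on how far away the server can possibly be.

```java
public class RttDistanceBoundSketch {
    // Upper bound on one-way distance implied by a round-trip time, assuming
    // signals travel at most ~200 km per millisecond in fiber (about 2/3 of c).
    static double maxDistanceKm(double rttMs) {
        double fiberKmPerMs = 200.0;
        return (rttMs / 2) * fiberKmPerMs;
    }

    public static void main(String[] args) {
        System.out.println(maxDistanceKm(0.27) + " km"); // ~27 km: effectively local to the probe
        System.out.println(maxDistanceKm(80.0) + " km"); // ~8,000 km: could be a different continent
    }
}
```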
An IP Example You Can Test Yourself
This behavior is much more tangible if you can see it on a single IP.
Here's
one VPN exit IP
where ProbeNet places the server in the United Kingdom, backed by sub-millisecond RTT from local probes, while other widely used legacy IP datasets place the same IP in Mauritius, 9,691 kilometers away.
🇬🇧 United Kingdom vs 🇲🇺 Mauritius (ProtonVPN)
If you want to check this yourself, you can plug it into a public measurement tool like
https://ping.sx/
and run pings or traceroutes from different regions. Tools like this one provide a clear visual for where latency is lowest.
ProbeNet uses the same basic idea, but at a different scale: we maintain a network of 1,200+ points of presence (PoPs) around the world, so we can usually get even closer to the real physical location than public tools with smaller networks.
If you’d like to play with more real IPs (not necessarily VPNs) where ProbeNet and IPinfo get the country right and other datasets don’t, you can find a fuller set of examples on our IP geolocation
accuracy page
.
Why This Happens and How It Impacts Trust
It’s worth separating technical reasons from trust issues. There are technical reasons to use virtual or hubbed infrastructure:
Risk & regulation.
Hosting in certain countries can expose both the provider and users to local surveillance or seizure.
Infrastructure quality.
Some regions simply don’t have the same density of reliable data centers or high-capacity internet links, so running servers there is harder and riskier.
Performance & cost.
Serving “Bahamas” from Miami or “Cambodia” from Singapore can be cheaper, faster, and easier to maintain.
From this perspective, a virtual location can be a reasonable compromise: you get a regional IP and content unblocking without the downsides of hosting in a fragile environment.
Where It Becomes a Trust Problem
Three things change the picture:
Lack of disclosure.
Marking something clearly as “Virtual Bahamas (US-based)” is transparent. Listing “Bahamas” alongside “Germany” without any hint that one is virtual and the other is physical blurs the line between marketing and reality.
Scale of the mismatch.
It’s one thing to have a few virtual locations in hard-to-host places. It’s another when dozens of countries exist only as labels across your entire footprint, or when more than half of your tested locations are actually somewhere else.
Downstream reliance.
Journalists, activists, and NGOs may pick locations based on safety assumptions. Fraud systems, compliance workflows, and geo-restricted services may treat “Somalia” vs “France” as a meaningful difference. If both the VPN UI and the IP data say “Somalia” while the traffic is physically in France, everyone is making decisions on a false premise.
That last point leads directly into the IP data problem that we are focused on solving.
So How Much Should You Trust Your VPN?
If you’re a VPN user, here are some practical takeaways from this work:
Treat “100+ countries” as a marketing number, not a guarantee.
In our sample, 97 countries existed only as claims, not reality, across 17 providers.
Check how your provider talks about locations.
Do they clearly label “virtual” servers? Document where they’re actually hosted? Or do they quietly mix virtual and physical locations in one long list?
If you rely on IP data professionally, ask where it comes from.
A static “99.x% accurate worldwide” claim doesn’t tell you how an IP data provider handles fast-moving, high-stakes environments like VPN infrastructure.
Ultimately, this isn’t an argument against VPNs, or even against virtual locations. It’s an argument for honesty and evidence. If a VPN provider wants you to trust that map of flags, they should be willing, and able, to show that it matches the real network underneath.
How IPinfo Approaches IP Data Differently
Most legacy IP data providers rely on regional internet registry (RIR) allocation data and heuristics around routing and address blocks. These providers will often accept self-declared data like customer feedback, corrections, and geofeeds, without a clear way to verify them.
IPinfo takes a measurement-first approach:
Proprietary ProbeNet with 1,200+ points of presence
We maintain an internet measurement platform of PoPs in locations around the world.
Active measurements
For each visible IP on the internet, including both IPv4 and IPv6 addresses, we measure RTT from multiple probes.
Evidence-based geolocation
We combine these measurements with IPinfo’s other signals to assign a country (and more granular location) that’s grounded in how the internet actually behaves.
This measurement-first approach is unique in the IP data space. Once we realized how much inaccuracy came from self-declared data, we started investing heavily in research and building ProbeNet to use active measurements at scale. Our goal is to make IP data as evidence-based as possible, verified by observing how the internet actually behaves.
Our Methodology for This Report
We approached this VPN investigation the way a skeptical but well-equipped user would: start from the VPNs’ own claims, then test them.
Step 1: Collecting What Providers Say
For each of the
20 VPN providers
, we pulled together three kinds of data:
Marketing promises:
The “servers in X countries” claims and country lists from their websites. When a country was clearly listed there, we treated it as the locations they actively promote.
Configurations and locations lists:
Configurations from different protocols like OpenVPN or WireGuard were collected along with location information available on provider command-line tools, mobile applications, or APIs.
Unique provider–location entries:
We ended up with over 6,000,000 data points and a list of provider + location combinations we could actually try to connect to with multiple IPs each.
Step 2: Observing Where the Traffic Really Goes
Next, we used IPinfo infrastructure and ProbeNet to dial into those locations and watch what actually happens:
We connected to each VPN “location” and captured the exit IP addresses.
For each exit IP address, we used IPinfo + ProbeNet’s active measurements to determine a measured country, plus:
The nearest ProbeNet vantage point (e.g., US, Brazil, France)
The round-trip time (RTT) from that probe (often under 1 ms), which is a strong hint about physical proximity
Now we had two views for each location:
Expected/Claimed country
: What the VPN claims in its UI/configs/website
Measured country
: Where IPinfo + ProbeNet actually see the exit IP
Step 3: Comparing Claims vs Reality
For each location where a country was clearly specified, we asked a very simple question: Does the expected country match the measured country?
If yes, we counted it as a match. If not, it became a mismatch: a location where the app says one country, but the traffic exits somewhere else.
Acknowledgements, Limitations, and Constraints
We deliberately used a very narrow definition of “mismatch.” For a location to be counted, two things had to be true: the provider had to clearly claim a specific country (on their website, in their app, or in configs), and we had direct active measurements from ProbeNet for the exit IPs behind that location.
We ignored any locations where the marketing was ambiguous, where we hadn’t measured the exit directly, or where we only had weaker hints like hostname strings, registry data, or third-party IP databases. Those signals can be useful and true, but we wanted our numbers to be as hard-to-argue-with as possible.
The result is that the mismatch rates we show here are conservative. With a looser methodology that also leaned on those additional hints, the numbers would almost certainly be higher, not lower.
All Hollywood stars grow old and die except perhaps one - Dick Van Dyke - who turns 100 today. The real world Peter Pan who used to trip over the ottoman on The Dick Van Dyke Show is still standing. The man who impersonated a wind-up toy in Chitty Chitty Bang Bang hasn’t wound down just yet. He has outlived mentors, co-stars, romantic partners and several studios. He’s even outlived the jokes about his performance in Mary Poppins. These days his
mangled cockney accent
is regarded with more fondness than contempt. It’s seen as one of the great charms of the 1964 classic, along with the carousel chase or the cartoon dancing penguins.
Accent on the charm … Dick Van Dyke with Julie Andrews in Mary Poppins.
Photograph: Donaldson Collection/Getty Images
Charm is the magic ingredient of every popular entertainer and few have possessed it in such abundance as Van Dyke, the impoverished son of a travelling cookie salesman who dropped out of high school and educated himself at the movies. “His job in this life is to make a happier world,” his Broadway co-star Chita Rivera once said - and this may explain his stubborn refusal to quit, not while times are tough and he feels that audiences still need cheering up.
Naturally his workrate has now slowed, but in the past few years he has competed on the TV show
The Masked Singer,
starred in a
Coldplay video
and enthusiastically
stumped for Bernie Sanders
. Van Dyke simply couldn’t understand why America’s older citizens were resistant to Sanders’ democratic socialist domestic policies. He said, “I want to urge my generation to get out and vote for him, please.”
Too much energy … Dick Van Dyke in Chitty Chitty Bang Bang.
Photograph: Moviestore/REX/Shutterstock
As he nudges into triple figures, he has become a piece of living history: a walking, talking chronicle of US showbusiness itself. Van Dyke began his career performing for the troops in the second world war and proceeded to rub shoulders with the likes of Phil Silvers and Walt Disney. He had one foot in music-hall slapstick and the other in screwball comedy, and possibly splayed fingers in his midwestern hometown of Danville, Illinois.
In bridging these worlds, he perfected an outward-facing public image that was one part Stan Laurel to two parts Jimmy Stewart: a pratfalling clown who was decent and honest and smarter than he first appeared. And while he was already nearing 40 when The Dick Van Dyke Show and Mary Poppins made him an international star, the actor remained irrepressibly boyish. In 1968’s
Chitty Chitty Bang Bang
, he played Caractacus Potts, the madcap inventor who dreams up a flying car, while Lionel Jeffries - six months younger - played Potts’s addled and eccentric dad.
Van Dyke, by and large, has steered clear of dark films. He famously turned down the lead role in The Omen and insists that he mostly played a version of himself. “Wholesome,” he says. “An all-round good boy.” That’s true so far as it goes, although it’s probably only half the story, because Van Dyke’s interpretation conveniently sidesteps a 25-year struggle with alcoholism that spanned his professional heyday. Possibly it also glosses over the air of dancing mischief – even wildness – that animates his most feted, family-friendly performances.
Sparring mutual respect … Mary Tyler Moore and Dick Van Dyke in The Dick Van Dyke Show (1961).
Photograph: CBS Photo Archive/CBS/Getty Images
Or to put it more bluntly, Van Dyke may have been mainstream but he never once felt conservative, nor even cosy, exactly. He brought too much energy to the room. It was as though he’d just blown in from outside and wasn’t entirely housetrained. The
Dick Van Dyke
Show – an otherwise standard 60s family sitcom – is notable for the crackling sexual chemistry and sparring mutual respect which Van Dyke cooked up with his co-star, Mary Tyler Moore.
Caractacus Potts, for his part, is the ultimate rackety dad: loving and exciting and liable to forget every birthday and dentist appointment. And then there is Bert, the sweep from
Mary Poppins
who trips across London’s rooftops like an urbanised Puck of Pook’s Hill. The evidence suggests that Bert isn’t cockney at all. He’s a spooky nature spirit, antic and mercurial, who is gamely attempting to pass himself off as a local.
Campaigning for Bernie Sanders, 2020.
Photograph: Étienne Laurent/EPA
Van Dyke is 100 and therefore no longer looks like Peter Pan. He looks, if anything, the platonic ideal of old age, with laughter lines and a thick white beard, the weathered embodiment of a life well lived. In his later years, he has grown used to people asking him for health advice, to the point where he even sat down and listed it all in a book (
100 Rules for Living to 100
).
The man is too self-aware to present himself as a paragon of good living. Instead he credits his longevity to a sprinkle of everyday magic – a combination of good genes, solid friendships and a positive mental outlook. “My life has been a magnificent indulgence,” he says. “I’ve been able to do what I love and share it with the world.”
It’s an arrangement that has sustained him for a full century on the planet. It’s fuelled a career so rewarding and fun that it barely felt like work at all. Van Dyke started out as showbusiness’s gawky gatecrasher, a controlled explosion of elastic limbs and rubber-faced double-takes, before maturing by degrees into Hollywood’s twinkling Father Time. He is ancient but evergreen, feted and cherished. And he’s altogether as lucky as lucky can be.
Former Apple, Google designer: "Are we stuck with the same Desktop UX forever?" [video]
One day, I ran into SwissTable—the kind of design that makes you squint, grin, and immediately regret every naive linear-probing table you’ve ever shipped.
This post is the story of how I tried to bring that same “why is this so fast?” feeling into Java. It’s part deep dive, part engineering diary, and part cautionary tale about performance work.
1) The SwissTable project, explained the way it feels when you first understand it
SwissTable is an open-addressing hash table design that came out of Google’s work and was famously presented as a new C++ hash table approach (and later shipped in Abseil).
At a high level, it still does the usual hash-table thing: compute
hash(key)
, pick a starting slot, and probe until you find your key or an empty slot.
The twist is that SwissTable separates
metadata
(tiny “control bytes”) from the actual key/value storage, and it uses those control bytes to avoid expensive key comparisons most of the time. Instead of immediately touching a bunch of keys (which are cold in cache and often pointer-heavy), it first scans a compact array of control bytes that is dense, cache-friendly, and easy to compare in bulk.
To make probing cheap, SwissTable effectively splits the hash into two parts:
h1
and
h2
. Think of
h1
as the part that chooses where to start probing (which group to look at first), and
h2
as a tiny fingerprint stored in the control bytes to quickly rule slots in or out. It’s not a full hash—just enough bits to filter candidates before we pay the cost of touching real keys.
So on lookup, you compute a hash, derive
(h1, h2)
, jump to the group from
h1
, and compare
h2
against all control bytes in that group before you even look at any keys. That means most misses (and many hits) avoid touching key memory entirely until the metadata says “there’s a plausible candidate here.”
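As a rough sketch of what that split can look like (the exact bit allocation is an implementation choice rather than something SwissTable prescribes), here is a 7-bit fingerprint version, which matches the control-byte layout discussed later in this post:

```java
public final class HashSplitSketch {
    // h1: the remaining high bits choose which group to start probing at.
    static int h1(int hash) {
        return hash >>> 7;
    }

    // h2: a 7-bit fingerprint stored in the control byte; the top bit stays free
    // for special states such as EMPTY and DELETED.
    static byte h2(int hash) {
        return (byte) (hash & 0x7F);
    }

    public static void main(String[] args) {
        int hash = "example-key".hashCode();
        System.out.println("h1=" + h1(hash) + " h2=" + h2(hash));
    }
}
```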
Because probing stays cheap, SwissTable can tolerate higher load factors—up to about 87.5% (7/8) in implementations like Abseil’s
flat_hash_map
—without falling off a performance cliff, which directly improves memory efficiency.
The net effect is a design that is simultaneously faster (fewer cache misses, fewer key compares) and tighter (higher load factor, fewer side structures like overflow buckets).
2) Watching SwissTable become the “default vibe” in multiple languages (Go, Rust)
The first sign you’re looking at a generational design is when it stops being a cool library trick and starts showing up in standard libraries.
Starting in Rust 1.36.0,
std::collections::HashMap
switched to the SwissTable-based
hashbrown
implementation. It’s described as using
quadratic probing and SIMD lookup
, which is basically SwissTable territory in spirit and technique. That was my “okay, this isn’t niche” moment.
Then Go joined the party: Go 1.24 ships a new built-in map implementation based on the Swiss Table design, straight from the
Go team’s own blog post
. In their microbenchmarks, map operations are reported to be
up to 60% faster
than Go 1.23, and in full application benchmarks they saw about a
1.5% geometric-mean CPU time improvement
. And if you want a very practical “this matters in real systems” story,
Datadog
wrote about Go 1.24’s SwissTable-based maps and how the new layout and growth strategy can translate into serious memory improvements at scale.
At that point, SwissTable stopped feeling like “a clever C++ trick” and started feeling like
the modern baseline
. I couldn’t shake the thought: Rust did it, Go shipped it… so why not Java? And with modern CPUs, a strong JIT, and the Vector API finally within reach, it felt less like a technical impossibility and more like an itch I had to scratch.
That’s how I fell into the rabbit hole.
3) SwissTable’s secret sauce meets the Java Vector API
A big part of SwissTable’s speed comes from doing comparisons
wide
: checking many control bytes in one go instead of looping byte-by-byte and branching constantly. That’s exactly the kind of workload SIMD is great at: load a small block, compare against a broadcasted value, get a bitmask of matches, and only then branch into “slow path” key comparisons. In other words, SwissTable is not just “open addressing done well”—it’s “open addressing shaped to fit modern CPUs.”
Historically, doing this portably in Java was awkward: you either trusted auto-vectorization, used
Unsafe
, wrote JNI, or accepted the scalar loop. But the
Vector API
has been incubating specifically to let Java express vector computations that reliably compile down to good SIMD instructions on supported CPUs.
In Java 25, the Vector API is still incubating and lives in
jdk.incubator.vector
. The important part for me wasn’t “is it final?”—it was “is it usable enough to express the SwissTable control-byte scan cleanly?” Because if I can write “compare 16 bytes, produce a mask, act on set bits” in plain Java, the rest of SwissTable becomes mostly careful data layout and resizing logic. And once you see the control-byte scan as
the
hot path, you start designing everything else to make that scan cheap and predictable.
So yes: the Vector API was the permission slip I needed to try something I’d normally dismiss as “too low-level for Java.”
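To make that concrete, here is a minimal sketch of the control-byte scan with the incubating Vector API. It assumes 16-byte groups, needs --add-modules jdk.incubator.vector to compile and run, and is only an illustration of the idea rather than the actual SwissMap code:

```java
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;

public class CtrlScanSketch {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_128; // 16 control bytes per group

    // Bitmask with one set bit per control byte in the group that equals the h2 fingerprint.
    static long matchMask(byte[] ctrl, int groupBase, byte h2) {
        ByteVector group = ByteVector.fromArray(SPECIES, ctrl, groupBase); // load
        return group.eq(h2).toLong();                                      // compare -> mask
    }

    public static void main(String[] args) {
        byte[] ctrl = new byte[16];
        ctrl[3] = 0x2A;
        ctrl[9] = 0x2A;
        long mask = matchMask(ctrl, 0, (byte) 0x2A);
        while (mask != 0) {                                        // iterate matches
            System.out.println("candidate slot " + Long.numberOfTrailingZeros(mask));
            mask &= mask - 1;                                      // clear lowest set bit
        }
    }
}
```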
4) Building the first SwissMap prototype
I began with the core SwissTable separation: a compact
control array
plus separate
key/value storage
. The control bytes are the main character—if those stay hot in cache and the scan stays branch-light, the table feels fast even before micro-optimizations.
I used the familiar
h1/h2
split idea:
h1
selects the initial group, while
h2
is the small fingerprint stored in the control byte to filter candidates. Lookup became a two-stage pipeline: (1) vector-scan the control bytes for
h2
matches, (2) for each match, compare the actual key to confirm. Insertion reused the same scan, but with an extra “find first empty slot” path once we know the key doesn’t already exist.
Where Java started pushing back was
layout realism
.
In C++ you can pack keys/values tightly; in Java, object references mean the “key array” is still an array of pointers, and touching keys can still be a cache-miss parade. So the design goal became:
touch keys as late as possible
, and when you must touch them, touch as few as possible—again, the SwissTable worldview.
Deletion required tombstones (a “deleted but not empty” marker) so probing doesn’t break, but tombstones also accumulate and can quietly degrade performance if you never clean them up.
Resizing was its own mini-project: doing a full rehash is expensive, but clever growth strategies (like Go’s use of table splitting/extendible hashing) show how far you can take this if you’re willing to complicate the design.
I also had to treat the Vector API as an optimization tool, not a magic wand. Vector code is sensitive to how you load bytes, how you handle tails, and whether the JIT can keep the loop structure stable. I ended up writing the control-byte scan as a very explicit “
load
→
compare
→
mask
→
iterate matches
” loop.
At this stage, the prototype already
worked
, but it wasn’t yet “SwissTable fast”—it was “promising, and now the real work begins.”
5) The pieces of SwissMap that actually mattered
Here’s what survived the usual round of “this feels clever but isn’t fast” refactors.
Control bytes & layout
With non-primitive keys, the real cost is rarely “a few extra byte reads” — it’s pointer chasing. Even one
equals()
can walk cold objects and pay cache-miss latency. So SwissMap treats the ctrl array as the first line of defense: scan a tight, cache-friendly byte array to narrow the search to a handful of plausible slots
before
touching any keys/values.
This matters even more in Java because “keys/values” usually means arrays of references. On 64-bit JVMs,
compressed oops
(often enabled up to ~32GB depending on alignment/JVM flags) packs references into 32 bits, making the reference arrays denser. When compressed oops is off, references widen to 64 bits and the same number of key touches can spill across more cache lines.
Either way, the ctrl array does most of the work: most misses die in metadata. Compressed oops just makes the unavoidable key touches cheaper when they do happen.
Load factor
In classic open addressing, pushing load factor up usually means the average probe chain gets longer fast — more branches, more random memory touches, and a steep cliff in miss cost. That’s why many general-purpose hash maps pick conservative defaults. Java’s
HashMap
, for example, defaults to a 0.75 load factor to keep miss costs from ballooning as the table fills.
SwissTable flips the cost model: probing is dominated by scanning the ctrl bytes first, which are dense, cache-friendly, and cheap to compare in bulk. That means “one more probe group” is often just one more ctrl-vector load + compare,
not
a bunch of key
equals()
calls. With SwissTable-style probing, the table can run denser without falling off a cliff. Abseil’s SwissTable-family maps are well known for targeting a ~7/8 (0.875) maximum load factor; even when the table is crowded, most probes are still “just metadata work.”
That trade-off is exactly what I wanted in Java too: higher load factor → fewer slots → smaller key/value arrays → fewer cache lines touched per operation, as long as the ctrl scan stays the fast path.
Sentinel padding
SIMD wants fixed-width loads: 16 or 32 control bytes at a time. The annoying part is the tail — the last group near the end of the ctrl array. In native code you might “over-read” a few bytes and rely on adjacent memory being harmless. In Java you don’t get that luxury: out-of-bounds is a hard stop.
Without padding, the probe loop picks up tail handling: extra bounds checks, masked loads, or end-of-array branches — exactly the kind of bookkeeping you don’t want in the hottest path. A small sentinel padding region at the end of
the
array lets every probe issue the same vector load, keeping the loop predictable and JIT-friendly.
H1/H2 split
Split the hash into
h1
(which selects the starting group) and
h2
(a small fingerprint stored per slot in the ctrl byte).
h1
drives the probe sequence (usually via a power-of-two mask), while
h2
is a cheap first-stage filter: SIMD-compare
h2
against an entire group of control bytes and only touch keys for the matching lanes.
SwissMap uses 7 bits for
h2
, leaving the remaining ctrl-byte values for special states like
EMPTY
and
DELETED
. That’s the neat trick: one byte answers both:
“Is this slot full/empty/deleted?”
“Is this slot even worth a key compare?”
Most lookups reject non-matches in the control plane. And if a probed group contains an
EMPTY
, that’s a definitive stop signal: the probe chain was never continued past that point, so the key can’t exist “later.”
Reusing the loaded control vector
A
ByteVector
load isn’t free — it’s a real SIMD-width memory load of control bytes. On my test box, that load alone was ~6ns per probed group. In a hash table where a
get()
might be only a few dozen nanoseconds, that’s a meaningful tax.
So SwissMap tries hard to load the ctrl vector exactly once per group and reuse it:
use the same loaded vector for the
h2
equality mask (candidate lanes)
and again for the
EMPTY
/
DELETED
masks (stop/continue decisions)
No extra passes, no “just load it again,” no duplicate work.
Tombstones
Deletion in open addressing is where correctness bites: if you mark a removed slot as
EMPTY
, you can break the probe chain. Keys inserted later in that chain would become “invisible” because lookups stop at the first empty.
Tombstones solve that by marking the slot as
DELETED
(“deleted but not empty”), so lookups keep probing past it.
Reusing tombstones on put
On
put
, tombstones are not just a correctness hack — they’re also reusable space. The common pattern is:
during probing, remember the first
DELETED
slot you see
keep probing until you either find the key (update) or hit an
EMPTY
(definitive miss)
if the key wasn’t found, insert into the remembered tombstone rather than the later empty slot
That tends to keep probe chains from getting longer over time, reduces resize pressure, and prevents workloads with lots of remove/put cycles from slowly poisoning performance.
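Here is a toy, slot-at-a-time version of that rule (plain linear probing, no groups, no SIMD, and made-up control constants), just to show where the remembered DELETED slot gets used:

```java
import java.util.Arrays;
import java.util.Objects;

public class TombstoneReuseSketch {
    static final byte EMPTY = -128, DELETED = -2, FULL = 0; // toy control values
    byte[] ctrl = new byte[8];
    Object[] keys = new Object[8];
    Object[] values = new Object[8];

    TombstoneReuseSketch() { Arrays.fill(ctrl, EMPTY); }

    void put(Object key, Object value) {
        int mask = ctrl.length - 1;
        int idx = key.hashCode() & mask;
        int firstDeleted = -1;
        for (int probes = 0; probes <= mask; probes++, idx = (idx + 1) & mask) {
            if (ctrl[idx] == DELETED) {
                if (firstDeleted < 0) firstDeleted = idx;            // remember, keep probing
            } else if (ctrl[idx] == EMPTY) {
                int target = firstDeleted >= 0 ? firstDeleted : idx; // prefer the tombstone
                ctrl[target] = FULL; keys[target] = key; values[target] = value;
                return;
            } else if (Objects.equals(keys[idx], key)) {
                values[idx] = value;                                 // existing key: update
                return;
            }
        }
    }

    public static void main(String[] args) {
        TombstoneReuseSketch t = new TombstoneReuseSketch();
        t.put("a", 1);
        int slot = Arrays.asList(t.keys).indexOf("a");
        t.ctrl[slot] = DELETED;                                      // simulate remove("a")
        t.keys[slot] = null;
        t.put("a", 2);                                               // reinsert lands in the tombstone
        System.out.println("reused slot " + slot + " -> " + t.values[slot]);
    }
}
```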
When tombstones force a same-capacity rehash
Tombstones preserve correctness, but they dilute the strongest early-exit signal:
EMPTY
. A table with lots of
DELETED
tends to probe farther on misses and inserts — more ctrl scans, more vector loads, and more chances to touch cold keys.
So SwissMap tracks tombstones and triggers a
same-capacity rehash
when they cross a threshold (as a fraction of capacity or relative to live entries). This rebuilds the ctrl array, turning
DELETED
back into
EMPTY
and restoring short probe chains — basically compaction without changing logical contents.
A resize rehash without redundant checks
Resizing forces a rehash because
h1
depends on capacity. The obvious approach is “iterate old entries and call
put
into the new table,” but that pays for work you don’t need: duplicate checks, extra branching, and unnecessary key equality calls.
The faster path treats resize as a pure
move
:
allocate fresh ctrl/key/value arrays
reset counters
scan the old table once, and for each
FULL
slot:
recompute
(h1, h2)
for the new capacity
insert into the first available slot found by the same ctrl-byte probe loop
without
checking “does this key already exist?” (it can’t: you’re moving unique keys)
This makes resizing a predictable linear pass over memory rather than a branchy series of full
put()
operations.
Iteration is another place where “simple” becomes surprisingly expensive. You can scan linearly and yield
FULL
slots, but many designs want a stable-ish visit pattern without allocating a separate dense list. And some reinsertion/rehash interactions can even go accidentally quadratic (see the
Rust iteration write-up
).
SwissMap avoids extra buffers by iterating with a modular stepping permutation:
pick a
start
and an odd
step
(with power-of-two capacity, any odd step is coprime), then visit indices via repeated
idx = (idx + step) & mask
. This hits every slot exactly once, spreads accesses across the table, and keeps iteration as a tight loop over the same ctrl-byte state machine used elsewhere.
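A tiny standalone sketch shows why this works: with a power-of-two capacity, any odd step is coprime with the capacity, so the walk visits every index exactly once (the specific start and step below are arbitrary):

```java
public class StepPermutationSketch {
    public static void main(String[] args) {
        int capacity = 16;               // power of two
        int mask = capacity - 1;
        int start = 5;                   // arbitrary starting index
        int step = 7;                    // any odd step works
        int idx = start;
        for (int visited = 0; visited < capacity; visited++) {
            System.out.print(idx + " "); // prints each of the 16 indices exactly once
            idx = (idx + step) & mask;
        }
        System.out.println();
    }
}
```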
Here’s a trimmed version of the lookup to show how the probe loop hangs together around the control bytes:
protected int findIndex(Object key) {
    if (size == 0) return -1;
    int h = hash(key);
    int h1 = h1(h);
    byte h2 = h2(h);
    int nGroups = numGroups();
    int visitedGroups = 0;
    int mask = nGroups - 1;
    int g = h1 & mask; // optimized modulo operation (same as h1 % nGroups)
    for (;;) {
        int base = g * DEFAULT_GROUP_SIZE;
        ByteVector v = loadCtrlVector(base);
        long eqMask = v.eq(h2).toLong();
        while (eqMask != 0) {
            int bit = Long.numberOfTrailingZeros(eqMask);
            int idx = base + bit;
            if (Objects.equals(keys[idx], key)) {
                // found
                return idx;
            }
            eqMask &= eqMask - 1; // clear LSB
        }
        long emptyMask = v.eq(EMPTY).toLong(); // reuse loaded vector
        if (emptyMask != 0) {
            // any empty in a probed group is a definitive miss in SwissTable-style probing
            return -1;
        }
        if (++visitedGroups >= nGroups) {
            // guard against infinite probe when table is full of tombstones
            return -1;
        }
        g = (g + 1) & mask;
    }
}
6) Benchmarks
I didn’t want to cherry-pick numbers that only look good on synthetic cases.
So I used a simple, repeatable JMH setup that stresses high-load probing and pointer-heavy keys—the exact situation SwissTable-style designs are meant to handle.
All benchmarks were run on Windows 11 (x64) with Eclipse Temurin JDK 21.0.9, on an AMD Ryzen 5 5600 (6C/12T).
For context, I compared against
HashMap
, fastutil’s
Object2ObjectOpenHashMap
, and Eclipse Collections’
UnifiedMap
.
The headline result
At high load factors SwissMap keeps competitive throughput against other open-addressing tables and stays close to JDK HashMap performance. In some benchmarks it also comes out ahead by a large margin.
(Benchmark charts: put hit, put miss, get hit, get miss.)
On memory, the flat layout (no buckets/overflow nodes) plus a 0.875 (7/8) max load factor translated to a noticeably smaller retained heap in small-payload scenarios—over 50% less than
HashMap
in this project’s measurements.
Caveat
These numbers are pre-release; the Vector API is still incubating, and the table is tuned for high-load, reference-key workloads. Expect different results with primitive-specialized maps or low-load-factor configurations.
Quick heads-up
You might notice a SWAR-style SwissTable variant appearing out of nowhere in the benchmark section. That version is part of a follow-up round of tuning: same SwissTable control-byte workflow, but implemented with SWAR to reduce overhead and avoid leaning on the incubating Vector API.
If SWAR is new to you, think of it as “SIMD within a register”: instead of using vector lanes, we pack multiple control bytes into a single 64-bit word and do the same kind of byte-wise comparisons with plain scalar instructions. The end result is a similar fast-path idea, just expressed in a more portable (and JDK-version-friendly) way.
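For a feel of what that looks like, here is a generic SWAR byte-match sketch based on the classic bit-twiddling zero-byte trick; it is not taken from the follow-up post, and like most SWAR matchers it can produce rare false positives across lanes, which a hash table tolerates because every candidate is confirmed with a real key comparison anyway.

```java
public class SwarMatchSketch {
    // Repeat one byte across all eight lanes of a long.
    static long broadcast(byte b) {
        return (b & 0xFFL) * 0x0101010101010101L;
    }

    // High bit set in every byte lane of 'group' that equals h2 (with the usual
    // SWAR caveat of rare false positives, filtered out by the key compare).
    static long matchMask(long group, byte h2) {
        long x = group ^ broadcast(h2);                 // matching lanes become 0x00
        return (x - 0x0101010101010101L) & ~x & 0x8080808080808080L;
    }

    public static void main(String[] args) {
        long group = 0x2A00002A00000000L;               // lanes 4 and 7 hold 0x2A
        long mask = matchMask(group, (byte) 0x2A);
        while (mask != 0) {
            System.out.println("candidate lane " + (Long.numberOfTrailingZeros(mask) >>> 3));
            mask &= mask - 1;                           // clear lowest set bit
        }
    }
}
```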
I didn’t want this post to turn into three posts, so I’m saving the full “how/why” (SWAR included) for the next write-up — stay tuned.
P.S. If you want the code
This post is basically the narrative version of an experiment I’m building in public:
HashSmith
, a small collection of fast, memory-efficient hash tables for the JVM.
It includes
SwissMap
(SwissTable-inspired, SIMD-assisted probing via the incubating Vector API), plus a
SwissSet
variant to compare trade-offs side by side.
It’s explicitly
experimental
(not production-ready), but it comes with JMH benchmarks and docs so you can reproduce the numbers and poke at the implementation details.
If you want to run the benchmarks, sanity-check an edge case, or suggest a better probe/rehash strategy, I’d love issues/PRs.
P.P.S. A great talk to watch
If you want the “straight from the source” version, Matt Kulukundis and other Google engineers’ CppCon talk on SwissTable is genuinely excellent — clear, practical, and packed with the kind of details that make the design click:
CppCon talk
.
Analysis finds anytime electricity from solar available as battery costs plummet
Ember’s report outlines how falling battery capital expenditures and improved performance metrics have lowered the levelized cost of storage, making dispatchable solar a competitive, anytime electricity option globally.
A report from energy think tank Ember details how cost reductions in battery storage technology are enabling dispatchable solar power to compete with conventional power sources.
Ember’s assessment draws on data from recent auctions across international markets, including Italy, Saudi Arabia, and India, supplemented by expert interviews conducted in October 2025.
This research indicates the industry is moving into a new environment where scaling of manufacturing capacity and competition have pushed costs down. The cost of core BESS equipment fell by 40% in 2024 compared with 2023, according to BloombergNEF’s global benchmark, reaching a record low of $165 per kWh.
Ember’s October 2025 data suggests a further large fall in 2025 is on track. Over the last 10 years, installed costs have fallen by 20% per year on average, while deployment has increased by around 80% per year.
According to the findings, the all-in capital expenditure for building a large, long-duration utility-scale battery energy storage system project in global markets outside of the U.S. and China is now approximately $125 per kWh. This figure reflects project pricing, comprising $75 per kWh for core equipment sourced from China, including battery enclosures, the power conversion system (PCS), and energy management system (EMS) and $50 per kWh for local installation, engineering, and grid connection activities.
These capital costs translate into a levelized cost of storage (LCOS) of $65 per MWh. This LCOS reflects the cost of shifting electricity to a different time period.
Image: Ember
Ember said this reduction in LCOS is driven by equipment costs and improved performance metrics, such as a 20-year design life for LFP technology, 90% efficiency, and lower project financing costs due to de-risked revenue models. Longer lifetimes, higher efficiency, and lower project risks reduce the LCOS by 35% even before accounting for the falling equipment prices, said the report.
The core implication of this analysis is the economic viability of dispatchable solar.
Storing 50% of a day’s solar output to meet night-time demand adds $33 per MWh to the generation cost. Using the 2024 global average solar price of $43 per MWh, the total cost for dispatchable solar is calculated at $76 per MWh.
This positions dispatchable solar as a cost-effective alternative to new gas power plants, particularly in regions reliant on imported LNG.
For the U.S., core equipment costs can reach $100 per kWh or higher in markets with higher tariffs or stricter standards. With the U.S. market facing various tariffs on Chinese imports and domestic content requirements via the Inflation Reduction Act (IRA), the $125 per kWh BESS project capex estimate is not directly applicable to the U.S. market.
The total equipment costs are 10% to 15% cheaper for four-hour projects, a key project duration in the U.S., as some components are sized to power rather than energy. However, even with cost variations, the U.S. is the second biggest BESS market globally, behind China, and saw record growth in Q1 2025 across all segments.
In 2024, Texas, California, Arizona, and Nevada all saw significant utility-scale battery growth, with the U.S. adding 10 GW of utility-scale batteries nationally, an 80% increase over the previous year. This growth is accelerating the integration of solar into the grid.
Ember’s conclusion is that solar has evolved beyond daytime electricity; coupled with storage, it becomes dispatchable, anytime power, positioned to meet a substantial portion of the world’s future energy needs.
Using nvi as a Minimal and Fast Text Editor
===========================================
Overview:
---------
This document introduces the nvi text editor, a clean and efficient
implementation of the original Berkeley vi written by Keith Bostic.
nvi is designed for speed, low memory usage, and predictable behavior.
It is especially suited for system administration, SSH sessions,
editing configuration files, and handling very large text files such
as logs or JSON dumps on Unix-like systems including Slackware.
Unlike modern editors, nvi avoids complexity (plugins, scripting,
syntax highlighting) and focuses on reliability and efficiency.
1. Why Use nvi?
---------------
nvi provides the traditional vi experience but with several practical
advantages:
- extremely low memory usage
- fast access to distant lines
- excellent performance on multi-gigabyte files
- simple configuration and predictable commands
- small codebase aligned with the Unix philosophy
These characteristics make nvi ideal for minimal environments or users
who prefer a distraction-free workflow.
2. Internal Architecture
------------------------
nvi embeds its own modified, partial implementation of Berkeley DB 1.85.
This embedded database layer is used exclusively for storing and managing
the text being edited. It is not a full Berkeley DB engine.
Key components:
* Partial Berkeley DB:
The embedded DB code includes only the subsystems required by vi:
recno, btree, hash, and mpool. Features such as transactions,
multi-user locking, logging, or general database APIs are not present.
* Recno (Record Number Interface):
Each line of the file is stored as a numbered record. This provides
fast random access to line N without scanning the file sequentially,
even in files containing millions of lines.
* B-Tree:
Used for indexing and efficient lookup. The implementation inside nvi
is trimmed down specifically for the editor's needs.
* Mpool (memory pool):
Manages cached database pages in memory. Only the necessary pages are
loaded, allowing nvi to operate smoothly without loading the entire
file into RAM.
* Not stored as .db files:
Although based on Berkeley DB, the internal structures are kept purely
in memory through mpool. nvi does not write these internal DB pages to
disk as a persistent database.
* Recovery files:
Crash recovery is handled separately. nvi stores recovery metadata in:
/var/tmp/vi.recover/
Files named recover. and vi. allow interrupted editing sessions to be restored with:
nvi -r
These recovery files are independent from the in-memory recno/mpool
structures used for normal editing.
This architecture explains nvi's ability to handle extremely large files
with low memory usage.
3. Minimal Configuration (.exrc)
-------------------------------
nvi uses a simple configuration file named ".exrc". A minimal setup might be:
set showmode
set showmatch
set ruler
set shiftwidth=2
set tabstop=2
No plugins, runtime files, or external tools are required.
4. Working Without Syntax Highlighting
--------------------------------------
nvi does not include syntax highlighting. This design choice promotes:
- reduced visual noise
- clearer focus on logic and structure
- consistent display across all terminals
- improved readability in minimal environments
For scripting, logs, configuration files, and remote administration,
highlighting is optional rather than necessary.
5. Undo, Marks, and Window Splits
---------------------------------
Undo and redo:
u undo last change
. repeat undo
uu redo
Marks offer precise navigation:
mx set mark x
'x jump to line of mark x
`x jump to exact cursor position
Window splits:
:vs vertical split
:E horizontal split
6. Recommended Usage with tmux
------------------------------
Since nvi does not implement tabs, tmux is the ideal companion:
- multiple panes
- persistent sessions
- fullscreen focus
- fast switching between tasks
This combination keeps the editor itself minimal while still supporting
multi-file workflows.
7. nvi on Slackware
-------------------
nvi was added to Slackware -current on:
Mon Jan 13 00:11:55 UTC 2020
This was the first appearance of nvi in the distribution, and it later became
part of the Slackware 15.0 stable release (February 2022). Slackware 14.2 and
earlier shipped with Elvis as /usr/bin/vi.
On the same day nvi was added, Elvis was rebuilt to remove its /usr/bin/vi and
/usr/bin/ex symlinks. From that point onward, nvi provides those symlinks
**only if** they are not already supplied by another editor. This preserves
Slackware’s long-standing policy that users should be able to choose their
preferred vi implementation.
Reasons for adoption:
- nvi provides UTF-8 support (Elvis lacked this capability)
- cleaner and more predictable behavior in modern terminals
- smaller and more maintainable codebase
- strong alignment with Slackware’s minimal and traditional design
Slackware applies a number of maintenance patches to nvi, addressing:
- wide-character and UTF-8 behavior
- build system portability
- memory safety in recno/mpool
- recovery stability
- accumulated bug fixes from BSD and Debian
These patches modernize nvi without altering its classic vi behavior.
Users may adjust the default vi/ex implementation using:
pkgtool -> Setup -> vi-ex
8. Essential Commands
---------------------
A concise nvi quick reference is available at:
https://4c6e.xyz/nvi.html
9. When nvi Is a Good Choice
----------------------------
nvi is ideal for:
- servers and SSH environments
- editing configuration files in /etc
- minimal or low-resource systems
- large log files or JSON datasets
- users who prefer classic vi behavior
- workflows requiring speed and simplicity
10. Conclusion
--------------
nvi follows the original Unix philosophy: simple, fast, and reliable.
Its embedded Berkeley DB layer enables excellent performance with
large files, while its minimal design keeps editing predictable and
distraction-free.
For Slackware and other Unix users seeking a lightweight, efficient
editor, nvi remains an outstanding choice.
Appendix: Minimal Comparison (nvi vs vim vs elvis)
--------------------------------------------------
nvi:
- fastest and lightest
- best for huge files (logs/JSON)
- classic vi behavior
- ideal for servers and SSH
- no syntax, no plugins, no extras
vim:
- most features and plugins
- best for programming and IDE-like workflow
- syntax highlighting, scripting, LSP
- heavier and slower with large files
elvis:
- small, portable vi clone
- unique display modes
- good for rescue/embedded systems
- limited UTF-8 support
Upstream and historical references
----------------------------------
nvi originates from the Berkeley vi developed by Keith Bostic. Historical
documentation and background can be found on the Berkeley vi home page
maintained by the original author:
https://sites.google.com/a/bostic.com/keithbostic/the-berkeley-vi-editor-home-page
------------------------------------------------------------------
Last Modified: 2025-12-13 11:37:32 UTC
Editors should have an opt-in for less assistance (2024)
I'd like text editors to be worse. Specifically, I'd like their
default
behaviour to be as close as possible to the median text input box you'd find in any piece of software, like the humble HTML
<textarea>
. More realistically, I'd like a configuration preset that lets me opt in to the same, without having to hunt for a thousand individual setting tweaks. This opt-in should apply as globally as possible, perhaps as an environment variable.
My rationale is simple: I hate context switching. I want my input sequences to always work, no matter what software I'm using. Trying to apply deeply ingrained muscle memory in the wrong context and having it not work can be extremely frustrating.
As you can maybe guess from that, modal text editors like
vim
don't work for me. Consider them to be outside the scope of this rant—I only care about text editors (and more broadly, text inputs) that are superficially "normal."
Let's take a concrete example: typing a quoted string. My go-to input sequence is to type
"
"
[left]
, i.e., typing two quotes at once and then moving the cursor back into the middle of the two, ready to type the string itself:
"|"
. That's three keystrokes, but in my brain it's a single action, and it works just about anywhere I might want to use it.
That is, everywhere
except
editors that try to be too smart. There are two common behaviours, both annoying in their own way:
Pressing
"
once produces
"|"
, and finishing the input sequence results in
"|""
.
Pressing
"
once produces
"|"
, pressing it again produces
""|
, and pressing left results in
"|"
.
The first behaviour is obviously undesirable because it produced an unexpected result.
The second one produced the correct result, but I hate it even more! I hate it because it added a whole extra layer of "too smart" behaviour to try to counteract the first. That is, pressing
"
with the cursor already in front of a
"
character will not insert an additional
"
, but merely move the cursor to the other side of it. This works out alright in this
particular
scenario, but it's extremely confusing in any other situation where you happen to want to type a
"
in front of an existing one. This might sound far-fetched, but it's the default behaviour of Firefox's dev console, and it's something I stumble over regularly.
I'm sure there's a setting I could tweak for this, but my problem is not Firefox; my problem is that these annoying behaviours are
everywhere
, and not just in relation to quoted strings.
When there's a problem "everywhere," it might seem like the best approach is to suck it up and get used to it. But I can't, because the problem isn't a specific behaviour; the problem is the
divergence
of behaviours. I can't possibly get used to all of them!
I'm going to name this annoyance "
auto-input
". Anything that inputs characters I didn't type for myself (or otherwise opt in to) is on my naughty list. This does not include input
suggestions
. For example, when typing a symbol name, an editor might offer a drop-down list of suggested completions that I can opt
in
to using the tab key. That's great because if I ignore the suggestion and continue typing obliviously, everything still works.
Text editors are encouraged to be smart, but that smartness shouldn't degrade the basics.
The one exception to my auto-input hatred is auto-indentation. When I press enter, I expect the new line to be pre-filled with the same leading whitespace as the line above. Anything that tries to be smarter than that, especially those that try to be language-aware, will likely get on my nerves too. And, if the indentation is spaces, pressing backspace should delete exactly one of them at a time. This is a matter of personal preference, but so is everything else in this article.
I'm too lazy to do any serious work towards a solution here, beyond writing this rant, but I would like to float the idea of a
NO_AUTOINPUT
environment variable. If the variable is present on first-run of an application, it should set the appropriate settings defaults to minimise auto-input behaviours. After that, the settings can be tweaked per user preference.
For the sake of homogeneity, auto-indentation should be disabled with
NO_AUTOINPUT
too. I'll accept the collateral damage of having to re-enable one setting, in the cases where I need it.
I Tried Gleam for Advent of Code, and I Get the Hype
For the last seven years, including this one, I have managed to get all the stars. I do not say that to brag. I say it because it explains why I keep coming back.
It is one of the few tech traditions I never get bored of, even after doing it for a long time. I like the time pressure. I like the community vibe. I like that every December I can pick one language and go all in.
Advent of Code is usually 25 days. This year Eric decided to do 12 days instead.
So instead of 50 parts, it was 24.
That sounds like a relaxed year. It was not, but not in a bad way.
The easier days were harder than the easy days in past years, but they were also really engaging and fun to work through. The hard days were hard, especially the last three, but they were still the good kind of hard. They were problems I actually wanted to wrestle with.
It also changes the pacing in a funny way. In a normal year, by day 10 you have a pretty comfy toolbox. This year it felt like the puzzles were already demanding that toolbox while I was still building it.
That turned out to be a perfect setup for learning a new language.
The syntax is clean. The compiler is helpful, and the error messages are super duper good. Rust good.
Most importantly, the language strongly nudges you into a style that fits Advent of Code really well. Parse some text. Transform it a few times. Fold. Repeat.
Also, pipes. Pipes everywhere. I love pipes.
One thing I did not expect was how good the editor experience would be. The LSP worked much better than I expected. It basically worked perfectly the whole time. I used the Gleam extension for IntelliJ and it was great.
Then there is echo. It is basically a print statement that does not make you earn it. You can
echo
any value. You do not have to format anything. You do not have to build a string. You can just drop it into a pipeline and keep going.
You can quickly inspect values at multiple points without breaking the flow.
I did miss string interpolation, especially early on.
echo
made up for a lot of that.
It mostly hit when I needed to generate text, not when I needed to inspect values. The day where I generated an LP file for
glpsol
is the best example. It is not hard code, but it is a lot of string building. Without interpolation it turns into a bit of a mess of
<>
s.
It works. It is just the kind of code where you really feel the missing interpolation.
Options everywhere, and why that matters for grid puzzles
A lot of AoC is grids.
Grids are where you normally either crash into out of bounds bugs, or you litter your code with bounds checks you do not care about.
In my day 4 solution I used a dict as a grid. The key ergonomic part is that
dict.get
gives you an option-like result, which makes neighbour checking safe by default.
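For readers who have not seen Gleam, the same idea in a quick Python sketch (my illustration, not the actual day 4 code): a dict keyed by coordinates makes out-of-bounds neighbours a non-event.
# Grid as a dict keyed by (row, col); the input lines here are placeholders.
grid = {(r, c): ch
        for r, line in enumerate(["ab.", ".cd"])
        for c, ch in enumerate(line)}

def neighbours(r, c):
    deltas = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    # dict.get returns None for missing keys, so there are no bounds checks
    # and no index-out-of-range crashes at the edges of the grid.
    return [grid.get((r + dr, c + dc)) for dr, dc in deltas]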
For AoC you read a file every day. In this repo I used
simplifile
everywhere because you need something. It is fine, I just did not expect basic file IO to be outside the standard library.
Day 10 part 1 was my favorite part of the whole event.
The moment I saw the toggling behavior, it clicked as XOR. Represent the lights as a number. Represent each button as a bitmask. Find the smallest combination of bitmasks that XOR to the target.
It felt clean, it felt fast, and it felt like the representation did most of the work.
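Roughly, the approach looks like this (a brute-force Python sketch of the idea described above, not the actual Gleam solution; the names are mine):
from itertools import combinations
from functools import reduce
from operator import xor

def fewest_presses(target: int, buttons: list[int]):
    # Try combinations of increasing size; the first match is minimal.
    for k in range(len(buttons) + 1):
        for combo in combinations(buttons, k):
            if reduce(xor, combo, 0) == target:
                return k
    return None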
The least satisfying part: shelling out to
glpsol
Day 10 part 2 was the opposite feeling.
I knew brute force was out. It was clearly a system of linear equations.
In previous years I would reach for Z3, but there are no Z3 bindings for Gleam. I tried to stay in Gleam, and I ended up generating an LP file and shelling out to
glpsol
using
shellout
.
It worked, and honestly the LP format is beautiful.
The last day was the only puzzle I did not fully enjoy.
Not because it was bad. It just felt like it relied on assumptions about the input, and I am one of those people who does not love doing that.
I overthought it for a bit, then I learned it was more of a troll problem. The “do the areas of the pieces, when fully interlocked, fit on the board” heuristic was enough.
In my solution it is literally this:
heuristic_area <= max_area
Sometimes you build a beautiful mental model and then the right answer is a single inequality.
Gleam has sharp edges, mostly around where the standard library draws the line, and a few language constraints that show up in puzzle code. But it also has real strengths.
Pipelines feel good. Options and Results make unsafe problems feel safe. The list toolbox is better than I expected.
fold_until
is incredible. Once you stop trying to write loops and you let it be functional, the solutions start to feel clearer.
I cannot wait to try Gleam in a real project. I have been thinking about using it to write a webserver, and I am genuinely excited to give it a go.
And of course, I cannot wait for next year’s Advent of Code.
If you want to look at the source for all 12 days, it is here:
YouTube channels spreading fake, anti-Labour videos viewed 1.2bn times in 2025
Guardian
www.theguardian.com
2025-12-13 17:00:37
Exclusive: More than 150 anonymous channels using cheap AI tools to spread false stories about Keir Starmer, study finds YouTube channels spreading fake, anti-Labour videos have amassed more than a billion views this year, as opportunists attempt to use AI-generated content to profit from political ...
YouTube channels spreading fake, anti-Labour videos have amassed more than a billion views this year, as opportunists attempt to use AI-generated content to profit from political division in the UK.
More than 150 channels have been detected in the last year that promote anti-Labour narratives, as well as outright fake and inflammatory accusations about
Keir Starmer
.
A study seen by the Guardian has found the channels have accumulated 5.3m subscribers and have created more than 56,000 videos, with a total of almost 1.2bn views in 2025. The network of anonymous channels includes alarmist rhetoric, AI scripts and British narrators to attract hits.
Starmer is personally targeted. The prime minister was named in either the video title or description 15,600 times.
Reset Tech, the non-profit group that produced the research, said the channels were part of a global trend to produce synthetic propaganda on the platform. It pointed to the proliferation of cheap AI tools that could be deployed to make a quick profit from divisive topics.
One channel called Britain News-night talked about Starmer and Reeves facing arrest. Another, TheUKPoliticalBrief, touted videos on the “explosive truth” about immigrant crime and marches on Westminster.
The UK NewsCore channel focused on how Nigel Farage was ousting Starmer, and claimed the prime minister was “sacked live” and thrown out of parliament.
Other videos featured bizarre, fabricated stories about a row between the royal family and the government. One channel, Gold Up!, said the dispute had left Starmer “melting down on live TV”.
Some of the videos and channels were removed by YouTube’s checks. However, all 150 were taken down when the platform was approached by the Guardian. Reset Tech said some channels had created tens or hundreds of similar videos without being deplatformed.
The research found similar channels operating in German, French, Spanish and Polish, targeting other politicians or political issues. In total, it mapped 420 problematic channels operating in Europe. Reset Tech said Russian-speaking creators operate some of the channels.
It is believed channels aimed at the UK were being driven by opportunistic creators trying to monetise political division over issues like immigration, rather than overseas political actors. However, it said their presence still posed a risk to public trust.
The content has caused concern inside
Labour
. “The rise of fake news online is a serious threat to our democracy,” a spokesperson said. “The public will be rightly alarmed that democratically elected leaders and institutions are being undermined by bad faith foreign state actors and those seeking to profit from misinformation.
“We’ve already seen attempts from overseas to influence fair elections and manipulate public opinion both here and abroad.
“The government is stepping up its efforts to work with online platforms to tackle this scourge on free and fair democracy. But it’s important that tech bosses take this threat seriously and live up to their obligations to remove this type of content wherever it’s found.”
Dylan Sparks, UK director of Reset Tech, called for YouTube to take swifter action. “Malicious actors are permitted by YouTube to spread synthetic ‘news’ that disrupts political debate in the UK, while also earning revenue from it,” he said. “This AI-generated, low cost content spreads across the platform undetected, revealing clear weaknesses in YouTube’s monetisation and content moderation systems.
“This specific network focuses on the prime minister and Labour government, but the same loopholes could be exploited by any hostile actor to push an agenda. Because social media platforms profit from engagement, their business model creates an in-built tension between enforcing their own policies and reducing the spread of malicious content that drives revenue.
“The rapid spread of AI has also introduced new risks to the online environment, and platforms need to move faster and invest more to address them.”
A YouTube spokesperson said: “Spam and deceptive practices that try to take advantage of the YouTube community are not allowed on the platform, which is why the channels flagged by the Guardian have all been removed.
“We enforce our policies consistently, regardless of political viewpoint expressed, or how the content is generated. Our teams work around the clock to monitor for harmful content, taking swift action as needed.”
YouTube is now working with Reset Tech over its findings. The platform said its systems prominently feature authoritative news content on the YouTube homepage, in search results, and through recommendations. It has removed more than 2.1m channels for violating its community guidelines.
Ministers have already formed an online advertising taskforce to see what action can be taken to address the advertising-based monetisation of harmful and misleading content.
Aging Out of Fucks: The Neuroscience of Why You Suddenly Can't Pretend Anymore
You’re in a meeting. Someone says something objectively wrong. And instead of doing your usual dance—the soft correction, the diplomatic phrasing, the careful preservation of everyone’s feelings—you just... say it.
“That’s not accurate.”
No cushioning. No apology. No emotional labor to make your truth more palatable.
And everyone looks at you like you’ve grown a second head.
Welcome to what I call the Great Unfuckening—that point in midlife when your capacity to pretend, perform, and please others starts shorting out like an electrical system that’s finally had enough.
You might think you’re becoming difficult. Impatient. One of those “bitter older women” you were warned about.
But here’s what’s actually happening: your brain is restructuring itself. And thank god for that.
Let’s start with the science, because this isn’t about you becoming a worse person. It’s about your brain finally doing some overdue maintenance.
For decades, your prefrontal cortex—the part of your brain responsible for executive function, social behavior, and impulse control—has been working overtime. It’s been monitoring social cues, calculating risks, suppressing authentic responses, and managing everyone else’s emotional experience.
This is exhausting work. And it turns out, it’s unsustainable.
Research in neuroscience shows that as we age, the brain undergoes a process called synaptic pruning. Neural pathways that aren’t essential get trimmed away. Your brain is essentially Marie Kondo-ing itself, keeping what serves you and discarding what doesn’t.
And all those neural pathways dedicated to hypervigilant people-pleasing? They’re often first on the chopping block.
Dr. Louann Brizendine, neuropsychiatrist and author of “The Female Brain,” explains that women’s brains are particularly wired for social harmony and caregiving in the first half of life—driven partly by estrogen and oxytocin. But as estrogen levels shift in perimenopause and beyond, this intense drive to please and nurture others begins to diminish.
What replaces it isn’t bitterness. It’s clarity.
Think about what you’ve been doing since you were old enough to understand social dynamics:
Reading the room. Adjusting your tone. Softening your language. Making yourself smaller to make others comfortable. Laughing at jokes that weren’t funny. Agreeing with opinions you didn’t share. Explaining things carefully so no one feels threatened by your knowledge.
You’ve been running complex social calculations every single day for decades.
There’s a concept in psychology called “decision fatigue”: the deteriorating quality of decisions made after a long session of decision-making. But what we don’t talk about enough is emotional labor fatigue.
After thousands of interactions where you’ve monitored and managed your authentic responses to maintain social harmony, something in your system starts breaking down. Not because you’re broken, but because the system was never meant to run this way indefinitely.
Your brain isn’t malfunctioning. It’s finally refusing to malfunction anymore.
Men experience aging changes too, obviously. But women tend to report this shift more dramatically, and there’s a reason for that.
From childhood, girls are socialized for social harmony in ways boys simply aren’t. Research shows that girls as young as 4 already demonstrate more awareness of others’ emotions than boys do, and adjust their behavior accordingly.
By the time you reach midlife, you’ve had 40+ years of this conditioning. That’s four decades of:
“Don’t be bossy” (translation: don’t lead)
“Don’t be pushy” (translation: don’t assert boundaries)
“Don’t be difficult” (translation: don’t have needs)
“Don’t be emotional” (translation: don’t be human)
You’ve been performing an elaborate social choreography so long it became automatic. You stopped noticing you were doing it.
Until suddenly, you can’t anymore. Or more accurately—you won’t.
Several neurological and hormonal shifts converge in midlife that contribute to this phenomenon:
Hormonal recalibration.
As estrogen declines, so does its moderating effect on emotional responses and social bonding behaviors. You’re not becoming “hormonal” in the dismissive sense. You’re becoming less chemically compelled to prioritize others’ comfort over your own truth.
Prefrontal cortex changes.
The same executive function region that helped you suppress inappropriate responses for decades starts operating differently. Some research suggests it becomes less reactive to social judgment and approval. You’re literally less neurologically invested in what others think.
Accumulated stress response.
Decades of chronic low-level stress from constant social monitoring takes a biological toll. Your stress response system—the HPA axis—can become dysregulated. What looks like “not having a filter” might actually be a stress response system that’s finally saying “enough.”
Cognitive prioritization shifts.
Your brain starts prioritizing differently. Energy becomes more precious. Time becomes more finite. The cost-benefit analysis of pretending shifts dramatically.
Here’s the part that makes this transition so uncomfortable: other people don’t like it.
When you stop performing emotional labor, systems that relied on that labor start breaking down. And instead of examining why the system needed your performance to function, people blame you for withdrawing it.
You’re suddenly:
“Not a team player”
“Going through something”
“Difficult to work with”
“Changed” (said with concern that really means disapproval)
The same directness that would be called “no-nonsense” in a man gets called “abrasive” in a woman over 40.
This backlash is proof of concept. It confirms that your people-pleasing wasn’t optional. It was required labor that kept everything running smoothly. And when you stop providing it for free, people notice.
The discomfort you’re causing? That’s not your problem to fix. That’s information about a system that was always exploiting you.
But here’s what complicates this: the liberation feels dangerous.
You’ve been rewarded your entire life for being accommodating. Easy. Pleasant. Not too much. The positive feedback loop of being liked is powerful, and you’re now breaking that loop.
You might find yourself afraid that:
You’re becoming “that woman”—the bitter, difficult one everyone avoids
You’ll lose relationships (and you might—more on this in a moment)
You’re being selfish or narcissistic
You’re overreacting or being “too sensitive” (ironic, since you’re actually being less sensitive to others’ reactions)
These fears are valid. But they’re also old programming.
The woman you’re afraid of becoming? She’s not real. She’s a cautionary tale designed to keep you compliant.
Let’s be explicit about what’s actually happening when you “lose your filter”:
You’re gaining authenticity.
The real you—the one who’s been submerged under layers of performance—is finally surfacing. This might feel harsh because authentic humans have edges. They have opinions. They have boundaries. These aren’t character flaws.
You’re gaining time.
All the energy you spent managing everyone else’s experience? That’s now available for literally anything else. The return on investment is staggering.
You’re gaining clarity.
When you stop cushioning every truth, reality becomes clearer. Problems that were obscured by diplomatic language become visible and therefore solvable.
You’re gaining real relationships.
Some relationships will end when you stop people-pleasing. These were transactional relationships sustained by your performance. What remains are connections based on who you actually are.
This is hard to talk about, but necessary: some relationships won’t survive your refusal to keep pretending.
Friendships built on shared complaining but not actual intimacy. Work relationships that relied on you doing emotional labor others weren’t doing. Family dynamics where you played mediator, peacemaker, or emotional manager.
When you stop playing these roles, one of two things happens:
The relationship evolves into something more authentic, or it dissolves because it was never based on authentic connection in the first place.
Both outcomes are information.
Losing relationships because you stopped performing isn’t actually loss. It’s clarity about what was never really there.
If you’re in the thick of this shift, here’s what helps:
Name what’s happening.
“I’m not becoming difficult—I’m becoming authentic. My brain is reorganizing around honesty instead of performance.” Language matters. The story you tell yourself about this change shapes your experience of it.
Expect resistance.
When you stop over-functioning in relationships and systems, others will push back. This isn’t evidence you’re doing something wrong. It’s evidence you were doing too much before.
Practice the pause.
You don’t have to swing from people-pleasing to brutal honesty overnight. Notice when you’re about to soften/cushion/apologize unnecessarily. Pause. Choose consciously whether to add the cushioning or not.
Find your people.
Other women going through this same shift. They exist. They’re tired of pretending too. These relationships will feel different—less performative, more substantial.
Grieve if you need to.
There’s loss here too. Loss of approval, loss of being liked by everyone, loss of your identity as “the nice one.” This grief is legitimate even as the change is ultimately positive.
Here’s what no one tells you about aging out of fucks: it’s practice for being fully alive.
Every small death of ego, every shedding of others’ opinions, every moment you choose truth over approval, you’re rehearsing the ultimate letting go.
You’re learning to exist as yourself regardless of external validation. This is spiritual work masquerading as social rudeness.
The woman who can say “that’s not accurate” without apologizing is the same woman who can eventually face her own mortality without flinching. She’s practiced not needing everyone’s approval. She’s learned that her worth isn’t contingent on being pleasant.
You’re becoming free.
The “you” that’s emerging isn’t a worse version. It’s the version that was always there but buried under decades of social conditioning to maintain harmony at any cost.
Your brain is finally doing triage. Deciding what actually matters. Cutting away the pretense that never served you.
The filter you’re losing wasn’t protecting you. It was protecting everyone else from your truth.
And your truth? It’s not the problem.
The system that required you to hide it was always the problem.
So when someone says you’ve changed, when they say you’re not the person you used to be, when they imply something’s wrong with you now?
They’re right. You have changed.
You’ve changed into someone who’s no longer available for performance.
And that’s not difficult.
That’s development.
What’s the thing you used to bite your tongue about that you can’t anymore? Drop it in the comments. I have a feeling we’re all going through versions of the same awakening.
“I’m building a space for women who are done performing. If this resonated with you, stick around. There’s more where this came from—and we’re just getting started.”
The Infinite Loop of One LLM Talking to Another
Daring Fireball
www.instagram.com
2025-12-13 16:26:23
This is very funny, but also a good indication of just how far away these things are from actual intelligence. First, a reasonable human being would never get caught in a loop like this. Second, only humans can not only recognize what’s going on here, but also see the humor in it.
★
Indexed Reverse Polish Notation, an Alternative to AST
2025-12-12
"Why study compiler construction? Because knowing how a programming language is specified
and implemented makes you a better programmer." I still remember these words,
pronounced as matter-of-fact introduction to an undergrad course of compilers.
Compiler engineers have come up with many useful programming techniques and representations.
Today, I want to write about one such technique, an alternative to Abstract Syntax Trees (ASTs).
Inspired by the parse tree representation in the
Carbon compiler
,
this post explains a way to represent parsed source code using a variation of Reverse Polish Notation (RPN),
in a contiguous array.
We call this
Indexed RPN
. Ordering program parts in a linear sequence very naturally leads to
machine interpretation, which is well-known for calculators but maybe a little less well known
when there are scoped definitions and control flow structures.
This is by no means a new way of doing things, but with modern machines having plenty of memory, there may have been less pressure to reach for techniques that are memory-friendly.
1. From Arithmetic to Indices
Let’s start with an arithmetic expression example: We want to represent (3 + 4) * 5.
In a standard AST, this is a tree of pointers. In a standard stack-machine RPN, this looks like 3 4 + 5 *.
Now we want something slightly different, because we want to operate on the tree structure, for example when this expression passes through a compiler that translates and optimizes it. We want to be able to refer to specific sub-expressions later.
The "Administrative Normal Form" Perspective
Before we look at the memory layout, let's imagine breaking up this expression by naming each subexpression.
If we had local definitions in our language this would give us Administrative Normal Form (ANF).
In ANF, we give a name to every intermediate result:
// Source: (3 + 4) * 5
val t0 = 3
val t1 = 4
val t2 = add(t0, t1)
val t3 = 5
val t4 = mul(t2, t3)
The expression now takes a lot more to write, but for a compiler it is much more structured.
The arguments of the operations are always names, which makes the data flow and also the
order in which arguments get evaluated fully explicit. Here,
t2
depends entirely on
t0
and
t1
,
and
t0
is evaluated before
t1
.
The In-Memory Representation
We don't want local definitions just yet, the above is just to motivate flattening of a tree structure.
If we store the instructions in a contiguous array (e.g., a vector or Vec), the
index
of the node becomes its name.
The internal names are merely indices into the sequence of nodes.
Index (ID) | Node Kind  | Operands / Data
0          | IntLiteral | 3
1          | IntLiteral | 4
2          | BinaryOp   | Add(lhs: 0, rhs: 1)
3          | IntLiteral | 5
4          | BinaryOp   | Mul(lhs: 2, rhs: 3)
This is similar to Reverse Polish Notation (RPN), but there is a difference. In standard RPN, there is an implicit stack from which
Add
consumes items blindly. In
Indexed RPN
,
Add
explicitly refers to indices 0 and 1. This provides a stable reference to every sub-expression, allowing us to traverse the code and locate nodes without necessarily having to build up a stack.
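As a tiny concrete sketch of the idea (the node encoding here is hypothetical, not Carbon's actual representation), building the array for (3 + 4) * 5 can look like this:
# Every node is appended to one flat list; the returned index is its "name".
nodes = []

def emit(kind, *operands):
    nodes.append((kind, operands))
    return len(nodes) - 1

lhs  = emit("IntLiteral", 3)      # index 0
rhs  = emit("IntLiteral", 4)      # index 1
add  = emit("Add", lhs, rhs)      # index 2, refers to indices 0 and 1
five = emit("IntLiteral", 5)      # index 3
emit("Mul", add, five)            # index 4, refers to indices 2 and 3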
2. Dealing with "Let" and Scope
Let us make the language more realistic by adding local let-declarations and scoping.
// Source
let a = 10;
let b = a + 5;
Here we face a problem: Source variables (a, b) are different from our internal indices (0, 1, 2...). We need a node that bridges this gap — an
"Introducer"
.
The Bind Node
We introduce a Bind node. This node represents the action of bringing a name into existence in the current scope.
Depending on the language you are working with, a binding may have semantic significance: for example, references to the binding may themselves be objects of the language, as with Rust or C++ references.
Index | Node Kind  | Operands          | Meaning
0     | IntLiteral | 10                | The raw value 10.
1     | Bind       | name: "a", val: 0 | Introducer: "a" exists, bound to Index 0.
2     | NameRef    | ref: 1            | A reference back to the introducer node.
3     | IntLiteral | 5                 | The raw value 5.
4     | BinaryOp   | Add(2, 3)         | Adds the NameRef and the Literal.
5     | Bind       | name: "b", val: 4 | Introducer: "b" exists, bound to Index 4.
The Stack Returns (For Compilation)
In order to deal with this data, we traverse it but we will also want to build up a stack. You might ask:
If we flattened the tree, why do we need a stack?
While the
storage
is flat, the
compilation process
requires a stack to handle scopes. Because let declarations can be nested, we cannot simply scan linearly and remember everything forever. We need to handle when names go
out
of scope (shadowing).
Let's add
BlockStart
and
BlockEnd
nodes that indicate nested blocks.
// Source Code
let x = 10; // Outer 'x'
{
let x = 20; // Inner 'x' (shadows outer)
print(x); // Should print 20
}
print(x); // Should print 10
The resolve_names Algorithm
I am too lazy for full code examples, just the idea.
We use a SymbolTableStack during the resolution pass.
We iterate through the array once. We maintain a stack of scopes, where each scope maps a string name to an integer index.
function resolve_names(nodes):
# A stack of scopes. Each scope is a Map: String -> Index
scope_stack = [ new Map() ]
for i, node in enumerate(nodes):
match node.kind:
case BlockStart:
# Push a new, empty scope onto the stack
scope_stack.push( new Map() )
case BlockEnd:
# Pop the top scope. Inner variables are forgotten.
scope_stack.pop()
case Bind(name, value_index):
# Register the variable in the CURRENT (top) scope.
current_scope = scope_stack.top()
current_scope.set(name, i)
case NameRef(name):
# Look for the name, starting from the top scope down.
target_index = find_in_stack(scope_stack, name)
# PATCH THE NODE:
# The node no longer holds "x". It holds the index (e.g., 4).
node.resolved_index = target_index
After this pass, the stack is discarded. The IR is now "wired." Every variable usage points directly to the instruction that created it.
When representing source as AST, we would use an algebraic data type. One could use mutable data structures there, or build up a symbol table.
3. Breaking the Line: Control Flow
So far, execution has been linear: Index 0, then 1, then 2. But branching constructs like if, else, and while break this line.
In a tree-based AST, an If node has child pointers to "Then" and "Else" blocks. In our flat array, we may prefer to keep those blocks in the same contiguous vector, instead of having them float in separate memory. So we introduce
Jump
nodes.
The Linear Layout
Consider this source:
if (a) { print(1); } else { print(2); }
print(3);
Here is the Indexed RPN layout. Note the use of
BrFalse
(Branch if False) and
Jmp
(Unconditional Jump).
Index | Node Kind | Data      | Explanation
0     | NameRef   | "a"       | Load variable a.
1     | BrFalse   | target: 5 | If a is false, jump to Index 5 (Else).
2     | Int       | 1         | Start of "Then" block.
3     | Print     | 2         |
4     | Jmp       | target: 7 | Jump over the "Else" block.
5     | Int       | 2         | Start of "Else" block (Target of node 1).
6     | Print     | 5         |
7     | Int       | 3         | Merge Point. Execution continues here.
8     | Print     | 7         |
Building It: Backpatching
When we emit the BrFalse instruction at index 1, we haven't written the Else block yet, so we don't know the target index.
It is quite straightforward to deal with that:
Emit BrFalse with a placeholder target. Save the index.
Emit the "Then" block.
Emit Jmp with a placeholder target. Save the index.
Mark the current index as the start of "Else".
Backpatch
(update) the BrFalse at index 1.
Emit the "Else" block.
Mark the current index as the end.
Backpatch
the Jmp at index 4.
This effectively flattens the logic of the program into a shape that mirrors how hardware executes instructions: predictable, linear memory access with explicit jumps.
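A small sketch of that recipe (the emit_* callbacks and the tuple node encoding are placeholders, not a real API):
def emit_if(nodes, emit_cond, emit_then, emit_else):
    cond = emit_cond(nodes)                    # index of the condition value
    br = len(nodes)
    nodes.append(("BrFalse", cond, None))      # target unknown for now
    emit_then(nodes)
    jmp = len(nodes)
    nodes.append(("Jmp", None))                # target unknown for now
    nodes[br] = ("BrFalse", cond, len(nodes))  # backpatch: "Else" starts here
    emit_else(nodes)
    nodes[jmp] = ("Jmp", len(nodes))           # backpatch: the merge point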
4. Towards Interpretation and Code Gen
We have successfully flattened our source code. We have resolved variable names into absolute indices and lowered high-level control flow into jumps. Now comes the reward.
The Interpreter: The "Big Switch"
Because our code is a flat array, we can come up with a Virtual Machine (VM) that looks exactly like a hardware CPU: it has an Instruction Pointer (ip) and a big loop.
As a reminder — I will never get tired of repeating this — the difference between a virtual machine and an abstract machine is that a virtual machine
has
instructions
, whereas an abstract machine has
transitions
.
A translation to a low-level format, combined with a virtual machine, plays the role of an interpreter, which provides an implementation of our language. We
can
also use it to
specify the
operational semantics
, roughly: however you implement this language,
it should produce the same result as the reference interpreter. For "real" languages, the specification often comes as an afterthought, but there are plenty of situations where one really would like to know how a piece of source code is supposed to behave, for example to find out whether the "real" implementation is correct. Somehow, educated people who really should know better can parrot statements like “undefined behavior is all about compiler optimizations” and completely ignore that “undefined behavior” is first and foremost a gap in the specification.
Back to our interpreter: we can do something really simple. Since we used ANF (where every node index represents a runtime value), we don't even need a runtime stack for intermediate calculations. We can simply map the nodes array to a parallel values array. A real implementation would not do this, but if we use the interpreter solely to specify behavior, this is sufficient, and we can defer optimizations.
Note that since we have already resolved names to indices, instructions like
Bind
or
BlockStart
are effectively metadata. The interpreter can simply skip them.
function run_vm(nodes):
# Holds the runtime result of every node.
values = new Array(size=len(nodes))
ip = 0
while ip < len(nodes):
node = nodes[ip]
match node.kind:
case IntLiteral:
values[ip] = node.raw_value
ip += 1
case Add:
# Direct access by index! No stack popping needed.
lhs_val = values[node.lhs_index]
rhs_val = values[node.rhs_index]
values[ip] = lhs_val + rhs_val
ip += 1
case BrFalse:
if values[node.cond_index] == False:
ip = node.target_index # JUMP
else:
ip += 1
case Jmp:
ip = node.target_index # Unconditional JUMP
case _:
# Skip metadata nodes (Bind, BlockStart, etc.)
ip += 1
The Code Generator
We could also do a source-to-source translation and generate C code. The Indexed RPN shines again, because the complexity of the source language is reduced quite a bit.
Since instructions are topologically sorted and dependencies are explicit, generating C can be as simple as a single for loop where every node becomes a temporary variable t{i}.
This is maybe not a great way to specify what a language means, but a clear implementation advantage of translating to an existing language is that one can build
on top of an existing whole compiler, with optimizations, native code generation backends. How exactly the semantics and runtime aspects of the source language
and the target language are connected is of course a design choice and can be wildly different.
function generate_c_code(nodes):
output = StringBuilder()
output.append("#include <stdio.h>\n")  # the generated code calls printf
output.append("int main() {\n")
for i, node in enumerate(nodes):
# 1. Create a label for every instruction so Jumps can find it
# e.g., "L_0:", "L_1:", etc.
output.append(f"L_{i}: ;\n")
# 2. Create a variable name for this node's result
# e.g., "t_0", "t_1"
var_name = f"t_{i}"
match node.kind:
case IntLiteral:
# int t_0 = 10;
output.append(f" int {var_name} = {node.value};\n")
case Add:
# int t_2 = t_0 + t_1;
lhs = f"t_{node.lhs_index}"
rhs = f"t_{node.rhs_index}"
output.append(f" int {var_name} = {lhs} + {rhs};\n")
case Print:
# printf("%d\n", t_5);
arg = f"t_{node.arg_index}"
output.append(f" printf(\"%d\\n\", {arg});\n")
case BrFalse:
# if (!t_1) goto L_5;
cond = f"t_{node.cond_index}"
target = f"L_{node.target_index}"
output.append(f" if (!{cond}) goto {target};\n")
case Jmp:
# goto L_7;
target = f"L_{node.target_index}"
output.append(f" goto {target};\n")
output.append(" return 0;\n}")
return output.toString()
Conclusion
People will always build more languages, especially domain-specific ones.
A realistic work-in-progress language that uses indexed RPN is
Carbon
.
By moving from a tree to an
Indexed RPN
, we replace heap allocations with a single contiguous vector. What was recursive tree-walking of an AST can in many cases become index lookups.
So there should be a lot less memory traffic, and when programs get large, memory traffic can have a significant impact on performance.
If you are like me and build toy programming language implementations for fun, consider trying this out and see how it works for you!
LG TV's new software update installed MS Copilot, which cannot be deleted
I've been working on a personal project recently, rewriting an old jQuery + Django project into SvelteKit. The main work is translating the UI templates into idiomatic SvelteKit while maintaining the original styling. This includes things like using semantic HTML instead of div-spamming, not wrapping divs in divs in divs, and replacing bootstrap with minimal tailwind. It also includes some logic refactors, to maintain the original functionality but rewritten to avoid years of code debt. Things like replacing templates using boolean flags for multiple views with composable Svelte components.
I've had a fairly steady process for doing this: look at each route defined in Django, build out my `+page.server.ts`, and then split each major section of the page into a Svelte component with a matching Storybook story. It takes a lot of time to do this, since I have to ensure I'm not just copying the template but rather recreating it in a more idiomatic style.
This kind of work seems like a great use case for AI assisted programming, but I've failed to use it effectively. At most, I can only get Claude Code to recreate some slightly less spaghetti code in Svelte. Simple prompting just isn't able to get AI's code quality within 90% of what I'd write by hand. Ideally, AI could get its code to something I could review manually in 15-20 minutes, which would massively speed up the time spent on this project (right now it takes me 1-2 hours to properly translate a route).
Do you guys have tips or suggestions on how to improve my efficiency and code quality with AI?
While rowing across the Pacific in 2008, wind pushing him and waves battering him, Italian explorer
Alex Bellini
felt an unsettling lack of control.
Playing in his mind was the story of another Italian explorer, Umberto Nobile, who crashed his zeppelin north of Svalbard after a 1928 polar expedition. Seven men died. The survivors, including Nobile, spent a month wandering the free-floating pack ice, at one point shooting and eating a polar bear, until their rescue. How people react to unpredictable situations fascinates Bellini, a dedicated student of psychology. In the Arctic Ocean, unpredictable situations are a way of life.
“All adventure is based on hypothesis, which can be very different to reality,”
says
Bellini. “An adventurer must adapt himself to the environment he faces.”
Bellini’s
newest adventure
highlights and relishes that lack of control. Sometime next winter, he plans to travel to Greenland’s west coast, pick an iceberg, and live on it for a year as it melts out in the Atlantic.
This is a precarious idea. Bellini will be completely isolated, and his adopted dwelling is liable to roll or fall apart at any moment, thrusting him into the icy sea or crushing him under hundreds of tons of ice.
His task: experience the uncontrollable nature of an iceberg at sea without getting himself killed. The solution: an indestructible survival capsule built by an aeronautics company that specializes in tsunami-proof escape pods.
“This adventure is about waiting for something to happen,”
says
Bellini. “But I knew since the beginning I needed to minimize the risk. An iceberg can flip over, and those events can be catastrophic.”
Icebergs tend to get top-heavy as they melt from their submerged bottoms, so flips can be immediate and unpredictable. And, of course, so is the weather.
Bellini spent two years searching for the appropriate survival capsule, but most were too heavy to plant on a berg. Then, in October, he contacted aeronautical engineer Julian Sharpe, founder of
Survival Capsule
, a company that makes lightweight, indestructible floating capsules, or “personal safety systems.”
They can hold from two to ten people, depending on the model, and
are made from aircraft-grade aluminum in what’s called a continuous monocoque structure, an interlocking frame of aluminum spars that evenly distribute force, underneath a brightly painted and highly visible aluminum shell. The inner frame can be stationary or mounted on roller balls so it rotates, allowing the passengers to remain upright at all times.
Inside are a number of race car–style seats with four-point seatbelts,
arranged
facing either outward from the center or inward around the circumference, depending on the number of chairs. Storage compartments, including food and water tanks, sit beneath the seats. Two watertight hatches open inward to avoid outside obstructions. Being watertight, it’s a highly buoyant vessel, displacing water like a boat does.
“I fell in love with the capsule,”
says
Bellini. “I’m in good hands.”
He selected a three-meter, ten-person version, for which he’ll design his own interior.
Sharpe got the idea for his capsules after the 2004 Indonesian tsunami. He believes fewer people would have died had some sort of escape pod existed. With his three-man team, which includes a former NOAA director and a Boeing engineer, he brought the idea to fruition in 2011. Companies in Japan that operate in the line of fire for tsunamis expressed the most interest. But Sharpe hopes the products will be universal—in schools, retirement homes, and private residences, anywhere there is severe weather. The first testing prototypes of the capsules, which range from $12,000 to $20,000, depending on size, were shipped to Tokyo in 2013. Four are in Japan; two are in the United States. His two-person capsule is now for sale; the others will follow later this year.
“Right now there’s only horizontal and vertical evacuations,”
Sharpe said. “We want to offer a third option: riding it out.”
The company intends to rely on an increasing market for survival equipment as sea level and the threat of major storms rise. Sharpe designed the capsules to be tethered to the ground using 20 to 50 meters of steel cable and to withstand a tsunami or storm surge. Each will have a water tank and a sophisticated GPS beacon system in case the tether snaps. Survival Capsule advises storing seven to ten days of food in each capsule.
The product appeals to Bellini because it’s strong enough to survive a storm at sea or getting crushed between two icebergs. It will rest on top of the ice using either its own weight or a specially designed stand that will detach if the berg rolls. The circular shape is crucial for avoiding a crushing blow. The capsule will just roll off any incoming mass, and the water will provide an equal and opposite reaction to any force exerted on the capsule.
“A multicurved surface is almost uncrushable,” Sharpe said. “If you imagine shooting an arrow at a wooden ball, unless you hit dead center, it’ll ricochet.”
The basic model ensures survival, but there’s more to life on an iceberg than just surviving. You can add windows, extra space, and other modular additions, even surround sound and color options. “You can trick your crib out all you want,”
Sharpe said. And that’s exactly what Bellini plans to do. He doesn’t have a layout yet, but he has hired Italian designer Pietro Santoro to customize his ten-person pod. He will remove the other nine seats for extra room.
Other than modifications to keep him safe and healthy, the capsule is basic, Bellini said. It will carry 300 to 400 kilograms of food, a wind generator, solar panels, and an EPIRB beacon so rescuers can find him. He’ll have Wi-Fi to update his team and the public. The layout will consist of a work table, electronic panels, and a bed. “A foldable bed,” Bellini added. “I want to have room to work out.”
Bellini will spend almost all of his time in the capsule with the hatch closed, which will pose major challenges. He’ll have to stay active without venturing out onto a slippery, unstable iceberg. If it flips, he’ll have no time to react. He’s working with a company to develop nanosensors able to detect movement in the iceberg so he has advance warning of a flip. “Any step away from [the iceberg] will be in unknown territory,”
he said. “You want to stretch your body. But then you risk your life.”
He fears a lack of activity will dull his ability to stay safe. “I cannot permit myself to get crazy,”
he said. “I need to keep my body fit, not for my body, but for my safety.”
He is working on a routine of calisthenics that can be done in the capsule, and he might install a stationary bike, most likely a
Ciclotte
.
Lack of sunlight is another challenge of spending a year in an aluminum sphere. It will be winter in the Arctic, with maybe five hours of light each day. Bellini and Sharpe are working on a lighting system that will simulate natural light, allowing Bellini to get vitamins and maintain his circadian rhythm.
Bellini’s model is in development, and he expects it to be ready in about a year. He plans to write during his mission and will bring plenty of nonfiction books, especially psychology.
The capsule won’t ease his isolation, maybe his greatest challenge, but Bellini remains undaunted: “It’s the key to the inner part of myself.” The first step is relinquishing control.
Z8086: Rebuilding the 8086 from Original Microcode
After
486Tang
, I wanted to go back to where x86 started. The result is
z8086
: an 8086/8088 core that runs the
original Intel microcode
. Instead of hand‑coding hundreds of instructions, the core loads the recovered 512x21 ROM and recreates the micro‑architecture the ROM expects.
z8086 is compact and FPGA‑friendly: it runs on a single clock domain, avoids vendor-specific primitives, and offers a simple external bus interface. Version 0.1 is about 2000 lines of SystemVerilog, and on a Gowin GW5A device, it uses around 2500 LUTs with a maximum clock speed of 60 MHz. The core passes all ISA test vectors, boots small programs, and can directly control peripherals like an SPI display. While it doesn’t boot DOS yet, it’s getting close.
Why another x86?
The 8086 is where the x86 story began. If you want to understand why x86 feels like x86 — segmented addressing, ModR/M, the prefetch queue, the oddball string instructions — this is the chip to study.
Also, reverse-engineering of the 8086 has reached a surprising level of maturity. We now have Ken Shirriff’s massive
8086 blog series
and Andrew Jenner’s
disassembled microcode
. Combined with the original
8086 patent
, these resources make it possible to rebuild a
faithful
core instead of a functional approximation.
My goals were simple:
Faithful where it counts.
Accurately replicate the microarchitectural behavior of the original 8086 wherever it matters most.
Designed to be explorable and educational.
The code is thoroughly commented to make it clear and easy to understand, and it aims to be a good teaching resource.
FPGA-friendly and practical.
z8086 is built to be an effective, useful CPU IP core for real FPGA projects.
Re‑creating the 8086
Here’s the high‑level view:
(You can cross-reference function blocks against the
die shot
.)
This is like the original chip’s split. The
BIU
(bus interface unit) runs ahead, fetching bytes into a 6‑byte queue whenever the bus is idle. The
EU
(execution unit) consumes bytes from that queue, decodes them, and drives the microcode engine. When the EU needs memory, it issues a Type‑6 micro‑op; the BIU yields the bus and prefetch pauses. That overlap is why the 8086 feels “pipelined” despite being a late‑70s design.
Microcode is the glue here. Each 21‑bit micro‑instruction encodes a
move
(5‑bit source → 5‑bit destination on an internal bus) plus an
action
(ALU op, short/long jump, bookkeeping, or a bus cycle). The sequencer advances through
{AR, CR}
addresses until the microcode asserts “run next instruction.”
Some key pieces:
Microcode engine.
The sequencer keeps
{AR, CR}
(plus
SR
for calls), fetches 21‑bit words from
ucode.hex
, and executes them as a tight move→action loop.
ROME
marks active execution. When microcode wants a queue byte (
LOC_Q
) but the queue is empty, or when an EU bus cycle is in flight, a
stall
signal freezes
CR
so the ROM sees exactly the timing it expects.
Translation + group decode.
The original 8086 uses ROMs to (1) classify opcodes into ~15 “group” signals (“has ModR/M,” “prefix,” “uses w‑bit,” “grp3/4/5,” etc.), and (2) map
{opcode, ModR/M}
to microcode entry points for effective‑address and control‑flow routines. z8086 implements these as combinational replicas (
group_decode()
and
translate()
), derived from the dumped ROM truth tables. This is what lets the recovered microcode drop straight in without being rewritten.
Bus + unaligned access.
Externally you get
rd/wr/io/word/ready
with aligned cycles, so FPGA memory is easy to hook up. Internally the EU still issues Type‑6 bus micro‑ops with the right segment defaults and overrides. If a word access lands on an odd address, the bus FSM automatically splits it into two byte cycles (
BUS_UNALIGNED
), so software sees real 8086 semantics while the outside world stays aligned (a short sketch of this split follows the list).
ALU + flags.
The ALU is implemented as a classic 16×1‑bit slice, controlled by signals modeled after Intel’s original logic. The initial ALU design used Verilog primitives, but this updated bit‑slice version is both smaller and faster, closely replicating the behavior of the original chip’s ALU.
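As a conceptual illustration of the unaligned split mentioned above (a software model of the idea, not the actual bus FSM in the core):
def word_access_cycles(addr: int):
    # One CPU-side word access becomes the aligned cycles the memory sees.
    if addr & 1:
        # Odd address: two byte cycles, low byte first (little-endian).
        return [("byte", addr), ("byte", addr + 1)]
    # Even address: a single 16-bit cycle.
    return [("word", addr)]

# Example: word_access_cycles(0x1001) -> [("byte", 0x1001), ("byte", 0x1002)]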
One concrete example: for a ModR/M instruction like
ADD AX, [BX+SI+4]
, the loader’s
FC
grabs the opcode,
SC
grabs the ModR/M byte,
translate()
jumps into the right effective‑address micro‑routine, the EU reads the operand through a Type‑6 bus cycle into
OPR
, the ALU updates
SIGMA
and flags, and a final Type‑6 writeback happens only if the instruction targets memory.
Interesting discoveries
Microcode is super efficient
The 8086 shipped with ~29K transistors and still delivered a very rich CISC ISA: segmented addressing, ModR/M base+index+disp modes, and weirdly specialized instructions like
DAA
and
XLAT
. The trick was microcode. A small internal datapath plus ROM sequencing let Intel implement a huge instruction surface area without exploding logic.
The contrast with other CPUs is striking. The 6502 (~4.5K transistors) and Z80 (~8.5K) are elegant, mostly hardwired, and highly minimalist designs. In comparison, the 8086 features a much wider datapath, significantly more instructions and features, yet manages to do so with less than four times the transistor count of the Z80. The 68000 (~68K transistors) takes a different approach, using far more silicon for its fully hardwired CISC design. Remarkably, the 8086 achieves a similar feature set with less than half the transistor count of the 68000. This efficiency carries over to z8086: the core fits into just 2,500 LUT4s — dramatically smaller than ao486, which is about ten times larger.
The patent’s FC/SC formulas are wrong (or at least incomplete)
Interestingly, the patent’s explanation of FC and SC signal generation turns out to be inconsistent. The formulas it provides are:
Here, “MT” refers to “a signal generated by Q control circuitry indicating that the queue is empty…”. In reality, however, the correct logic should be “
not MT
”" rather than MT, contrary to the documentation. Testing and implementation confirm that this change results in the expected loader behavior.
The “8086 interrupt bug”
The original 1978 8086 had an interrupt-related bug: If an interrupt occurs immediately after a
MOV SS,xxx
or
POP SS
instruction, the CPU may push data to an incorrect stack address, corrupting memory. The problem arises because both the Stack Segment (SS) and Stack Pointer (SP) must be updated to ensure correct stack operations. If an interrupt arrives between these updates, the CPU could save flags/IP/CS to the wrong location. Intel later resolved this by automatically disabling interrupts for one instruction following operations like
POP SS
.
z8086 faithfully reproduces this edge case using a
delay_interrupt
register. This register is set whenever one of three events occurs: when
SC
decodes a
prefix
(
g_prefix
), a
stack segment load
(
POP SS
), or a
segment register move
(
MOV sr, r/m
, detected by
g_seg_reg_bits
). This mechanism disables interrupt handling for exactly one instruction, matching the original 8086’s behavior.
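Conceptually, the mechanism amounts to something like the following behavioural sketch (pseudocode for illustration, not the actual SystemVerilog):
delay_interrupt = False

def instruction_boundary(was_prefix, was_pop_ss, was_mov_seg, irq_pending):
    # Interrupts are only recognised at instruction boundaries, and are
    # skipped for exactly one instruction after the cases listed above.
    global delay_interrupt
    take_irq = irq_pending and not delay_interrupt
    delay_interrupt = was_prefix or was_pop_ss or was_mov_seg
    return take_irq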
The prefetch queue bus is 8-bit
The prefetch queue is a 6-byte buffer that continuously feeds the execution engine. Its output, called the “Q Bus,” is an 8-bit bus delivering the next instruction byte. Notably, while the 8086 is architecturally a 16-bit CPU, it fetches instruction bytes one at a time—consuming at most a single byte per cycle. This design ultimately limits performance, a bottleneck that later Intel CPUs overcome; for instance, the 386 features a 32-bit wide Q bus.
Working on ao486 for 486Tang underscored just how crucial the prefetch queue is to overall performance and Fmax. The intricate x86 instruction set makes optimizing the queue challenging. Balancing width, depth, and flexibility in its design truly tests the designer’s skill.
Reflections and next steps
Overall, this project has been incredibly fun — like piecing together a giant puzzle. It involves gathering information from many sources, making educated guesses about the original design, and testing those theories until everything clicks into place.
Getting code to work is the definitive proof of truly understanding a system. The fact that z8086 functions as intended demonstrates that the community now possesses deep, practical insight into the original x86 chip.
Intel packed an impressive array of features into the 8086. Some attribute this to it being designed by a
software developer
. While many of these features have become less relevant over time — and some of the 8086’s success was undoubtedly lucky, such as being chosen for the IBM PC — the developer-friendly design played a big role in kickstarting the x86 ecosystem.
This release is an early preview and comes with several limitations: it is not yet cycle accurate, the interrupt circuitry is still under-tested, the original 8086 bus cycles are not fully replicated, and it has not yet been used to run large programs.
Here are some directions I plan to work on:
More extensive testing on FPGA boards
Booting DOS
Compiling to WebAssembly for interactive 8086 visualization in the browser?
z8086
should work on most FPGAs, with sample projects provided for DE10-Nano, Xilinx Artix7 and Tang Console 60K. If low-level CPU archaeology interests you – or you’d like to try a real-microcode 8086 as a soft CPU in your own project – check out the project on GitHub:
👉 z8086 on GitHub
.
Feedback, issues, and PRs are always welcome. Thanks for reading!
Gavin Newsom pushes back on Trump AI executive order preempting state laws
Guardian
www.theguardian.com
2025-12-13 15:00:35
California governor says order pushes ‘grift and corruption’ instead of innovation just hours after president’s dictum The ink was barely dry on Donald Trump’s artificial intelligence executive order when Gavin Newsom came out swinging. Just hours after the order went public Thursday evening, the Ca...
The ink was barely dry on
Donald Trump’s artificial intelligence executive order
when Gavin Newsom came out swinging. Just hours after the order went public Thursday evening, the California governor issued a statement saying the presidential dictum, which seeks to block states from regulating AI of their own accord, advances “grift and corruption” instead of innovation.
“President Trump and David Sacks aren’t making policy – they’re running a con,” Newsom said, referencing
Trump’s AI adviser and crypto “czar”
. “Every day, they push the limits to see how far they can take it.”
Trump’s executive order is a major victory for tech companies that have
campaigned against legislative barriers
to developing and deploying their AI products. It also sets up a clash between state governments and the White House over the future of AI regulation. The immediate backlash from groups including child safety organizations, unions and state officials has highlighted the deeply contentious nature of the order and diverse range of interests it affects.
Several officials and organizations have already questioned the legality of the executive order, stating that Trump does not have the power to undermine state legislation on AI and denouncing the decree as the result of tech industry lobbying.
California
, home to some of the world’s most prominent AI companies and one of the most active states legislating AI, has been a locus for pushback against the order.
“This executive order is deeply misguided, wildly corrupt, and will actually hinder innovation and weaken public trust in the long run,” California Democratic representative Sara Jacobs said in a statement. “We will explore all avenues – from the courts to Congress – to reverse this decision.”
After a draft version of Trump’s order leaked in November, state attorney general Rob Bonta
said
that his office would “take steps to examine the legality or potential illegality of such an executive order”, teeing up a precedent-setting duel between California and the White House.
Legislative loggerheads
In September, Newsom
signed a landmark AI law
that would compel developers of large, powerful AI models known as “frontier models” to provide transparency reports and promptly report safety incidents or face fines up to $1m. The governor touted the Transparency in Frontier Artificial Intelligence act as an example for how to regulate AI companies nationwide.
“Our state’s status as a global leader in technology allows us a unique opportunity to provide a blueprint for well-balanced AI policies beyond our borders,” Newsom said in an address to the California state senate. “Especially in the absence of a comprehensive federal AI policy framework and national AI safety standards.”
The September bill and more California legislation could be in Trump’s crosshairs. Thursday’s executive order calls for an AI litigation taskforce that would review state laws that do not “enhance the United States’ global AI dominance” and then pursue legal action or potentially withhold federal broadband funding. The taskforce will also consult with the administration’s AI and crypto “czar” to determine which laws to target.
Although Trump has framed the executive order as a means of streamlining legislation and removing an onerous patchwork of regulation, critics have alleged that the government has never provided any comprehensive federal framework for regulating AI to replace state laws. The order follows attempts to include similar AI moratoriums
in bills earlier this year
, which failed due to bipartisan backlash. Instead, opponents view the order as a gift to major tech companies that have cozied up to the administration over the course of the year.
“President Trump’s unlawful executive order is nothing more than a brazen effort to upend AI safety and give tech billionaires unchecked power over working people’s jobs, rights and freedoms,” AFL-CIO president, Liz Shuler, said in a statement.
Nationwide backlash
Within hours of Trump signing the order, opposition grew louder among lawmakers, labor leaders, children’s advocacy groups and civil liberties organizations that decried the policy. Other California Democratic leaders said the executive order was an assault on states’ rights and the administration should instead focus on federal agencies and academic research to boost innovation.
“No place in America knows the promise of artificial intelligence technologies better than California,” said Alex Padilla, a senator for California. “But with today’s executive order, the
Trump administration
is attacking state leadership and basic safeguards in one fell swoop.”
Similarly, Adam Schiff, another California senator, emphasized: “Trump is seeking to preempt state laws that are establishing meaningful safeguards around AI and replace them with … nothing.”
Lawmakers from Colorado to Virginia to New York also took issue with the order. Don Beyer, a Virginia congressmember, called it a “terrible idea” and said that it would “create a lawless Wild West environment for AI companies”. Likewise, Alex Bores, a New York state assemblymember, called the order a “massive windfall” for AI companies, adding that “a handful of AI oligarchs bribed
Donald Trump
into selling out America’s future”.
Even Steve Bannon, Trump loyalist and former adviser, criticized the policy. In a
text message to Axios
, Bannon said Sacks had “completely misled the President on preemption”. Mike Kubzansky, the CEO of Omidyar Network, a philanthropic tech investment firm that funds AI companies, similarly said “the solution is not to preempt state and local laws” and that ignoring AI’s impact on the country “through a blanket moratorium is an abdication of what elected officials owe their constituents”.
Blowback against the order has also included child protection organizations that have long expressed concerns over the effects of AI on children. The debate over child safety has intensified this year in the wake of multiple lawsuits against AI companies over
children who died by suicide
after interacting with popular chatbots.
“The AI industry’s relentless race for engagement already has a body count, and, in issuing this order, the administration has made clear it is content to let it grow,” said James Steyer, the CEO of child advocacy group Common Sense Media. “Americans deserve better than tech industry handouts at the expense of their wellbeing.”
A group of bereaved parents and child advocacy organizations have also spoken out. They have been working to pass legislation to better protect children from harmful social media and AI chatbots and
released a national public service announcement
on Thursday opposing the AI preemption policy. Separately, Sarah Gardner, the CEO of Heat Initiative, one of the groups in the coalition, called the order “unacceptable”.
“Parents will not roll over and allow our children to remain lab rats in big tech’s deadly AI experiment that puts profits over the safety of our kids,” Gardner said. “We need strong protections at the federal and state level, not amnesty for big tech billionaires.”
I’ve been moving all my MCPs to skills, including the remaining one I still
used: the Sentry MCP
1
. Previously I had already moved entirely away from
Playwright to a Playwright skill.
In the last month or so there have been discussions about using
dynamic tool
loadouts
to defer
loading of tool definitions until later. Anthropic has also been toying around
with the idea of wiring together MCP calls via code, something
I have
experimented with
.
I want to share my updated findings on all of this, and why the deferred tool
loading that Anthropic came up with does not fix my lack of love for MCP. Maybe
they are useful for someone else.
What is a Tool?
When the agent encounters a tool definition through reinforcement learning or
otherwise, it is encouraged to emit tool calls through special tokens when it
encounters a situation where that tool call would be appropriate. For all
intents and purposes, tool definitions can only appear between special tool
definition tokens in a system prompt. Historically this means that you cannot
emit tool definitions later in the conversation state. So your only real option
is for a tool to be loaded when the conversation starts.
In agentic uses, you can of course compress your conversation state or change
the tool definitions in the system message at any point. But the consequence is
that you will lose the reasoning traces and also the cache. In the case of
Anthropic, for instance, this will make your conversation significantly more
expensive. You would basically start from scratch and pay full token rates plus
cache write cost, compared to cache read.
One recent innovation from Anthropic is deferred tool loading. You still
declare tools ahead of time in the system message, but they are not injected
into the conversation when the initial system message is emitted. Instead they
appear at a later point. The tool definitions however still have to be static
for the entire conversation, as far as I know. So the tools that could exist
are defined when the conversation starts. The way Anthropic discovers the tools
is purely by regex search.
Contrasting with Skills
This is all quite relevant because even though MCP with deferred loading feels
like it should perform better, it actually requires quite a bit of engineering
on the LLM API side. The skill system gets away without any of that and, at
least from my experience, still outperforms it.
Skills are really just short summaries of which skills exist and in which file
the agent can learn more about them. These are proactively loaded into the
context. So the agent understands in the system context (or maybe somewhere
later in the context) what capabilities it has and gets a link to the manual for
how to use them.
Crucially, skills do not actually load a tool definition into the context. The
tools remain the same: bash and the other tools the agent already has. All it
learns from the skill are tips and tricks for how to use these tools more
effectively.
Because the main thing it learns is how to use other command line tools and
similar utilities, the fundamentals of how to chain and coordinate them together
do not actually change. The reinforcement learning that made the Claude family
of models very good tool callers just helps with these newly discovered tools.
MCP as Skills?
So that obviously raises the question: if skills work so well, can I move the
MCP outside of the context entirely and invoke it through the CLI in a similar
way as Anthropic proposes? The answer is yes, you can, but it doesn’t work
well. One option here is Peter Steinberger’s
mcporter
. In short, it reads the
.mcp.json
files and exposes the MCPs behind it as callable tools.
And yes, it looks very much like a command line tool that the LLM can invoke.
The problem however is that the LLM does not have any idea about what tools are
available, and now you need to teach it that. So you might think: why not make
some skills that teach the LLM about the MCPs? Here the issue for me comes from
the fact that MCP servers have no desire to maintain API stability. They are
increasingly starting to trim down tool definitions to the bare minimum to
preserve tokens. This makes sense, but for the skill pattern it’s not what you
want. For instance, the Sentry MCP server at one point switched the query
syntax entirely to natural language. A great improvement for the agent, but my
suggestions for how to use it became a hindrance and I did not discover the
issue straight away.
This is in fact quite similar to Anthropic’s deferred tool loading: there is no
information about the tool in the context at all. You
need
to create a
summary. The eager loading of MCP tools we have done in the past has now ended
up as an awkward compromise: the description is both too long to load eagerly
and too short to really tell the agent how to use it. So at least
from my experience, you end up maintaining these manual skill summaries for MCP
tools exposed via mcporter or similar.
Path Of Least Resistance
This leads me to my current conclusion: I tend to go with what is easiest, which
is to ask the agent to write its own tools as a skill. Not only does it not
take all that long, but the biggest benefit is that the tool is largely under my
control. Whenever it breaks or needs some other functionality, I ask the agent
to adjust it. The Sentry MCP is a great example. I think it’s probably one of
the better designed MCPs out there, but I don’t use it anymore. In part because
when I load it into the context right away I lose around 8k tokens out of the
box, and I could not get it to work via mcporter. On the other hand, I have
Claude maintain a skill for me. And yes, that skill is probably quite buggy and
needs to be updated, but because the agent maintains it, it works out better.
It’s quite likely that all of this will change, but at the moment manually
maintained skills and agents writing their own tools have become my preferred
way. I suspect that dynamic tool loading with MCP will become a thing, but it
will probably require quite a few protocol changes to bring in skill-like summaries and
built-in manuals for the tools. I also suspect that MCP would greatly benefit
from protocol stability. The fact that MCP servers keep changing their tool
descriptions at will does not work well with materialized calls and external
tool descriptions in READMEs and skill files.
The Brand-New Pentagon Press Corps Is Gaga for Hegseth
Intercept
theintercept.com
2025-12-13 14:23:34
The Department of War has cracked the code on making the perfect press corps by welcoming in only its biggest cheerleaders.
The post The Brand-New Pentagon Press Corps Is Gaga for Hegseth appeared first on The Intercept....
Pentagon press secretary Kingsley Wilson conducts a press briefing at the Pentagon, Washington, D.C., on Dec. 2, 2025.
Photo: U.S. Navy Officer Eric Brann/Office of the Secretary of War
The welcome was
so warm it could’ve been the first day of school for a new class of kindergarteners, and with the so-called reporters’ level of skepticism for the administration, they might as well have been.
“I would also like to take a moment today to welcome all of you here to the Pentagon briefing room as official new members of the Pentagon press corps. We’re glad to have you,” Pentagon press secretary Kingsley Wilson said in her December 2 briefing. “This is the beginning of a new era.”
Wilson also said that “legacy media chose to self-deport from this building,” a cute way of noting that
dozens of news organizations
— among them the New York Times, the Washington Post, the major broadcast news outlets, and even Fox News and Newsmax — gave up their press passes rather than sign on to the administration’s blatantly
anti-First Amendment set of rules
for reporting on Pete Hegseth’s Department of War. Among those rules was a provision allowing journalists to be expelled for reporting on anything, whether classified or unclassified,
not approved for official release
.
To test-drive the absurdity of this new “press corps,” Wilson granted the second question of the “new era” to disgraced former congressman Matt Gaetz, once Donald Trump’s pick for attorney general and now a host on the feverishly pro-Trump One America News Network. Gaetz, who was wearing a rather dated performance fleece jacket embroidered with “
Representative Matt Gaetz
,” asked two questions about regime change in Venezuela, a policy the administration is
actively fomenting
as it carries out
strikes on boats
it claims are carrying “narcoterrorists” smuggling drugs in the Caribbean Sea and Pacific Ocean.
The substance of the questions mattered less than the opening they provided for Wilson to
parrot
the administration’s line on these strikes: “Every single person who we have hit thus far who is in a drug boat carrying narcotics to the United States is a narcoterrorist. Our intelligence has confirmed that.” Somewhat puzzlingly, Wilson also said the Department of War is “a planning organization” with “a contingency plan for everything.”
There was no further follow-up from the member of the “press” whom the
House Ethics Committee found
engaged in sexual activity with a 17-year-old girl in 2017. (Gaetz has denied wrongdoing.)
Since the briefing took place just days after the killing of a member of the National Guard blocks from the White House, multiple members of the Pentagon’s new Fourth Estate asked weighty questions in the wake of the tragedy, including whether the service member would receive a medal for distinguished service or a military burial at Arlington National Cemetery. (Both are TBD.)
It wasn’t all softball questions, but every assembled member served their purpose by running interference for the administration in general and Hegseth in particular. One interlocutor, following up on a question from the indefatigable Laura Loomer about selling weapons to Qatar despite its ties to the Muslim Brotherhood, asked without a hint of irony whether the U.S. would be “reassessing our relationship with Israel” over Israeli media reports that the country’s government “funded Hamas.”
Without missing a beat, the War Department flak replied that that would be a “better question for the State Department” and moved right along.
Another member of the press corps asked whether any actual drugs have been recovered from these alleged drug-smuggling boats that the U.S. military has been drone striking —
twice, in one case
— a question well worth asking, and one that’s almost certainly being posed by the deposed mainstream journalists now reporting on the Pentagon from outside its walls. Wilson, standing in for the U.S. government, responded by essentially asking that we trust her, trust the intelligence, and trust that Hegseth’s War Department is telling the truth. The matter was, once again, closed.
Along with Loomer, a noted Trump sycophant and conspiracy theorist, I spotted “Pizzagate” promoter Jack Posobiec, who asked about Democratic Sen. Mark Kelly, and
Project Veritas
founder James O’Keefe in the assembled crowd. In a video of the briefing, an open laptop in one member of the “new” media’s lap was emblazoned with stickers that read “feminine, not feminist” and “homemaking is hot.” A
statement
from the department trumpeting news of the new corps features an interviewer in front of a backdrop emblazoned with logos for “LindellTV,” the media venture by MyPillow founder Mike Lindell — who is now
running for governor
of Minnesota. (LindellTV’s
IMDB
page describes the programming as: “Aging man with many internet connectivity issues, screaming into his cell phone, has discussions with a tired looking news anchor,” although it’s not clear whether that’s the official network tagline.)
The Pentagon press corps has always been a
gilded cage
— a perch for big-name reporters who want a plush-sounding posting without too much hassle. The most essential, critical reporting never comes from briefings, where reporters sit with their mouths open like baby birds looking up for a news morsel from their press secretary mother. But like with so many things under Trump, by giving up on any semblance of respecting norms, he’s revealed how neutered the institution was to begin with. Critical reporting on the War Department has, and will, continue, even without reporters in the physical building. It’s worth asking if they should ever go back.
Quoting Obie Fernandez
Simon Willison
simonwillison.net
2025-12-13 14:01:31
If the part of programming you enjoy most is the physical act of writing code, then agents will feel beside the point. You’re already where you want to be, even just with some Copilot or Cursor-style intelligent code auto completion, which makes you faster while still leaving you fully in the driver...
If the part of programming you enjoy most is the physical act of writing code, then agents will feel beside the point. You’re already where you want to be, even just with some Copilot or Cursor-style intelligent code auto completion, which makes you faster while still leaving you fully in the driver’s seat about the code that gets written.
But if the part you care about is the decision-making around the code, agents feel like they clear space. They take care of the mechanical expression and leave you with judgment, tradeoffs, and intent. Because truly, for someone at my experience level, that is my core value offering anyway. When I spend time actually typing code these days with my own fingers, it feels like a waste of my time.
—
Obie Fernandez
,
What happens when the coding becomes the least interesting part of the work
Earth-Like Planets Are More Common Than We Thought, Study Says
404 Media
www.404media.co
2025-12-13 14:00:24
Normally, it’s bad news to be next to an exploding star. But ancient supernovae may have aided the formation of our home world—and perhaps Earthlike planets elsewhere....
Welcome back to the Abstract! These are the studies this week that got hosed with star spray, mounted a failed invasion, declined to comment, and achieved previously unknown levels of adorability.
First, a study about how the solar system wasn’t destroyed 4.5 billion years ago (phew!). Then: a human touch on an ancient boat, the duality of posters and lurkers, and an important update on toadlets.
Earth was cosmically conceived in part by a massive shockwave from a nearby supernova, which seeded our home world and neighboring rocky planets with telltale radioactive signatures, according to a new study.
The solar system’s rocky planets contain short-lived radionuclides (SLRs), which are ancient elements that were likely barfed out from exploding stars. For this reason, scientists have long suspected that stars must’ve detonated next to the gassy disk that gave rise to the solar system. The heat generated from these radioactive elements helped the building blocks of the rocky planets—Mercury, Venus, Earth, and Mars—melt together so they could become whole worlds, which means we owe our existence to these ancient supernovas.
Now, a team has developed a new model to explain how the primordial pyrotechnics didn’t just blow up the nascent solar system. The results suggest that rocky Earth-like worlds may be common in the universe, with potential implications for the search for extraterrestrial life.
“A key question in astronomy is how ubiquitous Earth-like rocky planets are,” said researchers led by Ryo Sawada of the University of Tokyo. “The formation of terrestrial planets in our Solar System was strongly influenced by the radioactive decay heat of SLRs, particularly aluminum-26, likely delivered from nearby supernovae.”
“However, the supernova injection scenario faces an unresolved problem in that existing supernova models could not reproduce both the relative and absolute abundances of SLRs without disrupting the protosolar disk,” an event that “would likely prevent the Solar System formation altogether,” the team added.
In other words, it’s hard to explain how the solar system got its high abundance of SLRs without killing it in the cradle. Sawada and his colleagues propose a solution that involves at least one star exploding about three light years from the disk, sparking a shockwave that created a cosmic-ray “bath.”
Schematic picture of the system assumed in this study. Image: Sawada et al., Sci. Adv. 11, eadx7892
In this “immersion mechanism,” energetic cosmic rays trapped in the bath triggered SLR-producing reactions directly within the disk. This contrasts with the hypothesis that the SLRs were largely injected and then mixed up in the disk through some unknown process. This new solution can account both for the high abundance of certain SLRs, like aluminum-26, and the fact that the solar system was not destroyed, as evidenced by its apparent continued existence.
“Our results suggest that Earth-like, water-poor rocky planets may be more prevalent in the
Galaxy than previously thought,” the team said, noting that many disks are rocked by similar supernova-shockwaves. “This challenges previous interpretations that classified the Solar System as an outlier with a particularly high [aluminum-26] abundance.”
In addition to offering a new hypothesis for an old astronomical problem, the study gets bonus points for its extremely poetic title: “Cosmic-ray bath in a past supernova gives birth to Earth-like planets.” If you say this enchanted phrase three times, somewhere an Earth-like world will be born.
Stars aren’t the only things leaving their dirty fingerprints in unexpected places this week. Archeologists working on the mysterious Hjortspring boat, a 2,400-year-old Scandinavian vessel, discovered a tantalizing partial human fingerprint in its caulking, providing “a direct link to the ancient seafarers who used this boat,” according to the study.
Photo of caulking fragment showing fingerprint on the left and high-resolution x-ray tomography scan of fingerprint region on the right. Image: Photography by Erik Johansson, 3D model by Sahel Ganji
The ridges of the fingerprint “fall within average distributions for both adult male and females as well as for juvenile adults, making it difficult to say much about the individual who produced the print,” said researchers led by Mikael Fauvelle of Lund University. “The most likely interpretation, however, is that it was made during repairs by one of the crew members on the boat itself, providing a direct link to the seafarers of the ancient vessel.”
Regardless of this person’s identity, their voyage didn’t end well. Researchers think the crew of the Hjortspring boat probably sailed from the eastern Baltic Sea to attack the Danish island of Als, where they were defeated. “The victors [deposited] the weapons of their vanquished foes together with one of their boats into the bog,” where they remained for millennia until they were rediscovered in the 1880s, the team said.
It’s a timeless reminder for would-be invaders: Don’t get caulky.
At last, scientists have investigated the most elusive online demographic: the humble lurker. A team recruited 520 Redditors in the U.S. to participate in small subreddits focused on a variety of political topics during the summer of 2024. The aim was to probe why some people became prolific “power-users” that post with voluminous confidence, while others remained wallflowers.
“Online political discussions are often dominated by a small group of active users, while most remain silent,” said researchers led by Lisa Oswalt of the Max Planck Institute for Human Development. “This visibility gap can distort perceptions of public opinion and fuel polarization.”
The team found that “lurking (posting nothing) was most common among users who perceived discussions as toxic, disrespectful, or unconstructive.” Lurkers were offered small payments to post in the experiment, which succeeded in motivating some to contribute to discussions. As a result, the study concluded that “future interventions may be able to make online political discussions more representative by offering more positive social rewards for lurkers to post.”
At last, an opportunity to unionize the lurkers of the world. Solidarity (in silence) forever.
We will close, as
we have before
, with an impossibly cute toadlet. Scientists have discovered this new species of “pumpkin toadlet” in the “cloud forests” of Brazil, a sentence so twee that it’s practically its own fairy tale. The tiny toad
Brachycephalus lulai,
pictured below on a pencil tip, belongs to a family of “flea toads” that are among the smallest vertebrates on Earth.
Basically it is very smol: Brachycephalus lulai is a tiny pumpkin toadlet measuring less than 14 mm in length. Photo: Luiz Fernando Ribeiro, CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
“Our team sought to better document the individual variation of all
Brachycephalus
species in southern Brazil, looking for them in the field over the past seven years,” said researchers led by Marcos R. Bornschein of São Paulo State University. “As a result of this work, we discovered and herein described a population collected on the eastern slope of Serra do Quiriri as a new species.”
The team also reported that the toads are actively colonizing newly formed cloud forests, which are high-altitude woods shrouded in mist. The researchers propose making these unique habitats into refuges for the adorable anurans.
The ability to write kernel code in Rust was explicitly added as an
experiment — if things did not go well, Rust would be removed again. At
the 2025 Maintainers Summit, a session was held to evaluate the state of
that experiment, and to decide whether the time had come to declare the
result to be a success. The (arguably unsurprising) conclusion was that
the experiment is indeed a success, but there were some interesting points
made along the way.
Headlines
Miguel Ojeda, who led the session, started with some headlines. The Nova
driver for NVIDIA GPUs is coming, with pieces already merged into the
mainline, and the
Android binder driver
was
merged for 6.18. Even bigger news, he said, is that Android 16
systems running the 6.12 kernel are shipping with the Rust-written
ashmem
module.
So there are millions of real devices running kernels with Rust code now.
Meanwhile, the Debian project has, at last, enabled Rust in its kernel
builds; that will show up in the upcoming "forky" release. The amount of
Rust code in the kernel is "
exploding
", having grown by a factor of
five over the last year. There has been an increase in the amount of
cooperation between kernel developers and Rust language developers, giving
the kernel project significant influence over the development of the
language itself. The Rust community, he said, is committed to helping the
kernel project.
The
rust_codegen_gcc
effort, which grafts the GCC code generator onto the rustc compiler, is
progressing.
Meanwhile the fully GCC-based
gccrs
project is making good progress. Gccrs is now able to compile the kernel's
Rust code (though, evidently, compiling it to
correct
runnable code is
still being worked on). The gccrs developers see building the kernel as
one of their top priorities; Ojeda said to expect some interesting news
from that project next year.
With regard to Rust language versions, the current plan is to ensure that
the kernel can always be built with the version of Rust that ships in the
Debian stable release. The kernel's minimum version would be increased
6-12 months after the corresponding Debian release. The kernel
currently specifies a minimum of Rust 1.78, while the current version
is (as of the session) 1.92. Debian is shipping 1.85, so Ojeda suggested
that the kernel move to that version, which would enable the removal of a
number of workarounds.
Jiri Kosina asked how often the minimum language version would be
increased; Ojeda repeated that it would happen after every Debian stable
release, though that could eventually change to every other Debian
release. It is mostly a matter of what developers need, he said. Linus
Torvalds said that he would be happy to increase the minimum version
relatively aggressively as long as it doesn't result in developers being
shut out. Distributors are updating Rust more aggressively than they have
traditionally updated GCC, so requiring a newer version should be less of a
problem.
Arnd Bergmann said that the kernel could have made GCC 8 the minimum
supported version a year earlier than it did, except that SUSE's SLES was
running behind. Kosina answered that SUSE is getting better and shipping
newer versions of the compiler now. Dave Airlie worried that problems
could appear once the enterprise distributors start enabling Rust; they
could lock in an ancient version for a long time. Thomas Gleixner noted,
though, that even Debian is now shipping GCC 14; the situation in
general has gotten better.
Still experimental?
Given all this news, Ojeda asked, is it time to reconsider the
"experimental" tag? He has been trying to be conservative about asking for
that change, but said that, with Android shipping Rust code, the time has
come. Airlie suggested making the announcement on April 1 and saying
that the experiment had failed. More seriously, he said, removing the
"experimental" tag would help people argue for more resources to be
directed toward Rust in their companies.
Bergmann agreed with declaring the experiment over, worrying only that Rust
still "
doesn't work on architectures that nobody uses
". So he
thought that Rust code needed to be limited to the well-supported
architectures for now. Ojeda said that there is currently good support for
x86, Arm, Loongarch, RISC-V, and user-mode Linux, so the main architectures
are in good shape. Bergmann asked about PowerPC support; Ojeda answered
that the PowerPC developers were among the first to send a pull request
adding Rust support for their architecture.
Bergmann persisted, asking about s390 support; Ojeda said that he has
looked into it and concluded that it should work, but he doesn't know the
current status. Airlie said that IBM would have to solve that problem, and
that it will happen. Greg Kroah-Hartman pointed out that the Rust upstream
supports that architecture. Bergmann asked if problems with big-endian
systems were expected; Kroah-Hartman said that some drivers were simply
unlikely to run properly on those systems.
With regard to adding core-kernel dependencies on Rust code, Airlie said
that it shouldn't happen for another year or two. Kroah-Hartman said that
he had worried about interactions between the core kernel and Rust drivers,
but had seen far fewer than he had expected. Drivers in Rust, he said, are
indeed proving to be far safer than those written in C. Torvalds said
that some people are starting to push for CVE numbers to be assigned to
Rust code, proving that it is definitely not experimental; Kroah-Hartman
said that no such CVE has yet been issued.
The DRM (graphics) subsystem has been an early adopter of the Rust
language. It was still perhaps surprising, though, when Airlie (the DRM
maintainer) said that the subsystem is only "
about a year away
" from
disallowing new drivers written in C and requiring the use of Rust.
Ojeda returned to his initial question: can the "experimental" status be
ended? Torvalds said that, after nearly five years, the time had come.
Kroah-Hartman cited the increased support from compiler developers as a
strong reason to declare victory. Steve Rostedt asked whether function
tracing works; Alice Ryhl was quick to answer that it does
indeed work, though "
symbol demangling would be nice
".
Ojeda concluded that the ability to work in Rust has succeeded in bringing
in new developers and new maintainers, which had been one of the original
goals of the project. It is also inspiring people to do documentation
work. There are a lot of people wanting to review Rust code, he said; he
is putting together a list of more experienced developers who can help
bring the new folks up to speed.
The session ended with Dan Williams saying that he could not imagine a
better person than Ojeda to have led a project like this and offering his
congratulations; the room responded with strong applause.
The Palisades Nuclear Generating Station is nestled between sand dunes on the eastern shore of Lake Michigan. It shut down for financial reasons in 2022. Three years later, it’s on the cusp of reopening, with hundreds of workers streaming through its security barriers every day.
Palisades is on track to restart in early 2026. When it does, it will be the first nuclear plant in the United States to generate electricity again after being
decommissioned
. Nick Culp of Holtec, the company that owns the plant, said its revival is a response to a surge in demand for electricity.
“We have seen [Michigan]’s baseload generation go offline at a rapid rate as they’ve moved away from fossil generation,” Culp said. “How do you backfill that when you see demand on the horizon like [artificial intelligence], like data storage, like keeping the lights on at home, and new manufacturing?”
Palisades Nuclear Generating Station is on track to restart in 2026. (Chris Bentley/Here & Now)
Nuclear is part of the answer to that question, Culp said, and the government agrees. Michigan gave $300 million to the restart — part of its
goal
to have 100% carbon-free electricity by 2040 — and the federal government gave the project
a loan
of more than $1.5 billion.
That money is part of the Trump administration’s investment in what it’s calling a “nuclear energy renaissance.” In May, the White House released a plan to
quadruple
American nuclear power by 2050, following a similar pledge from the Biden administration.
Meeting that goal would require dozens of new reactors. But whether they’re traditional power plants or
new designs
, nuclear reactors are expensive and slow to build. Facing a crunch between climate goals and rising electricity demand, Michigan, Pennsylvania, and Iowa are reopening plants that closed just a few years ago.
Powering up
When the Palisades plant in Michigan closed in 2022, Jim Byrd said he left his office of more than two decades “with a heavy heart.”
He was working at a nuclear plant in Mississippi last year when he heard about the plan to reboot Palisades. Then he got the call he had been waiting for, asking him to come back.
“Palisades is my home. These people are my family,” Byrd said. Since his return, he’s been training new employees in an exact replica of the reactor control room, right down to its 1960s pink-and-green color scheme.
While the plant was in decent shape, recommissioning still required repairing equipment and overcoming mountains of paperwork.
“We are creating a roadmap on how to do this, and the whole industry is watching,” said Byrd. “I had existing licensed operators that had a license from the Nuclear Regulatory Commission when we shut down, so we had to work on getting those back.”
All that work is worth it, he said, to get the plant back up and running.
“What we're doing here is exciting,” said Byrd. “Having a reliable power source that keeps your electricity costs low, everybody should be rooting for that.”
Paul Rhodes (left) is an operations shift manager at the Palisades Nuclear Generating Station, and Jim Byrd (right) is the assistant operations manager. (Chris Bentley/Here & Now)
The restart also attracted employees from elsewhere in the industry. The plant’s new chief nuclear officer, Rich Burroni, came from New York’s Indian Point Energy Center, which closed in 2021.
“The trend five years ago was a lot of work on decommissioning,” he said, “and now that’s all changed.”
More change may be coming for Palisades. The Department of Energy said this month it will give Holtec up to $400 million in federal funding to build small modular reactors in Michigan. That technology could help speed up the deployment of new nuclear power in the future, according to many in the industry, but so far has not been commercially viable.
For now, restarting a plant costs less than a third of what it would take to build a new one, said Culp of Holtec.
“When you factor in how long it takes to construct a new nuclear power plant, especially here in the United States, and the amount of money that goes into it,” he said, “it’s a pretty good value proposition.”
‘Taken for granted’
Many of Palisades’ employees live within 10 miles of the plant, which means they could be exposed to a radioactive plume
in an emergency
.
That
zone
also includes the town of Covert, Michigan. Township supervisor Daywi Cook’s father helped build the plant in the 1960s.
Covert, Michigan, township supervisor Daywi Cook’s father helped build the plant in the 1960s. (Chris Bentley/Here & Now)
“I grew up with the sirens being tested. I think it was every last Saturday of the month,” Cook said. “It was just a normal thing.”
Having friends and family members who worked at the plant helped demystify nuclear power, she said, and she came to see the plant as part of the community.
At one point, taxes from the plant made up 40% of the township’s revenue. Now, as Covert’s township supervisor, Cook said she’s glad the plant is reopening.
“Having that stability and having that employment available for folks who live here is something that I think was taken for granted for a very long time,” she said. “I think what's important is that we educate ourselves as residents near the plant and that Holtec continues to be a good neighbor in being transparent with the community.”
Zach Morris, head of the economic development group Market One, said Palisades is an important piece of the local economy.
“Southwest Michigan is a beautiful area. It's just a wonderful community of small towns. I call it Americana,” Morris said. “Americana needs electricity. So the good news is we have a really reliable source of power that is clean. It pays its employees well. So we're excited about being able to keep that online.”
While nuclear power does have a
record of safety
, many Americans remember the 1979 disaster at central Pennsylvania’s Three Mile Island. One of the two reactors on the island had a partial meltdown and released radioactive gases into the environment. There were no deaths, and the Nuclear Regulatory Commission
said
the accident “had no detectable health effects on plant workers or the public.”
That left the plant with only one working reactor, which produced power until 2019, when it shut down for financial reasons. Today, that reactor, like Palisades in western Michigan, is in the process of coming back online.
“When you walk through the plant now, all the equipment is still there, but it's deathly quiet. You don't hear the hum of the motors, the steam going through the lines,” said Craig Smith, who is in charge of bringing back the plant at Three Mile Island, renamed
the Crane Clean Energy Center
. “It's an eerie kind of feeling when you walk through the plant.”
The nuclear plant at Three Mile Island was renamed the Crane Clean Energy Center. (Chris Bentley/Here & Now)
That eerie feeling may soon be gone. A red LCD clock in Smith’s office counts down the hours until the plant’s reopening in late 2027, which is backed by a billion-dollar loan from the Trump Administration.
The recommissioned reactor on Three Mile Island will pump 835 megawatts into the regional grid, but all that electricity is spoken for by Microsoft, which
agreed
to buy an equivalent amount of power from the grid for the next 20 years to feed its data centers.
“The dynamics of the energy economy have changed significantly, mainly because of artificial intelligence,” Smith said.
Nuclear is well-suited to the moment, in his view, because of its consistency.
“Hottest days of the year, coldest days of the year, freezing weather, the plant continues to operate,” Smith said. “As far as a reliable power source, you can’t beat it.”
Smith was in high school in nearby Hershey in 1979 and remembers the evacuation after the disaster at Three Mile Island. That failed to dissuade him from going into a career in nuclear power, and he said today, the industry is safer because of regulations put in place after the partial meltdown.
“People at the plant here take that personally,” he said. “The standards of the industry are greatly improved, and we've made significant improvements to the design of the plants and how we operate them.”
‘No viable solution’
Gene Stilp has a different take. He’s one of many people in the area who say the official story of the 1979 disaster failed to account for long-term health problems they believe are related to the accident.
Stilp has been fighting nuclear power on Three Mile Island since before the plant opened, and said the recommissioning is an unnecessary risk to public safety.
“We’re sticking up for the people who live here rather than the shareholders of Microsoft and Constellation,” said Stilp, who often appears in public wearing a blazer with “NO TMI RESTART” sewn on the back.
“What they’re proposing for evacuation does not work, and so that’s my line in the sand,” he said, pointing out the 10-mile Emergency Planning Zone includes a major hospital complex and several schools. “The population increases in Central Pennsylvania, the realization that there are so many people at risk here, the best you can do is take away that risk.”
Another longtime opponent of the power plant, Eric Epstein of
Three Mile Island Alert
, said the country is making mistakes in its rush to power data centers. He said the economics might have changed for nuclear power, but the risks have not.
“There was no public discussion about whether or not we’re going to restart Three Mile Island,” said Epstein. “You had this psychic tear in the fabric of the community that can't be papered over. You can put all the green paint you want on nuclear power, but there has been no viable solution to isolate nuclear waste.”
Constellation said the spent fuel on site has been safely stored on the island for decades, in fortified containers required by the government to withstand natural disasters, and that all the waste created in 40 years fits in an area about the size of two tennis courts.
Dauphin County Commissioner Justin Douglas said he’s listening to local concerns about the plant’s reopening.
“I personally am very interested in transparency and accountability for this in the sense of ensuring that it's as safe as it possibly can be, that we're tracking the cost and ensuring that the taxpayers aren't carrying any of the burden, that we have a good plan for the waste management and that ultimately the community impact is positive,” said Douglas. “We plan for the worst, and we hope for the best.”
‘A slam dunk’
Meeting the country’s rising demand for electricity will take a lot more than reviving a few recently decommissioned plants.
“It is a brilliant idea. It's sort of a slam dunk. The downside is that there are not many reactors out there that are realistically able to restart,” said Jacopo Buongiorno, professor of nuclear science and engineering at Massachusetts Institute of Technology. “You’re looking at a little bit less than three gigawatts of electricity, out of 50 that apparently are required for data centers and AI.”
There are also technical tweaks called
uprates
that can squeeze more power out of existing plants, which could help blunt the immediate electricity crunch.
“You probably have potential for another five to eight gigawatts across the whole fleet. So you add that up to the two or three that we get from the restarts, you're looking at 10 [gigawatts],” Buongiorno said, or only about a fifth of the
total AI power demand
expected by 2030.
“If that demand continues in the 2030s, then you can make the investment now to build new reactors,” he said, “and then nuclear can actually capture a lot more than 20%.”
Please stop using middleware to protect your routes (2024)
When talking about auth, there seems to be a certain group that’s adamant about using middleware to handle authorization. Middleware here refers to functions that run before every request.
I’m just confused at this point since you’re just re-implementing routing logic within middleware, an API provided by your routing library. And what do you do when you need to protect routes based on user roles?
const adminOnlyRoutes = ["/admin/*"];

app.middleware((req, res, next) => {
  if (!isProtected(req.path)) {
    return next();
  }
  const user = validateRequest(req);
  if (user) {
    // Re-implementing routing inside middleware just to know which rules apply.
    let requiresAdminRole = false;
    for (const route of adminOnlyRoutes) {
      if (matchRoute(route, req.path)) {
        requiresAdminRole = true;
        break;
      }
    }
    if (requiresAdminRole && !user.admin) {
      res.writeHeader(401);
      return;
    }
    return next();
  }
  res.writeHeader(401);
});
While route-level middleware (middleware that only applies to certain routes) may help in this simple example, routes in real-world applications aren’t often organized by their required permissions. What happens if you have multiple roles? What if you need to implement different rate-limiting on each route based on user roles? How about API access token permissions and scopes?
Abstractions aren’t the problem here. The issue is that middleware is the wrong abstraction. It’s just the most obvious solution that seems to make sense in a smaller scale.
But, we first have to answer: Do we need to abstract in the first place?
This goes beyond this rant, but I feel that, at least in the JavaScript ecosystem, people seem to go too far on abstractions and “simplicity.” It isn’t surprising given how loosey-goosey powerful JS can be. Auth, which includes both authentication and authorization, seems to be particularly vulnerable to this since people are overly scared of it. But auth is not an independent system from your application. It’s an integral part of it that affects and is affected by everything else. This makes it extra-hard to abstract without introducing unwanted complexity, since any abstraction that’s useful requires some level of flexibility.
Getting back to the middleware discussion, why not just add the auth check on each route?
If you’re too lazy to write some basic if checks, maybe that’s a you problem. But on a serious note, if you need to abstract, use wrapper functions. This is a much better approach than middleware since you don’t have to worry about routing. I also like that all the logic is defined in a single location instead of scattered across your project.
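To make that concrete, here is a minimal sketch of the wrapper-function approach. It reuses the pseudo-API from the snippet above (validateRequest, res.writeHeader); the route registration and response calls are only illustrative and not tied to any particular framework:

// Wrap a route handler so it only runs for authenticated requests.
const protectedRoute = (handler) => {
  return (req, res) => {
    const user = validateRequest(req);
    if (!user) {
      res.writeHeader(401);
      return;
    }
    // Pass the authenticated user straight to the handler.
    return handler(req, res, user);
  };
};

// Auth is declared right where the route is defined; no path matching to maintain.
app.get("/dashboard", protectedRoute((req, res, user) => {
  res.end("hello " + user.username);
}));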
If you deal with multiple permission levels (e.g. roles, scopes), you can just create a helper function for checking them. Again, abstractions themselves aren’t bad. You just need to implement them at the right level.
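Building on the protectedRoute wrapper above, such a helper might look like the sketch below (hasPermission, user.permissions, and the "admin" permission name are hypothetical, not from any particular library):

// Check a single permission; in a real app this could cover roles or token scopes.
const hasPermission = (user, permission) => {
  return user.permissions.includes(permission);
};

// Compose the helper with the wrapper: authentication first, then authorization.
const adminRoute = (handler) => {
  return protectedRoute((req, res, user) => {
    if (!hasPermission(user, "admin")) {
      res.writeHeader(401);
      return;
    }
    return handler(req, res, user);
  });
};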
This doesn’t mean middleware is useless. It works for global-level stuff like CSRF protection and providing data to each route. Actually, authenticating requests and passing the user object to each route is a great use of middleware (while letting each route handle authorization). But even then, you should probably replace it once you need to deal with exceptions and multiple patterns.
One common response I get to this opinion is that using middleware prevents developers from accidentally forgetting to add an auth check.
That’s why you test your code
. You should be testing your auth logic regardless of your implementation. Given that, adding auth checks to each route is less bug-prone and easier to debug than forcing an abstraction with middleware.
Object storage is the backbone of modern data infrastructure. AWS S3, Google Cloud Storage, MinIO, Ceph, newer players like Tigris Data—the market is saturated. So why build another one?
Because the fundamental assumptions behind these systems are shifting. High performance is no longer optional—but having high performance available isn’t the same as being able to afford using it.
Beyond “Cold Storage”: Why Performance Matters Now
Traditional object storage had a clear priority order: cost first, performance later. This worked fine for archiving backups and storing large, rarely accessed files.
But today, object storage is increasingly the primary data layer for AI, analytics, and cloud-native applications. Latency directly translates to compute costs—stalled GPUs waiting on I/O are expensive GPUs doing nothing.
High-performance object storage exists now. S3 Express One Zone, for example, delivers single-digit millisecond latency. But there’s a catch: the per-request pricing makes it prohibitively expensive to actually
use
at high IOPS. As one analysis put it, it’s “the right technology, at the right time with the wrong price” [1]. You have the performance on paper, but you can’t afford to run your workload at full speed. That’s the high-performance trap.
The New Challenge: AI and Analytical Workloads
Modern workloads, especially in AI, impose demands that strain traditional designs:
Small Objects at Scale: AI training datasets often consist of millions of small files (images, text snippets, feature vectors). A study of typical AI training workloads found over 60% of objects are 512KB or smaller [2]. This shifts the bottleneck from bandwidth to metadata performance.
Latency Sensitivity: Training loops and inference pipelines are bottlenecked by I/O. When fetching thousands of small objects per batch, per-object latency compounds quickly, stalling expensive GPUs.
The Need for Directories: S3’s flat namespace is a mismatch for many workflows. Data scientists expect atomic renames and efficient directory listings—operations that are either slow or missing in classic object stores.
“Why Not Just Use a Filesystem?”
A reasonable question: if you want directories and atomic rename, why not just use a filesystem like AWS EFS? Object stores and filesystems are different concepts—why blur the line?
The answer is that the line is
already
blurring, driven by real workload demands. AWS themselves recognized this when they introduced S3 Express One Zone with explicit “directory bucket” semantics and atomic rename support (currently single-object) [3]. Google Cloud has made similar moves toward hierarchical namespace support [4]. The industry is converging on this because the clean separation between “object storage for scale” and “filesystem for semantics” doesn’t match how modern applications actually work.
We’re not trying to build a POSIX filesystem. But the subset of filesystem semantics that matter for data workflows—efficient directory listings, atomic rename for safe data handoffs—these belong in object storage. The alternative is forcing every application to build fragile workarounds on top of a flat namespace.
Where Current Solutions Hit a Wall
Existing systems struggle with these patterns in predictable ways:
The High-Performance Trap: High-performance tiers like S3 Express One Zone solve the latency problem, but the per-request cost means you can’t actually use that performance at scale. At 10K PUT/s, you’re looking at ~$29K/month in request fees alone. The performance is there; the economics aren’t.
The Small Object Tax: With cloud object storage, you pay per request. Storing billions of 4KB objects means your API request costs can exceed your storage costs. The more objects you have, the worse it gets.
Missing Directory Semantics: The lack of atomic rename forces complex workarounds in applications, limiting what you can build directly on object storage. Most systems with rename support rely on inode-like structures that struggle with scalability and performance—adding to the per-IOPS cost burden.
Introducing FractalBits
We built FractalBits to break out of the high-performance trap: delivering performance you can actually afford to use at scale. In our benchmarks, we achieved nearly 1M GET/s on 4KB objects with a cluster totaling 64 cores across all data and metadata nodes.
Our focus:
High IOPS at a cost that makes sense—so you can actually run your workload at full speed.
Native directory semantics, including atomic rename.
Here’s what the gap looks like for a small-object intensive workload (4KB objects, 10K IOPS):
Metric                       | S3 Express One Zone | FractalBits    | Reduction
Monthly Cost for 10K PUT/s   | ~$29,290            | ~$166          | ~150×
Monthly Cost for 10K GET/s   | ~$778               | ~$42           | ~15×
Storage (1 TB Per Month)     | ~$110               | $0 (included)  | —
S3 costs based on public pricing ($0.00113/1K PUTs, $0.00003/1K GETs, $0.11/GB/Month). FractalBits estimated using 1-year reserved instance pricing for required compute (e.g., i8g.2xlarge for data, m7g.4xlarge for metadata). Your savings will vary based on workload, but the magnitude is indicative.
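As a sanity check on the headline PUT figure, here is a back-of-the-envelope calculation using the request price quoted in the footnote above (a 30-day month is assumed):

// Rough check of the ~$29,290/month figure for sustained 10K PUT/s.
const putsPerSecond = 10000;
const secondsPerMonth = 86400 * 30;                      // assuming a 30-day month
const putsPerMonth = putsPerSecond * secondsPerMonth;    // 25,920,000,000 requests
const pricePer1000Puts = 0.00113;                        // USD, from the footnote above
const monthlyPutCost = (putsPerMonth / 1000) * pricePer1000Puts;
console.log(Math.round(monthlyPutCost));                 // ≈ 29290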
At our core is a metadata engine built on an on-disk radix tree, optimized for path-like keys.
Most object stores use LSM-trees (good for writes, variable read latency) or B+ trees (predictable reads, write amplification). We chose a radix tree because it naturally mirrors a filesystem hierarchy:
Prefix Sharing: Common path segments (e.g., /datasets/cifar10/) are stored once, saving memory and speeding up traversal.
Efficient Directory Operations: Listing a directory becomes a subtree scan. Atomic rename is essentially updating a pointer at the branch point, not copying data.
Crash Consistency: We use physiological logging to ensure metadata integrity and fast recovery.
Unlike most systems that use inode-based (or inode-like) structures to support directory features, we use a full-path approach for better scalability and performance.
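To make the prefix-sharing and rename-as-pointer-update ideas concrete, here is a deliberately simplified, in-memory sketch. It is not FractalBits’ actual engine: it ignores persistence and logging, and it stores one node per path segment rather than compressing chains the way a real radix tree does. It only illustrates why a directory listing is a subtree scan and why renaming a directory never copies the objects underneath it.

class Node {
  constructor() {
    this.children = new Map(); // path segment -> child node; shared prefixes are stored once
    this.value = null;         // object metadata, if a key terminates here
  }
}

class PathTree {
  constructor() { this.root = new Node(); }

  segments(path) { return path.split("/").filter(Boolean); }

  put(path, value) {
    let node = this.root;
    for (const seg of this.segments(path)) {
      if (!node.children.has(seg)) node.children.set(seg, new Node());
      node = node.children.get(seg);
    }
    node.value = value;
  }

  findNode(path) {
    let node = this.root;
    for (const seg of this.segments(path)) {
      node = node.children.get(seg);
      if (!node) return null;
    }
    return node;
  }

  // Listing a "directory" is just a walk over one subtree.
  list(prefix) {
    const results = [];
    const walk = (node, path) => {
      if (node.value !== null) results.push(path);
      for (const [seg, child] of node.children) walk(child, path + "/" + seg);
    };
    const segs = this.segments(prefix);
    const start = this.findNode(prefix);
    if (start) walk(start, segs.length ? "/" + segs.join("/") : "");
    return results;
  }

  // Renaming a directory moves a single child pointer; nothing underneath is copied.
  rename(oldPath, newPath) {
    const oldSegs = this.segments(oldPath);
    const newSegs = this.segments(newPath);
    const oldParent = this.findNode("/" + oldSegs.slice(0, -1).join("/"));
    const subtree = oldParent && oldParent.children.get(oldSegs[oldSegs.length - 1]);
    if (!subtree) return false;
    let newParent = this.root;
    for (const seg of newSegs.slice(0, -1)) {
      if (!newParent.children.has(seg)) newParent.children.set(seg, new Node());
      newParent = newParent.children.get(seg);
    }
    oldParent.children.delete(oldSegs[oldSegs.length - 1]);
    newParent.children.set(newSegs[newSegs.length - 1], subtree);
    return true;
  }
}

// Example: renaming the directory moves one pointer, then listing walks the subtree.
const tree = new PathTree();
tree.put("/datasets/cifar10/batch-0", { size: 4096 });
tree.put("/datasets/cifar10/batch-1", { size: 4096 });
tree.rename("/datasets/cifar10", "/datasets/cifar10-v2");
console.log(tree.list("/datasets"));
// ["/datasets/cifar10-v2/batch-0", "/datasets/cifar10-v2/batch-1"]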
By the way, we implemented the core engine in Zig for control and predictable performance.
Why Zig?
comptime metaprogramming generates optimized code paths for different node types at compile time
Manual memory management means no GC pauses and predictable latency
Direct SIMD access for parallel key comparisons within tree nodes
io_uring in the standard library, so that we can easily try more recent io_uring kernel features (registered buffers, NVMe IOPOLL, etc.)
The Gateway: Rust-Based S3-Compatible API server
Our S3-compatible API server, built in Rust, manages the data path:
Safety & Concurrency: Rust’s ownership model gives us thread safety without a garbage collector—important for high-concurrency request handling.
Async I/O: Built on Tokio for handling thousands of concurrent connections.
Production-Ready Frameworks: We support both axum and actix-web, defaulting to actix-web. Its thread-per-core architecture aligns with our design for maximum performance.
The Model: Bring Your Own Cloud (BYOC)
FractalBits deploys as a managed software layer within your own cloud account (currently AWS only).
For you:
Cost transparency—you pay the cloud provider’s raw costs for VMs and disks, no egress fees to us
Data sovereignty—your data never leaves your cloud tenant
Low latency—deploy in the same region/VPC as your compute
For us:
We leverage the cloud’s proven infrastructure instead of building it from scratch, letting us focus on the storage engine itself.
Looking Ahead
The object storage market has high-performance options, but the economics often make that performance unusable at scale. And systems that do offer directory semantics often struggle with performance or scalability. Getting both at a reasonable cost is still rare. We think there’s room for a different approach.
FractalBits is our answer. We’re early in this journey and learning from users who are pushing these limits.
Hitting the performance or cost wall with your current object storage? We’d be interested to hear about your use case.
Germany's new Intercity Express train is seen in Berlin prior to its official presentation by railway operator Deutsche Bahn, on Oct. 17.
Tobias Schwarz/AFP via Getty Images
EN ROUTE TO BERLIN — As the 12:06 p.m. Intercity Express train to Berlin leaves the Swiss city of Bern and crosses the border into Germany, passengers reluctantly bid farewell to punctuality — a guarantee in the Alpine republic where trains run like clockwork.
Fifty-seven-year-old Elisabeth Eisel regularly takes this seven-hour train journey. "Trains in Switzerland are always on time, unless they're arriving from Germany," she says. "Harsh but true, sadly. It didn't used to be the case."
Chronic underinvestment in Germany has derailed yet another myth about Teutonic efficiency. The German railway Deutsche Bahn's long-distance "high-speed" trains are now
among
the least punctual in
Europe
. In October, the national rail operator
broke its own poor record
with only about half of all long-distance trains arriving without delay.
Waning reliability is but one of many problems for state-owned Deutsche Bahn, which is
operating at a loss
and regularly subjects its passengers to poor or no Wi-Fi access, seat reservation mix-ups, missing train cars and "technical problems" — a catch-all reason commonly cited by conductors over the train intercom.
German Transport Minister Patrick Schnieder (second from left) and Evelyn Palla (third from left), CEO of Deutsche Bahn, get off the train at the premiere of the new Intercity Express train at Berlin Ostbahnhof, Oct. 17.
Christoph Soeder/picture alliance via Getty Images
After
decades of neglect
, the government has announced a
100-billion-euro
investment in rail infrastructure. But Lukas Iffländer, vice chair of the railway passenger lobby group Pro Bahn, says it will take
more than money
to get German trains back on track.
"We are now paying the price for years and years of neglect, basically since 1998," Iffländer says. It's not just crumbling tracks and sticky signals that need attention, he explains, but the network operator's overly bureaucratic infrastructure.
"Every process at Deutsche Bahn is really complicated," Iffländer says. "It takes forever and that frustrates the people that actually want to do something."
Iffländer says Deutsche Bahn is top heavy: While there are not enough train engineers and signal operators, there are too many managers sitting at desks.
German news weekly
Der
Spiegel
recently reported
that upper management has allegedly approved canceling long-distance trains to bump up punctuality ratings because canceled trains are not recorded in the statistics.
Deutsche Bahn declined NPR's requests for an interview, but in a written statement it denied embellishing its data. It said that the
Spiegel
report is "based on chat messages between dispatchers," not "actual data used for collecting statistics."
On a
different
train — the 11:18 a.m. from Munich to Berlin — passengers are packed like sardines at double capacity because another fully booked Intercity Express was canceled at the very last minute.
The mood is surprisingly jolly, despite the fact that half of the passengers have been standing for more than four hours now — with no hope of getting through the crowded carriages to use the restroom.
Catherine Launay, 51, is lucky enough to have a seat. She's from France and says she's surprised passengers are not kicking up more of a fuss.
"If this had been a French train, there'd have been more of an uproar!" Launay quips. "In fact, French passengers would have revolted by now."
In an effort to prevent aggressive passenger behavior toward train staff, Deutsche Bahn has launched a
mockumentary series
for TikTok, Instagram and YouTube about a train crew struggling to cope under increasingly preposterous conditions.
The fictional train staff's dance routine to a techno beat, while singing "
zenk yoo for träveling wiz Deutsche Bahn
," has gone down surprisingly well with passengers, even if they can't actually watch it on board because the Wi-Fi can't cope with streaming.
And as our train rattles along the track, it's difficult to differentiate between Deutsche Bahn parody and reality. The train conductor wishes passengers a pleasant journey "as far as it's possible," adding "we should just about make it to Berlin." The train car chortles.
But Deutsche Bahn is no laughing matter for Federal Transport Minister Patrick Schnieder, who recently
warned
that "many equate the malfunctioning of railways with the malfunctioning of our state."
Many are putting their hopes in the railway company's new CEO, Evelyn Palla, based on her track record at Austrian Federal Railways.
Palla announced plans this week to make Deutsche Bahn more trim and efficient by eliminating executive positions, but she warned that there's so much to fix, it will take time.
As we finally pull into Berlin's main train station, passengers are resigned to the fact that — whether it's signal failure, humor failure or state failure — Germany's trains appear to have gone off the rails.
YouTube's CEO limits his kids' social media use – other tech bosses do the same
Neal Mohan, the CEO of YouTube speaks during a panel for the Summit for Democracy on March 30, 2023 in Washington, DC.
Anna Moneymaker | Getty Images
YouTube's CEO Neal Mohan is the latest in a line of tech bosses who have admitted to limiting their children's social media use, as the harms of being online for young people have become more evident.
Mohan, who took the helm of YouTube's leadership in 2023, was just named Time's 2025 CEO of the Year. He said in an
interview with the magazine
that his children's use of media platforms is controlled and restricted.
"We do limit their time on YouTube and other platforms and other forms of media. On weekdays we tend to be more strict, on weekends we tend to be less so. We're not perfect by any stretch," Mohan said in one
TikTok video
posted by Time Magazine on Thursday.
He stressed "everything in moderation" is what works best for him and his wife, and that extends to other online services and platforms. Mohan has three children: two sons and one daughter.
Experts have continued to sound the alarm on how excessive smartphone and social media use has harmed children and teenagers. Jonathan Haidt, NYU professor and author of "The Anxious Generation," has advocated for
children to not have smartphones
before the age of 14 and no access to social media before the age of 16.
"Let them have a flip phone, but remember, a smartphone isn't really a phone. They could make phone calls on it, but it's a multi-purpose device by which the world can get to your children," Haidt said in an interview with CNBC's Tania Bryer earlier this year.
This week, Australia
became the first country to formally bar
users under the age of 16 from accessing major social media platforms. Ahead of the legislation's passage last year, a
YouGov survey
found that 77% of Australians backed the under-16 social media ban. Still, the rollout has
faced some resistance
since becoming law.
Mohan said in a more
extensive interview with Time
on Wednesday that he feels a "paramount responsibility" to young people and giving parents greater control over how their kids use the platform. YouTube Kids was launched in 2015 as a child-friendly version of the Google-owned platform.
He said his goal is "to make it easy for all parents" to manage their children's YouTube use "in a way that is suitable to their household," especially as every parent has a different approach.
Bill Gates, Mark Cuban
Several tech bosses
have taken a similar approach.
YouTube's former CEO Susan Wojcicki
also barred her children from browsing videos on the app, unless they were using YouTube Kids. She also limited the amount of time they spent on the platform.
"I allow my younger kids to use YouTube Kids, but I limit the amount of time that they're on it," Wojcicki told CNBC in 2019. "I think too much of anything is not a good thing."
Bill Gates, Microsoft's co-founder, is amongst the tech titans who are against allowing young people too much screen time. With three children, now adults, Gates openly talked about not giving them cell phones until they were in their teens.
"We don't have cell phones at the table when we are having a meal, we didn't give our kids cell phones until they were 14 and they complained other kids got them earlier," Gates said years ago.
Meanwhile, billionaire Mark Cuban even resorted to installing Cisco routers and using management software to monitor which apps his children were on and to shut off their phone activity.
It's true. The odds are finally in your favor.
The Typeframe PX-88 is an integrated system that has been perfectly arranged to guarantee a superior outcome for the operator. Leave it to Typeframe to integrate these critical elements into one commanding machine.
The PX-88 delivers all the power and specialized features expected from a professional system - but built around a dedicated, uncompromising user experience. Is it a cyberdeck or a writerdeck? It's whatever you need it to be. The reliable Raspberry Pi 4 B core handles demanding web-based editors and complex tasks with robust performance. The compact size belies the strength within.
A mechanical keyboard provides a superior, tactile input experience - a professional tool unmatched by common consumer electronics. Furthermore, the system is designed for simple construction with minimal required soldering, and maintenance is streamlined - all internal components are easily reached via sliding access panels.
If you have been looking for a portable, professional computer where input quality meets core performance, look at the PX-88.
Typeframe. Built for your best work, built by you.
Rich Headers: leveraging this mysterious artifact of the PE format
Smaller = faster, and we all want faster. Moore's law is over, Dennard scaling isn't affordable any more, smaller feature sizes are getting absurdly difficult and therefore expensive to fab. So if we want our computers to keep getting faster as we've got used to over the last 40-50 years then the only way to keep delivering that will be to start ruthlessly optimising, shrinking, finding more efficient ways to implement what we've got used to.
Smaller systems are better for performance.
* The smaller the code, the less there is to go wrong.
Smaller doesn't just mean faster, it should mean simpler and cleaner too. Less to go wrong. Easier to debug. Wrappers and VMs and bytecodes and runtimes are bad: they make life easier but they are less efficient and make issues harder to troubleshoot. Part of the Unix philosophy is to embed the KISS principle.
So that's performance and troubleshooting. We aren't done.
* The less you run, the smaller the attack surface.
Smaller code and less code means fewer APIs, fewer interfaces, fewer points of failure. Look at djb's decades-long policy of offering rewards to people who find holes in qmail or djbdns. Look at OpenBSD. We all need better, more secure code. Smaller, simpler systems built from fewer layers mean more security, less attack surface, less to audit.
Higher performance, easier troubleshooting, and better security. That's three reasons.
Practical examples...
The Atom editor spawned an entire class of app: Electron apps, Javascript on Node, bundled with Chromium. Slack, Discord, VSCode: there are multiple apps used by tens to hundreds of millions of people now. Look at how vast they are. Balena Etcher is a, what, nearly 100 MB download to write an image to USB? Native apps like Rufus do it in a few megabytes. Smaller ones like USBimager do it in hundreds of kilobytes. A dd command in under 100 bytes.
Now some of the people behind Atom wrote Zed.
It's 10% of the size and 10x the speed, in part because it's a native Rust app.
The COSMIC desktop looks like GNOME, works like GNOME Shell, but it's smaller and faster and more customisable because it's native Rust code.
GNOME Shell is Javascript running on an embedded copy of Mozilla's Javascript runtime.
Just like dotcoms wanted to dis-intermediate business, remove middlemen and distributors for faster sales, we could use disintermediation in our software. Fewer runtimes, better smarter compiled languages so we can trap more errors and have faster
and
safer compiled native code.
Smaller, simpler, cleaner, fewer layers, fewer abstractions: these are all good things, and all desirable.
Dennis Ritchie and Ken Thompson knew this. That's why Research Unix evolved into Plan 9, which puts way more stuff through the filesystem to remove whole types of API. Everything's in a container all the time, the filesystem abstracts the network and the GUI and more. Under 10% of the syscalls of Linux, the kernel is 5MB of source, and yet it has much of Kubernetes in there.
Then they went further, replaced C too, made a simpler safer language, embedded its runtime right into the kernel, and made binaries CPU-independent, and turned the entire network-aware OS into a runtime to compete with the JVM, so it could run as a browser plugin as well as a bare-metal OS. Now we have ubiquitous virtualisation so lean into it: separate domains. If your user-facing OS only runs in a VM then it doesn't need a filesystem or hardware drivers, because it won't see hardware, only virtualised facilities, so rip all that stuff out. Your container host doesn't need to have a console or manage disks.
This is what we should be doing. This is what we need to do. Hack away at the code complexity. Don't add functionality, remove it. Simplify it. Enforce standards by putting them in the kernel and removing dozens of overlapping implementations. Make codebases that are smaller and readable by humans.
Leave the vast bloated stuff to commercial companies and proprietary software where nobody gets to read it except LLM bots anyway.
How a US Citizen Was Scanned With ICE's Facial Recognition Tech
404 Media
www.404media.co
2025-12-13 08:01:06
Jesus Gutiérrez told immigration agents he was a U.S. citizen. Only after they scanned his face did the agents let him go....
This article is a partnership between Reveal and 404 Media.
Jesus Gutiérrez, 23, was walking home one morning from a Chicago gym when he noticed a gray Cadillac SUV with no license plates. He kept walking, shrugging it off. Then the car pulled over and two men got out.
The federal immigration officials told him not to run. They then peppered Gutiérrez with questions: Where are you going? Where are you coming from? Do you have your ID on you?
Gutiérrez is a U.S. citizen. He told the officials this. He didn’t have any identification on him, but, panicking, he tried to find a copy on his phone. The agents put him into the car, where another two agents were waiting, and handcuffed him. Just sit there and be quiet, they said.
Without Gutiérrez’s ID, the agents resorted to another approach. They took a photo of his face. A short while later, the agents got their answer: “Oh yeah, he’s right. He’s saying the right thing. He does got papers,” Gutiérrez recalled the agents saying.
💡
Has this happened to you or someone you know? Do you have any videos of ICE or CBP scanning people's faces? Do you work for either agency? I would love to hear from you. Using a non-work device, you can message me securely on Signal at joseph.404 or send me an email at joseph@404media.co.
Gutiérrez’s experience, which he recounted to
Reveal
, is one snapshot of something that federal authorities have acknowledged to 404 Media that they are doing across the country: scanning people’s faces with a facial recognition app that brings up their name, date of birth, “alien number” if they’re an immigrant, and whether they have an order of deportation.
404 Media previously obtained
internal Immigration and Customs Enforcement (ICE) emails revealing the agency’s facial recognition app, called Mobile Fortify, and catalogued social media videos showing agents scanning people’s faces
to verify their citizenship
.
Now,
Reveal
has spoken to a person who appears to have had that technology used against them. Gutiérrez sent
Reveal
a copy of his passport to verify his citizenship.
“You just grabbing, like, random people, dude,” Gutiérrez said he told the agents after they scanned his face. The officials eventually dropped off Gutiérrez after driving for around an hour. For several days, he didn’t go anywhere, not even to the gym. Gutiérrez told his father at the time that he “got kidnapped.”
“This is a flagrant violation of rights and incompatible with a free society,” said Nathan Freed Wessler, deputy project director for the American Civil Liberties Union’s (ACLU) Speech, Privacy, and Technology Project. “Immigration agents have no business scanning our faces with this glitchy, privacy-destroying technology—especially after often stopping people based on nothing more than the color of their skin or the neighborhood they live in.”
A screenshot of an internal DHS document obtained by 404 Media.
Available here
.
Mobile Fortify is available to ICE and Customs and Border Protection (CBP) officials on their work-issued phones. After an agent scans someone’s face, the app queries an unprecedented collection of U.S. government databases, including one run by the FBI and another that checks for outstanding state warrants, according to user manuals seen by 404 Media. The app runs the person’s face against a database of 200 million images,
according to internal ICE material
404 Media viewed.
“The photograph shown [in the app’s results] is the photograph that was taken during the individual’s most recent encounter with CBP, however the matching will be against all pictures CBP may maintain on the individual,” said an internal Department of Homeland Security (DHS) document
404 Media obtained
. The app turns the system usually used for verifying travelers at the border inward against people on U.S. streets.
The need for Mobile Fortify, according to that internal document, is for immigration authorities to identify people who can be removed from the country. But it acknowledges that it may be used against U.S. citizens, like in Gutiérrez’s case.
“It is conceivable that a photo taken by an agent using the Mobile Fortify mobile application could be that of someone other than an alien, including U.S. citizens or lawful permanent residents,” the document reads.
Rep. Bennie G. Thompson, ranking member of the House Homeland Security Committee,
previously told 404 Media
that ICE will prioritize the results of the app over birth certificates. “ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person’s status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien,” he said. “ICE using a mobile biometrics app in ways its developers at CBP never intended or tested is a frightening, repugnant, and unconstitutional attack on Americans’ rights and freedoms.”
404 Media has found
other instances in which ICE and CBP agents have used a facial recognition app to verify someone’s identity and citizenship. In one that appeared to take place in Chicago, a Border Patrol officer stopped two young men on bicycles before asking his colleague, “Can you do facial?” The other official then scanned one of the boy’s faces, according to a video posted on social media. In another, a group of ICE officers surrounded a man driving a car. He said he was an American citizen. “Alright, we just got to verify that,” one of them said. A second then pointed their phone’s camera at the man and asked him to remove his hat. “If you could take your hat off, it would be a lot quicker,” the officer said. “I’m going to run your information.”
In Gutiérrez’s case, there is little indication that he was stopped for any reason beyond the color of his skin. He is of Mexican descent, he said. Stops of people based on their race, use of Spanish, or location (such as a car wash or bus stop) have become known among critics as “Kavanaugh stops,” after Supreme Court Justice Brett Kavanaugh justified the method
in a September opinion
.
“The Government sometimes makes brief investigative stops to check the immigration status of those who gather in locations where people are hired for day jobs; who work or appear to work in jobs such as construction, landscaping, agriculture, or car washes that often do not require paperwork and are therefore attractive to illegal immigrants; and who do not speak much if any English,” the opinion says. (Gutiérrez speaks Spanish but conducted his interview with
Reveal
in English.) “If the officers learn that the individual they stopped is a U.S. citizen or otherwise lawfully in the United States, they promptly let the individual go. If the individual is illegally in the United States, the officers may arrest the individual and initiate the process for removal.”
The ACLU’s Wessler added: “In the United States, we should be free to go about our business without government agents scanning our faces, accessing our personal information, saving our photos for years, and putting us at risk of misidentifications and wrongful detentions. ICE and CBP’s use of Mobile Fortify on the streets of America should end immediately.”
DHS Assistant Secretary Tricia McLaughlin said in a statement, “DHS is not going to confirm or deny law enforcement capabilities or methods.” CBP said that the agency built the app to support ICE operations and that it has been used by ICE around the country.
A CBP spokesperson added in a statement, “Mobile Fortify is a law enforcement app developed by U.S. Customs and Border Protection for ICE agents and officers. It helps field personnel gather information during immigration inspections, but agents must consider all circumstances before deciding on someone's immigration status. CBP personnel working with ICE teams can access the app after completing required training. Further details cannot be shared due to law enforcement sensitivities.”
Gutiérrez said that at the end of his encounter, while he was still in the car, the agents were laughing.
About the author
Joseph is an award-winning investigative journalist focused on generating impact. His work has triggered hundreds of millions of dollars worth of fines, shut down tech companies, and much more.
In the last few days I’ve managed to finalize work on the UringMachine fiber
scheduler. This meant making sure the fiber scheduler is feature complete, that is,
that it implements all the different Fiber Scheduler hooks and their expected
behaviour. To make sure of this, I also spent a couple of days writing test
cases, not only for the fiber scheduler, but also for UM’s low-level API.
Beyond the tests, I wrote a series of benchmarks to have an idea of how
UringMachine compares to other concurrency solutions:
You can consult the full results
here
.
I’ll refrain from making overly generalized statements about what these
benchmark results mean, but I think they demonstrate the promise of working with
fibers to create concurrent Ruby apps.
So, as these benchmarks show, the Fiber Scheduler can bring significant benefits
to concurrent Ruby apps, with minimal changes to the code (basically, instead of
Thread.new
you’ll use
Fiber.schedule
). The fact that the scheduler does the
I/O transparently behind the scenes and integrates with the rest of the Ruby
ecosystem feels almost like magic.
So I think this really validates the approach of Samuel Williams in designing
how the fiber scheduler interfaces with the rest of the Ruby runtime. And the
fact that the web server he authored,
Falcon
, is now used in production at
Shopify, is an even stronger validation!
Here’s a detailed report of my work this last week:
Samuel has
fixed
the issue with the
hanging
#pwrite
(it turns out that the
#io_pwrite
hook was being invoked
with the GVL released).
Added support for
SQPOLL
mode
when setting up a
UringMachine instance. It’s not clear to me what the performance
implications of that are, but I’ll try to make some time to check this against
TP2
, a UringMachine-based web server I’m
currently using in a bunch of projects.
Started looking at getting
#io_close
to work, and found out that Samuel has
already done the work; that is, the code was already there, but was commented
out. Samuel explained that it was impossible to get it to work due to the
complexity of the implementation of
IO#close
, and indeed when I tried it
myself I saw that in fact it was just not possible the way the IO state is
managed when an IO is closed. I then had the idea that maybe we could pass the
underlying fd instead of the IO object itself to the
#io_close
hook. The
only issue is that this breaks the convention where the different
io_xxx
hooks take an io as their first argument. Nevertheless, I suggested this idea
to Samuel and gladly he accepted when he saw this is the only way we can make this
hook work. Samuel then proceeded to prepare a
PR
and merge it.
Added the
#io_close
hook to the UringMachine fiber scheduler, as well as a
#yield
hook for dealing with thread interrupts in response to another
PR
by Samuel. I also added missing
docs for the different methods in the fiber scheduler.
Spent a lot of time writing lots of tests for the fiber scheduler. I tried to
cover the entire
IO
API - both class- and instance methods. I also wrote
some “integration” tests - different scenarios not unlike those in the
benchmarks, which exercise the different hooks in the fiber scheduler.
Added some new APIs to help with testing:
UM#await_fibers
is a method for
waiting for one or more fibers to terminate. Unlike
UM#join
, it doesn’t
return the return values of the given fibers; it just waits for them to
terminate. Another new API is
UM.socketpair
, which is like
Socket.socketpair
except it returns raw fd’s.
Fixed some small issues in the UM fiber scheduler and in the UM low-level API
implementation.
Added and streamlined metrics that indicate the following:
The ring size
Total number of ops
Total number of fiber switches
Total number of waits for CQEs
Current number of pending ops
Current number of unsubmitted ops
Current size of runqueue
Current number of transient ops
Current number of free ops
I also added some basic time measurements:
Total CPU time
Total time spent waiting for CQEs
These are off by default, but can be enabled by calling
UM#profile(true)
.
I’d like to do a lot more with profiling, like measuring the CPU time spent on
each fiber, but I’m a bit apprehensive of the performance costs involved, as
getting the
CLOCK_THREAD_CPUTIME_ID
clock is relatively slow, and then
managing this for each fiber means getting and setting a couple of instance
variables, which can
really
slow things down. On top of that, I’m not that
sure this is really needed.
What’s Next for UringMachine
One of the ideas I discussed with Samuel is to add support for registered
buffers that integrates with the
IO::Buffer
class. While UringMachine
already has support for buffer rings, it uses a custom implementation of
buffers. So I might start by converting this to use
IO::Buffer
instead.
I’d also like to do a bit more work on performance tuning the UringMachine
low-level API, specifically to be able to control the maximum number of fiber
context switches before doing I/O work, i.e. submitting ops and checking for
completions.
Beyond that, I also want to spend some time documenting the UringMachine API,
as it is sorely lacking, and I’d like for other people to be able to play with
it.
I present to you, dear reader, a spiral containing every
0
Unicode 14 character in the
GNU Unifont
. Starting at the centre with the control characters, spiralling clockwise through the remnants of ASCII, and out across the entirety of the Basic Multi Lingual Plane. Then beyond into the esoteric mysteries of the Higher Planes
1
.
Zoom in for the massiveness
. It's a 10,000x10,000px image. Because the Unifont displays individual characters in a 16x16px square, it is quite legible even when printed out on a domestic laser printer at 600dpi:
I also made it as a square spiral - which fits into a smaller space.
Again, printed out at 600dpi it is readable. Just!
Printed onto A0 - 841mm square - it's a bit better. The ASCII set is readable:
But characters in CJK weren't particularly legible:
If I wanted the 16px symbols to each be 5mm wide, I'd need to print this on paper over 3 metres wide!
Because visualising one-dimensional data structures in two-dimensional space is
fun!
That's why 😃
I was inspired by seeing two lovely pieces of artwork recently.
The first was 2015's
Unicode in a spiral
by Reddit user cormullion.
(Click to embiggen.)
It's gorgeous, but doesn't include all characters. Oh, and you also have to rotate your head to read each character.
There's a larger version which covers a lot more of the Basic Multilingual Plane
It's an
18MB PDF
. And, because of the resolution of the font, it needs to be printed out on a 1 metre square at a minimum.
The second interesting thing I found was a 2016 Hilbert Curve of Unicode:
The
Hilbert Curve poster
is beautiful. But it only goes up to Unicode 10 - and we're on Unicode 14 by now. Despite the æsthetically pleasing nature of fractal curves, I find them quite un-intuitive.
Neither show off the
gaps
in Unicode. That is, where there is space to fit more symbols.
So I wanted to do something which satisfied these criteria:
Although I wanted
every
character, there are some practical problems. Firstly:
Unifont only stores one glyph per printable Unicode code point. This means that complex scripts with special forms for letter combinations including consonant combinations and floating vowel marks such as with Indic scripts (Devanagari, Bengali, Tamil, etc.) or letters that change shape depending upon their position in a word (Indic and Arabic scripts) will not render well in Unifont.
So there are some scripts which will look a bit ugly. And some characters which won't be well represented.
The second issue is one of size. Some of the newer characters are simply too big:
Scripts such as Cuneiform, Egyptian Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16 pixel grid. There are plans to draw these scripts on a 32-by-32 pixel grid in the future.
That means it misses out on characters like
𒀰
,
𒁏
and, of course,
𒀱
. Which, to be fair, would be hard to squeeze in!
The third problem is that Unicode is updating all the time. Although the Unifont is at Version 14 -
Python's Unicode Database
is stuck at V13. Luckily, there is a library called
UnicodeData2
which includes V14.
But, given those limitations, I thought it was possible to craft something nice.
Python 3

import math

def spiral_points(arc=1, separation=1):
    # Adapted from https://stackoverflow.com/a/27528612/1127699
    """Generate points on an Archimedes' spiral, with `arc` giving the length
    of arc between two points and `separation` giving the distance between
    consecutive turnings.
    - approximate arc length with circle arc at given distance
    - use a spiral equation r = b * phi
    """
    def polar_to_cartesian(r, phi):
        return (round(r * math.cos(phi)),
                round(r * math.sin(phi)))

    # yield a point at origin
    yield (0, 0)

    # initialize the next point in the required distance
    r = arc
    b = separation / (2 * math.pi)
    # find the first phi to satisfy distance of `arc` to the second point
    phi = float(r) / b

    while True:
        yield polar_to_cartesian(r, phi)
        # advance the variables:
        # calculate phi that will give the desired arc length at the current
        # radius (approximating with a circle)
        phi += float(arc) / r
        r = b * phi
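As a quick sanity check (my own example; the arc and separation values are a guess matching the 16px glyph cells, not parameters taken from the original code):

Python 3

from itertools import islice

# The first few (x, y) cells along the spiral; each cell receives the next glyph.
print(list(islice(spiral_points(arc=16, separation=16), 8)))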
Python 3

n = 12
nested_list = [[0 for i in range(n)] for j in range(n)]
low = 0
high = n - 1
x = 1
levels = int((n + 1) / 2)

for level in range(levels):
    for i in range(low, high + 1):
        nested_list[level][i] = x
        x += 1
    for i in range(low + 1, high + 1):
        nested_list[i][high] = x
        x += 1
    for i in range(high - 1, low - 1, -1):
        nested_list[high][i] = x
        x += 1
    for i in range(high - 1, low, -1):
        nested_list[i][low] = x
        x += 1
    low += 1
    high -= 1

for i in range(n):
    for j in range(n):
        # print the row elements with a tab space after each element
        print(nested_list[i][j], end="\t")
    print()  # print a new line after each row
Python 3

from fontTools.ttLib import TTFont

font = TTFont(fontpath)  # specify the path to the font in question

def char_in_font(unicode_char, font):
    for cmap in font['cmap'].tables:
        if cmap.isUnicode():
            if ord(unicode_char) in cmap.cmap:
                return True
    return False
But, of course, it is a bit more complicated than that. The Unifont contains some placeholder glyphs - the little black square with hex digits in them that you see here:
I didn't want to draw them. But they exist in the font. So how do I skip them?
Using the
Python Unicode Database
it's possible to look up the name of a Unicode code-point. e.g.
chr(65)
is
LATIN CAPITAL LETTER A
. So if there is
no name
in the database, skip that character.
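In other words, something along these lines (my own sketch, using the standard unicodedata module rather than the UnicodeData2 package mentioned above):

Python 3

import unicodedata

def has_name(codepoint):
    # unicodedata.name() raises ValueError for code points without a name,
    # so supplying a default lets "no name" act as the skip signal.
    return unicodedata.name(chr(codepoint), None) is not None

print(has_name(65))       # True:  LATIN CAPITAL LETTER A
print(has_name(0x0378))   # False: currently unassigned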
But, of course, it is a bit more complicated than that! The Unicode database only goes up to Unicode 13. And, for some reason, the control characters don't have names. So the code becomes a tangled mess of
if...else
statements. Ah well!
Drawing the characters
should
have been easy. I was using
Pillow to draw text
. But, despite the pixely nature of the font itself Pillow was performing anti-aliasing - creating unwanted grey subpixels.
Sadly,
Pillow can't draw non-printable glyphs
- even when the font contains something drawable. This is because it can't pass the correct options to the harfbuzz library.
So, I went oldskool! I converted every glyph in the font to a PNG and saved them to disk.
Python 3

from fontforge import *

font = open("unifont_upper-14.0.04.ttf")
for i in range(len(font)):
    try:
        font[i].export("pngs/" + str(i) + ".png", pixelsize=16, bitdepth=1)
    except Exception as e:
        print(str(i))
        print(e)
Look, if it's hacky but it works; it isn't hacky! Right?
From there, it's a case of opening the .png and pasting it onto the canvas:
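Roughly like this (my own sketch; the canvas mode, size, and file naming are assumptions rather than the post's actual code):

Python 3

from PIL import Image

canvas = Image.new("1", (10000, 10000), color=1)   # 1-bit canvas, white background

def paste_glyph(index, x, y):
    # The FontForge step above exported each glyph as pngs/<index>.png
    glyph = Image.open("pngs/" + str(index) + ".png")
    canvas.paste(glyph, (x, y))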
And now we hit the final problem. The image was over 20,000 pixels wide. Why? The
Variation Selectors
! The last of which is at position
U+E01EF
. Which means the spiral looks like this:
The GNU Unifont has a dual licence.
GPL2 and OFL
. The image is a "document" for the purposes of the OFL and the GPL font exemption. But, I guess you could reverse engineer a font-file from it. So, if you use the image to generate a font, please consider that it inherits the
original licence
. If you just want to print it out, or use it as art, then the image itself is
CC BY-SA
.
I would like to print this out on paper. At 200dpi, it would be about 1.5m squared. Which I guess is possible, but might be expensive.
At 600dpi, the square will just about fit on A3 paper. But the quality is atrocious. Even at A0 it wasn't great. Realistically, it needs to be at least 3.3 metres along each side! No idea where I can find a printer which will do that. Or where in my house I'd have space for it!
Of course, it will need updating whenever there is a new release of either Unicode or Unifont.
If you have any suggestions or feedback - please drop them in the comment box!
A spoofed email address and an easily faked document is all it takes for major tech companies to hand over your most personal information.
Photograph: Maxkabakov/Getty Images
When a privacy
specialist at the legal response operations center of Charter Communications received an emergency data request via email on September 4 from Officer Jason Corse of the Jacksonville Sheriff’s Office, it took her just minutes to respond, with the name, home address, phone numbers, and email address of the “target.”
But the email had not in fact come from Corse or anyone else at the Jacksonville Sheriff’s Office. It was sent by a member of a hacking group that provides doxing-as-a-service to customers willing to pay for highly sensitive personal data held by tech companies in the United States.
“This took all of 20 minutes,” Exempt, a member of the group that carried out the ploy, told WIRED. He claims that his group has been successful in extracting similar information from virtually every major US tech company, including Apple and Amazon, as well as more fringe platforms like video-sharing site Rumble, which is popular with far-right influencers.
Exempt shared the information Charter Communications sent to the group with WIRED, and explained that the victim was a “gamer” from New York. When asked if he worried about how the information he obtained was used against the target, Exempt said: “I usually do not care.”
The victim did not respond to WIRED’s requests for comment.
“It is definitely concerning to hear criminals impersonating officers in such a manner, more so when they are claiming to be one of our employees,” says Christian Hancock, the media relations manager at the Jacksonville Sheriff’s Office. Officer Corse declined to comment.
Charter Communications declined to comment.
This method of tricking companies into handing over information that can be used to harass, threaten, and intimidate victims has been
known about for years
. But WIRED has gained unprecedented insight into how one of these doxing groups operates, and why, despite years of warnings, it is still happening so often.
The Charter Communications incident was one of up to 500 successful requests Exempt claims to have made in recent years. To back up his claims, the hacker shared multiple documents and recordings with WIRED, including what he claimed were screenshots of email requests, fake subpoenas, responses from tech companies, and even a video recording of a phone call with one company’s law enforcement response team, which was seeking to verify a request. Exempt also shared evidence suggesting that a current law enforcement officer (Exempt refused to provide the officer’s location or name) was in contact with the group about allegedly working with them to submit requests from his own account in return for a cut of the profits.
“All I need is an IP address, which I can gain pretty easily, [and] next thing you know I have names, addresses, emails, and cell numbers,” says Exempt, adding that he can then use that information to make emergency data requests. “And with a subpoena and search warrant, I can access DMs, texts, call logs. That’s someone’s full life in my hands in the space of hours, depending on the response times of the company or provider.”
This type of doxing appears to be a lucrative business. Exempt claims his group brought in over $18,000 in the month of August alone. In one case, Exempt says he was paid $1,200 for a single dox of a person who was supposedly “grooming minors on an online gaming platform he owns. The individual was then allegedly promptly swatted.”
WIRED reviewed the information posted online about a 23-year-old from the southwestern US, which includes their home address, phone number, email addresses, and social media accounts. The person did not respond to WIRED’s request for comment. WIRED was unable to independently confirm if the person was swatted.
In the US, federal, state, and local law enforcement agencies who need to identify the owner of a social media account, or details about a specific phone, send the relevant company a subpoena or warrant requesting the information.
All major companies operating in the US have departments and specific staff assigned to dealing with these requests, which are typically sent via email. The companies, once they review the subpoena and see it has come from what looks like a law enforcement agency, typically comply with the requests, sometimes taking additional verification steps such as phoning the officer involved to confirm that they did indeed send the request.
But officers can also make emergency data requests, or EDRs, in cases involving a threat of imminent harm or death. These requests
typically bypass
any additional verification steps by the companies who are under pressure to fulfill the request as quickly as possible.
This is the loophole that hackers like Exempt, who says he is “a Gen Z male located within the Europe area,” can exploit.
The problem partly stems from the fact that there are around 18,000 individual law enforcement agencies in the US, all of which use their own email naming conventions and domain registrations, including .us, .net, .org, .gov, and .com.
The hackers typically use one of two ways to trick companies into making them believe the emails are coming from real law enforcement agencies. In some cases, they use authentic law enforcement email accounts that they have compromised via social engineering or using credentials stolen in previous hacks. Other times, they create convincing fake domains that closely mimic legitimate police departments.
“This was an email address that looked like the real thing,” says Exempt, explaining the mechanics of how he tricked Charter Communications. “The real domain of the Jacksonville Sheriff’s Office in Florida is jaxsheriff.org. We purchased jaxsheriff.us and then spoofed our number as the department’s, so that when we called them to verify receipt of the legal process, when they searched the number, it would come back to the sheriff’s office, giving them no reason to doubt it. We use real badge numbers and officer names as well.”
The hackers also craft highly convincing fake official documents by mimicking official records.
“We look at real subpoenas through public records where available and use the legally correct wording and sections of the law in the subpoena so that everything is legally correct and binding, so that we realistically have zero percent chance of them second-guessing it,” says Exempt. This has worked in multiple states and courts in the US, he claims.
“As an extra verification step, we sometimes check online to see if the named judge is actually in court that day, so that if a company was to phone up and verify, they would be in the building but most likely be too busy to be able to verify the singular document,” says Exempt.
In many cases, Exempt says, the email and attached subpoena is enough to extract the information. In one example shared with WIRED, Exempt claims that his group, which he says is made up of around nine people located across Europe and the US, was able to obtain the information used to register the official Rumble account belonging to British far-right activist Tommy Robinson.
Robinson and Rumble did not respond to requests for comment.
Even in cases where companies do take additional steps to verify the subpoenas are coming from real officers, the hackers are able to circumvent this.
In a recording of a phone call shared with WIRED, a representative from Amazon’s law enforcement response team called the number included in the faked email Exempt sent, and spoke with Exempt to verify that they had received the documents she had sent him via an online portal.
“Amazon identified and blocked someone that was requesting data from us while impersonating law enforcement,” says Adam Montgomery, an Amazon spokesperson. “The impersonator received basic account data for fewer than 10 customers. We quickly took steps to protect these customer accounts, and have put additional safeguards in place to prevent this from happening again.”
When asked for details of what those safeguards were, Amazon declined to comment.
While the hackers are clearly exploiting massive loopholes in the system, in some cases, the tech companies themselves have laid out step-by-step guides on how to craft these requests.
“In order to request that Apple voluntarily disclose information on an emergency basis, the requesting government or law enforcement officer should complete the Emergency Government & Law Enforcement Information Request form and transmit it directly from their official government or law enforcement email address to [a specific @apple.com email address] with the words “Emergency Request” in the subject line,” Apple
writes
.
Exempt shared with WIRED an example of a request he made to Apple using a fake subpoena as well as the information Apple sent back to him that included an iCloud account holder’s home address, cell phone number, and email addresses. Apple did not respond to a request for comment.
One
online database
maintained by SEARCH, a nonprofit criminal justice support organization, lists direct contact details for the law enforcement divisions of over 700 internet service providers and other online content providers.
“The core issue isn't companies being careless, it's that traditional communications channels, like email, weren't built for the level of identity verification, context evaluation, and real-time decisioning that modern investigations and legal compliance require,” says Matt Donahue, a former FBI agent who left the agency in 2020. Soon after, Donahue founded Kodex, a company that works with business clients to build secure online portals that law enforcement can use to make data requests.
While technologies like Kodex provide a much safer alternative to email, over 80 percent of the companies listed on the SEARCH database still accept emergency data requests via emails, according to one review
conducted by Kodex
.
But even those who only use Kodex are not in the clear. Exempt claims that he was able to make requests through Kodex for a period of time, using compromised law enforcement email accounts. However, because of Kodex’s enhanced safety features, including whitelisting specific devices from which requests can be made, Exempt and his group have now lost access to the system.
The hacker claims, however, that they are now working to regain access via another avenue.
“We are in talks with a deputy from a large sheriff’s office … who we got paid to dox [and] who is now interested in either renting his Kodex account to us or he may submit the requests for us on his side,” says Exempt. “This is in [the] very early stages of talks. He would want a percentage of the money we make and his dox removed on a well-known doxing site.”
To back up his claim, Exempt shared a screenshot of an alleged text exchange with the officer, including a blurred image that he refers to as his ID card. “Y’all have the SSN and the rest of the info you need about me and my fam,” the alleged officer wrote in a message. “I’m on the fence about it right now, but we will all get what we want out of this if we do a d[eal].”
When asked if he thought it was possible the officer was trying to entrap them, Exempt said probably not, “just for the fact he has been doxed, and within that dox, some pretty damning stuff about said officer came out, which he clearly wants removed. So I’m pretty certain he is being honest about the fact he is considering it.”
Donahue says Kodex’s system could flag such behavior because it is able to “pattern-match” the behavior of law enforcement agents and how they interact with companies that use the Kodex platform. “We can and do detect behavioral changes that allow us to protect our customers on a continuous basis as opposed to a one-time verification,” says Donahue.
While the hackers are taking advantage of the weakness in email security, they are also taking advantage of companies’ desire to help law enforcement save lives.
“Public/private-sector coordination is an incredibly complex and nuanced space that could very well be the difference between a kid being found in a trunk, or not,” says Donahue. “Lawful government data requests sit at the very unique intersection of data privacy, public safety, security, legal compliance, and civil rights, so anyone suggesting these requests are carelessly responded to in minutes has little to no understanding of the subject matter.”
David Gilbert
is a reporter at WIRED covering disinformation, online extremism, and how these two online trends impact people’s lives across the globe, with a special focus on the 2024 US presidential election. Prior to joining WIRED, he worked at VICE News. He lives in Ireland. ...
Apple has locked my Apple ID, and I have no recourse. A plea for help
Summary:
A major brick-and-mortar store sold an Apple Gift Card that Apple seemingly took offence to, and locked out my entire Apple ID, effectively bricking my devices and my iCloud Account, Apple Developer ID, and everything associated with it, and I have no recourse. Can you help? Email paris AT paris.id.au (and read on for the details). ❤️
Here’s how Apple “Permanently” locked my Apple ID.
I am writing this as a desperate measure. After nearly 30 years as a loyal customer,
authoring technical books on Apple’s own programming languages (Objective-C and Swift)
, and spending tens upon tens upon tens of thousands of dollars on devices, apps, conferences, and services, I have been locked out of my personal and professional digital life with no explanation and no recourse.
The Situation
My Apple ID, which I have held for around 25 years (it was originally a username, before they had to be email addresses; it’s from the iTools era), has been permanently disabled. This isn’t just an email address; it is my core digital identity. It holds terabytes of family photos, my entire message history, and is the key to syncing my work across the ecosystem.
The Trigger:
The only recent activity on my account was a recent attempt to redeem a $500 Apple Gift Card to pay for my 6TB iCloud+ storage plan. The code failed. The vendor suggested that the card number was likely compromised and agreed to reissue it. Shortly after, my account was locked.
An Apple Support representative suggested that this was the cause of the issue: indicating that something was likely untoward about this card.
The card was purchased from a major brick-and-mortar retailer (Australians, think Woolworths scale; Americans, think Walmart scale), so if I cannot rely on the provenance of that, and have no recourse, what am I meant to do? We have even sent the receipt, indicating the card’s serial number and purchase location to Apple.
The Consequence:
My account is flagged as “closed in accordance with the Apple Media Services Terms and Conditions”.
The Damage:
I effectively have over $30,000 worth of previously-active “bricked” hardware. My iPhone, iPad, Watch, and Macs cannot sync, update, or function properly. I have lost access to thousands of dollars in purchased software and media.
Apple representatives claim that only the “Media and Services” side of my account is blocked, but now my devices have signed me out of iMessage (and I can’t sign back in), and I can’t even sign out of the blocked iCloud account because… it’s barred from the sign-out API, as far as I can tell.
I can’t even login to the “Secure File Transfer” system Apple uses to exchange information, because it relies on an Apple ID. Most of the ways Apple has suggested seeking help from them involve signing in to an Apple service to upload something, or communicate with them. This doesn’t work as the account is locked.
I can’t even download my iCloud Photos, as:
There are repeated auth-errors on my account, so I can’t make Photos work;
I don’t have a 6TB device to sync them to, even if I could.
The Support Nightmare
I contacted Apple Support immediately (Case ID: 102774292094). The experience was terrifyingly dismissive:
No Information:
Support staff refused to tell me
why
the account was banned or provide specific details on the decision.
No Escalation:
When I begged for an escalation to Executive Customer Relations (ECR), noting that I would lose the ability to do my job and that my devices were useless, I was told that “an additional escalation won’t lead to a different outcome”.
Many of the reps I’ve spoken to have suggested strange things; one of the strangest was telling me that I could physically go to Apple’s Australian HQ at Level 3, 20 Martin Place, Sydney, and plead my case. They even put me on hold for 5 minutes while they looked up the address.
The “New Account” Trap
Most insultingly, the official advice from the Senior Advisor was to “create a new Apple account… and update the payment information”.
This advice is technically disastrous:
The Legal Catch:
Apple’s Terms and Conditions rely on “Termination of Access.” By closing my account, they have revoked my license to use their services.
The Technical Trap:
If I follow their advice and create a new account on my current devices (which are likely hardware-flagged due to the gift card error), the new account will likely be linked to the banned one and disabled for circumventing security measures.
The Developer Risk:
As a professional Apple Developer, attempting to “dodge” a ban by creating a new ID could lead to my Developer Program membership being permanently blacklisted, amongst other things.
I am asking for a human at Apple to review this case. I suspect an automated fraud flag regarding the bad gift card triggered a nuclear response that frontline support cannot override. I have escalated this through my many friends in WWDR and SRE at Apple, with no success.
I am desperate to resolve this and restore my digital life. If you can help, please email paris AT paris.id.au
To
encrypt email in 1998
you’d run GnuPG from a terminal, importing the recipient’s public key into your local keyring, then copying your email text into a file, then encrypting the file for that public key:
gpg -e -r alice file
. Finally you’d copy the encrypted message into your email client and send it out.
In
2025
, it’s pretty much the same. In some respects, it’s worse:
It feels like fewer people care about email encryption today than they did in 2010.
Web-based email has become dominant, and that shift works against PGP usage. Desktop clients at least offered
some
support (native in Thunderbird, third-party extensions for Outlook and Apple Mail.) Most webmail services, by contrast, offer no native PGP support at all.
Proton
is a notable exception.
“But there’s S/MIME!”
S/MIME (
RFC 2311
) was standardized around the same time as OpenPGP (
RFC 2440
), in 1998. PGP’s trust model is the “web of trust”, though often TOFU in practice, while S/MIME’s model is the more organization-friendly hierarchical PKI model.
As a result, S/MIME is more common than PGP in enterprise email. It’s also better supported by email clients. Even
Gmail
for organizations
supports S/MIME
. You need a basic PKI and to generate key pairs, and then to distribute them manually:
What about
Microsoft/Azure
, the dominant enterprise stack? You’d expect managed endpoints to support key generation and distribution across an organization—centrally administered, cross-platform. In practice, Microsoft makes this harder than it should be. The process remains largely manual, poorly documented, and needlessly tedious.
Why does nobody seem to care?
Auditors obsess over
encryption at rest
—from laptop FDE to databases’ security theaterish at-rest encryption—and over
encryption in transit
, usually meaning TLS. But they seldom bring up email encryption and send confidential email text and attachments like there’s no tomorrow.
The reality is blunt: most email traffic doesn’t
enforce
encryption, as
MTA-STS
adoption remains very low. Opportunistic encryption (
STARTTLS
) is more common, but obviously vulnerable to downgrade attacks.
There are even fewer incentives to fix this today, now that we have session-based messaging systems—mostly
Signal
, but also Olvid, Threema, and WhatsApp. Their statefulness enables protocols that, unlike PGP or S/MIME, protect against replays and provide forward secrecy (and the less critical “post-compromise security”).
Another factor is simple displacement. We use email far less than we did in 2005. Most internal written communication now happens over Slack or Teams or similar platforms. These systems are not encrypted save for the client-to-server link, with the server often running in third-party infrastructure.
So expect less and less PGP and S/MIME and, if we’re lucky, a bit more MTA-STS.
If you’re familiar with nearly any mainstream programming language,
and I asked you to draw a diagram of an array, the array indices, and
the array elements, odds are good you’d produce a diagram something
like this:
In this post, I want to persuade you to replace that image, or, at
least, to augment it with an alternate view on the world.
I want to argue that, rather than numbering elements of an array, it
makes just as much sense, and in many cases more, to number the spaces
between
elements:
With this representation, we do have to relearn how to refer to
indexing: We refer to
A[i]
no longer as “The element at index
i
”,
but rather “The element to the right of index
i
”.
I’ll run through a few reasons why I prefer this representation, but
most of them boil down to representing
ranges of elements
.
Suppose we have an array, and we want a way to refer to a certain
subset of it, like so:
One obvious answer is with a start and a length:
start=2 length=3
,
but it’s often convenient to represent a range as a pair of
(start, end)
indices. The latter representation, for example, lets you check
if an index falls into the range directly.
If we number elements, it’s not immediately apparent which index to
use for
end
:
Both
(1, 3)
and
(1, 4)
seem initially defensible. But if we number
between
elements, there’s a clear, unique answer:
The indices we want are the ones that lie between the included and
excluded elements:
(1, 4)
.
With this model, the rules of range manipulation and comparison become
straightforward:
Two ranges are adjacent if
left.end == right.start
One range is a subset of another if
inner.start >= outer.start && inner.end <= outer.end
A range contains
end - start
elements:
In order to answer the question “if I dereference an index, is the
result contained in the range?”, we need to remember that
A[i]
is
now defined as the element
after
index
i
. With that in mind, it
becomes easy to see that for a range
(start, end)
, elements indexed
by
start <= i < end
are within the range.
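To make those rules concrete, here is a small Python sketch; the Range class and its method names are my own illustration, not anything from the post:

from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    """Half-open range over gap indices: covers elements start <= i < end."""
    start: int
    end: int

    def __len__(self) -> int:
        # A range contains end - start elements.
        return self.end - self.start

    def contains_index(self, i: int) -> bool:
        # A[i] (the element *after* gap i) is inside iff start <= i < end.
        return self.start <= i < self.end

    def is_adjacent_to(self, right: "Range") -> bool:
        # Two ranges are adjacent when one ends exactly where the other starts.
        return self.end == right.start

    def is_subset_of(self, outer: "Range") -> bool:
        return self.start >= outer.start and self.end <= outer.end

a = Range(1, 4)          # the elements A[1], A[2], A[3]
assert len(a) == 3
assert a.contains_index(3) and not a.contains_index(4)
assert Range(0, 1).is_adjacent_to(a)
assert Range(2, 4).is_subset_of(a)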
Indexing between elements, instead of indexing elements, helps avoid a
large class of off-by-one errors. I’ll run through a number of
examples using Python, but the fundamental issues apply to array APIs
in many more languages, or to any time you’re manipulating an
array-like data structure via any of these operations.
If you want to insert an element into an array, how do you specify the
location? If you name an existing element, does the new element go
before or after that element?
Python’s
standard library documentation
somewhat awkwardly
specifies that “The first argument is the index of the element before
which to insert,” clarifying that this means
insert(0, X)
inserts
X
at the start of the array.
But if we number gaps between elements, instead of numbering elements,
the story is perfectly clear:
0
names the gap before the first
element, and so of course inserting at
0
should prepend an
element. Similarly,
1
names the gap between the first and second
element, and all the way on.
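A quick check of that reading with plain Python lists (nothing here beyond the standard list.insert):

a = [10, 20, 30]

a.insert(0, 5)        # gap 0: before the first element -> prepend
assert a == [5, 10, 20, 30]

a.insert(len(a), 40)  # gap len(a): after the last element -> append
assert a == [5, 10, 20, 30, 40]

a.insert(2, 15)       # gap 2: between the second and third elements
assert a == [5, 10, 15, 20, 30, 40]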
How do we refer to a partial subset of an array that we want to
extract? Python, like many other languages, lets you use a pair of
indexes:
>>> [1,2,3,4][1:3]
[2, 3]
The
documentation
, however, has to resolve the same
ambiguity noted above: Is the final index excluded or included? Ruby
even helpfully offers you both choices:
As discussed earlier, if we adjust our view of indexes, there is no
ambiguity at all. Conveniently, this also gives us the same semantics
as Python and most other languages: there are
good reasons
half-inclusive ranges are generally preferable, and most languages
converge on this choice.
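A few throwaway assertions showing the identities you get for free from half-open, gap-numbered slices (my examples, but the same stdlib semantics the post describes):

a = [1, 2, 3, 4]

# A slice takes everything between gap 1 and gap 3.
assert a[1:3] == [2, 3]

# A slice (i, j) holds exactly j - i elements.
assert len(a[1:3]) == 3 - 1

# Cutting at any single gap i splits the list cleanly in two.
for i in range(len(a) + 1):
    assert a[:i] + a[i:] == a

# Adjacent slices share a gap and concatenate without overlap.
assert a[0:2] + a[2:4] == a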
If we want to remove a single element from an array, it does seem
simpler to index elements directly – we can just name directly the
index which we want to eliminate.
However, if we want to adopt the more general primitive, of removing
slices, (Python’s
del array[x:y]
), we run into the same problem as
extracting slices, previously. Once again, shifting our thinking to
index between elements removes all ambiguity.
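The same holds for deletion; a short illustrative sketch using Python’s del on slices:

a = [1, 2, 3, 4, 5]

# Remove the elements between gap 1 and gap 3.
del a[1:3]
assert a == [1, 4, 5]

# Deleting the empty range between a gap and itself is a no-op.
del a[2:2]
assert a == [1, 4, 5]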
Suppose we’re walking through an array, consuming it by elements or
groups of elements at a time. Perhaps we’re parsing a string,
consuming tokens as we go.
How do we keep track of our current position? Should we keep the index
of the last element we’ve processed, or of the first element we have
yet to process?
If we shift our perspective, this problem too vanishes: We can store
the index between the last item consumed, and the next one to be
consumed. Our index neatly partitions the buffer into “processed” and
“to-be-processed”, with no ambiguity at all.
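Here is a sketch of that kind of cursor, a deliberately tiny whitespace tokenizer of my own devising; pos is always a gap index, so “processed” and “to-be-processed” never overlap and never leave anything unaccounted for:

def tokens(text: str):
    """Yield whitespace-separated tokens, tracking a single gap index `pos`.

    Everything before `pos` has been consumed; everything after it has not.
    There is no ambiguous 'last processed vs. next unprocessed' element.
    """
    pos = 0
    while pos < len(text):
        # Skip separator characters we've decided to consume.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        start = pos
        while pos < len(text) and not text[pos].isspace():
            pos += 1
        if start < pos:
            yield text[start:pos]   # the half-open (start, pos) range is the token

assert list(tokens("  numbering  the gaps ")) == ["numbering", "the", "gaps"]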
With pointers in C, or with iterators in C++ (which were essentially
designed to mimic C’s pointer semantics), we speak of pointers into an
array or of iterators as referring to a specific element in memory.
However, both systems allow for this additional “valid” iterator or
pointer, which points “just past the end” of a container. This
pointer/iterator does not name a valid element, but is a valid pointer
or iterator. The C specification is full of awkward verbiage to
address this special-case:
both [pointers] shall point to elements of the same array object, or
one past the last element of the array object;
(N1256 §6.5.6p9). And with a C++
std::vector
,
v.begin()
and
v.end()
are both valid iterators, but
v.end()
points “one past the
end” and cannot be dereferenced.
These apparent odd inconsistencies and special cases vanish if you
shift your thinking slightly in just the way I’ve been arguing:
Instead of thinking of iterators as referring to individual elements,
we hold that they name the interstitial points between elements.
If we do so, the “one-past-the-end” iterator is no longer “past” the
end – it points directly at the end, which is no more fundamentally
special than the “start” iterator which points directly at the
beginning.
It’s still the case that we cannot
dereference
v.end()
, but that
behavior is a function of the “dereference” operation, which selects
the element
after
an iterator. The iterators themselves are no
longer special cases.
It used to be popular, and still is in some circles, to debate whether
programming languages ought to start array indexing at
0
or
1
. And
there are still a few holdouts, like Matlab, which number their arrays
starting from 1, causing no end of pain and confusion to those poor
souls who have to switch between them and more mainstream languages.
Once I started thinking of pointers or iterators or indexes as
indexing between elements, one of the more mind-bending realizations
that followed was that this model can harmonize the “index from 0” and
“index from 1” camps!
Let’s consider an array with interstices labeled again:
The first element,
A[0]
in C or Python, is bracketed by indexes
0
and
1
. The decision, then, to name it as 1 does not involve changing the picture at all; if you draw indices between elements, the statement “Arrays start at 1” is simply a decision that “The dereference operator refers to the element to the left of an index,” in exactly the same way that I described dereference in a 0-indexed language as taking the element to the right. And presented that way – “should
dereference take the left or the right element?” – it becomes clear
that
0
or
1
really is an arbitrary choice.
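As a toy illustration of that framing (the class and method names are mine), the gap labels stay fixed and the two camps simply disagree about which neighbor dereference picks:

class GapIndexed:
    """An array addressed by the gaps between elements, not the elements."""

    def __init__(self, items):
        self.items = list(items)   # gaps run from 0 to len(items), inclusive

    def right_of(self, gap: int):
        # "0-indexed" dereference: A[i] is the element to the right of gap i.
        return self.items[gap]

    def left_of(self, gap: int):
        # "1-indexed" dereference: A[i] is the element to the left of gap i.
        return self.items[gap - 1]

a = GapIndexed(["a", "b", "c"])
assert a.right_of(0) == "a"   # C/Python-style: A[0] is the first element
assert a.left_of(1) == "a"    # Matlab-style: A[1] is the same first element
assert a.right_of(2) == a.left_of(3) == "c"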
I credit my personal shift in thinking – from labeling elements to
labeling interstices – to my reading of the excellent
The Craft Of Text Editing
, which introduces the concept both for
its notion of a
mark
, and specifically as an implementation idea
while talking about
buffer representations
.
I recommend giving the book a read even if you never aspire personally
to implement an Emacs; It’s filled with a great number of interesting
ideas, and possessed of a profound clarity of thought throughout.
Show HN: Claude Code recipes for knowledge workers
This collection contains 100 practical recipes for using Claude Code to automate, accelerate, and enhance your professional work. Each recipe provides step-by-step instructions, ready-to-use prompts, and real-world examples that you can apply immediately.
Whether you're drafting emails, analyzing data, preparing presentations, or managing complex projects, there's a recipe here that will save you hours of work.
Want all 200 recipes as ready-to-use slash commands?
The
Premium Collection
includes 200 slash commands you can install in seconds. Just type
/recipe-001
and Claude does the rest. No copying prompts. No setup.
$79.99 one-time purchase.
Version 1.0 — Built for Claude Code — December 2025
Quoting OpenAI Codex CLI
Simon Willison
simonwillison.net
2025-12-13 03:47:43
How to use a skill (progressive disclosure):
After deciding to use a skill, open its
SKILL.md
. Read only enough to follow the workflow.
If
SKILL.md
points to extra folders such as
references/
, load only the specific files needed for the request; don't bulk-load everything.
If
scripts/
exist, prefer running or patching them instead of retyping large code blocks.
If
assets/
or templates exist, reuse them instead of recreating from scratch.
Description as trigger: The YAML
description
in
SKILL.md
is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.
Google has removed dozens of new Sci-Hub domain names from its search results in the United States. Unlike typical DMCA takedowns, the removals were triggered by a dated court order that was not enforced for several years. This appears to be one of the first times Google has deindexed an entire pirate site in the U.S. based on a 'site blocking' style injunction.
In 2017, American Chemical Society (ACS), a leading source of academic publications in the field of chemistry, won a lawsuit against Sci-Hub and its operator, Alexandra Elbakyan.
The ‘Pirate Bay of Science’ had failed to appear at a Virginia federal court, resulting in an easy win for the publisher and a
$4.8 million default judgment
award for damages.
A Broad Anti-Piracy Injunction (2018)
More important, perhaps, was the broad permanent injunction that the Virginia federal court
signed off on in 2017
. This order effectively gave ACS free rein to take down
existing and newly registered
Sci-Hub domain names.
The injunction also required all parties “in active concert or participation” with Sci-Hub to “cease facilitating access” to these domain names, including search engines, hosting providers, ISPs, and domain name registrars, the order clarified.
From the 2018 injunction
On paper, this injunction enabled ACS to request American ISPs and search engines to ‘block’ existing and future Sci-Hub domains. However, there was no sign that the publisher was doing so. Aside from a few suspended domains, Sci-Hub remained widely accessible.
Whether ACS simply felt no need to enforce the order against search engines and other intermediaries, or whether those companies actively objected to the requested actions, was unknown. And as time passed, the injunction became a distant memory, at least for a few years.
Google Complies with Zombie Injunction? (2025)
Earlier this week we spotted a unique request in the
Lumen Database
, where the 2018 injunction was cited. The
notice in question
asks Google to deindex 34 (sub)domains linked to Sci-Hub.
None of these domains were referenced in the 2018 injunction but are indeed linked to Sci-Hub. Many of the partially redacted domains appear to be domain variations of the scihubtw.tw mirror network, such as edu.scihubtw.tw and freeus.scihubtw.tw.
Court order notice
It’s surprising to see this type of enforcement seven years after the injunction was issued, but the request is legitimate. Google is certainly taking it seriously and has deindexed these domains from its search results in America. In other countries, the same domains remain accessible.
First “US-Only” Sci-Hub Removals
The December 2 notice was sent by
UK law firm Wiggin LLP
, which sent a similar request in September this year, targeting a few dozen other Sci-Hub domains. In total, we spotted seven notices, with the earliest dating back to 2022.
The results of these removals are also clearly visible in Google search. Those who search for Sci-Hub in the U.S. will see the following notice at the bottom of the results.
Removed by legal request
It’s not clear why it took five years before ACS urged Google to take action in response to the injunction. However, these removals are similar to Google’s
removal of pirate site domains
in other countries in response to ISP-blocking orders. Voluntary cooperation by Google was
uncovered
shortly before ACS first notified the search engine.
“In Active Concert”?
Google’s voluntary cooperation with ISP blocking orders in Australia, the Netherlands, France, the UK, and elsewhere also brings up an important question. Is Google cooperating with the permanent injunction in the U.S. because it feels legally compelled to do so, or is that a voluntary gesture too?
The 2018 injunction requires all parties “in active concert or participation” with Sci-Hub to take action. While search engines are mentioned as an example, Google and other tech companies have
previously argued
that neutral third-party services are not necessarily “in active concert or participation”.
It is likely that Google maintains this stance, opting to voluntarily comply with orders targeting other third parties. That would mirror its response to site-blocking orders elsewhere.
We contacted Google hoping to hear answers to these questions, but the company did not respond to our request for comment.
The most complete YM2149/AY-3-8910 ecosystem in Rust.
What is the YM2149?
The
Yamaha YM2149
(and its compatible sibling, the General Instrument
AY-3-8910
) is a
Programmable Sound Generator (PSG)
— a dedicated audio chip that defined the sound of an entire computing era.
Three square-wave channels. One noise generator. Hardware envelopes. Pure 8-bit/16-bit retro soul.
If you've ever heard music from an
Atari ST
,
Amstrad CPC
,
ZX Spectrum 128
,
MSX
, or countless arcade machines from the 1980s/90s, you've heard this chip. It powered everything from game soundtracks to the legendary European demoscene, where programmers pushed (and still push) these simple waveforms to create surprisingly complex and powerful music.
The YM2149 doesn't do wavetables or samples (mostly). It doesn't do FM synthesis. What it does is generate raw, characterful square waves with programmable frequencies, a shared noise source, and distinctive hardware envelopes — all mixed through a logarithmic DAC that gives it that unmistakable warm, buzzy,
chiptune
sound.
This crate brings that sound to Rust
— cycle-accurate, format-complete, and ready for your emulator, game, or nostalgia project.
Why YM2149-RS?
For Demoscene Enthusiasts & Chiptune Artists:
Play back your entire collection of YM, SNDH, AY, and Arkos Tracker files with authentic sound reproduction — in the terminal, browser, or your next retro-inspired game.
For Game Developers:
Drop authentic PSG audio into Bevy games with a single plugin. Playlists, crossfades, visualizations, and audio-reactive gameplay hooks included.
For Emulator Authors:
A clean, well-tested YM2149 core with configurable backends. Integrate the chip into your Atari ST, CPC, or custom system emulator.
For the Curious:
Explore how classic sound chips work. The codebase is documented, tested, and designed to be readable.
What Makes This Special
Cycle-Accurate Core – Precise emulation of all PSG features — envelopes, noise, mixer, SID voice, Sync Buzzer, and digi-drum effects
Multi-PSG Emulation – Run multiple YM2149 chips in parallel — natively supported via Arkos Tracker format for authentic dual/triple-chip music
Seven Format Replayers – YM (1-6), YMT1/YMT2, GIST (.snd), Arkos Tracker (.aks), ZXAY/EMUL (.ay), and SNDH with full 68000 CPU emulation
Zero-Compromise Bevy Integration – Not a wrapper around C code — pure Rust from chip to speaker
Runs Everywhere – CLI, native apps, WASM browser player, Bevy games — same codebase
Cycle-accurate Yamaha YM2149 tooling for Rust — from raw PSG emulation and YM/YMT/SNDH importers to Arkos Tracker playback, CLI/export pipelines, Bevy integrations, visualization stacks, and a one-click WASM demo.
Arkos Tracker is the de-facto “modern” workflow for YM2149/AY musicians: it blends a classic step-sequencer with a visual instrument designer, supports multiple PSGs per song, and lets composers mix hardware envelopes with software macros. Native support matters because:
Multi-PSG music
– Arkos sequences can target two or more AY chips; our replayer handles that natively, both in the CLI and Bevy.
Modern authoring tools
– Musicians can stay in the Arkos editor (PC/Mac) and drop the
.aks
export straight into any crate in this repo—no external tracker runtime or C++ bridge required.
Feature parity
– Hardware effects (Sync Buzzer, DigiDrum, SID), custom arps, and per-channel envelopes all map to the same PSG core shared with YM/AY playback.
Cross-target builds
– The same Rust replayer powers desktop, web (WASM), and Bevy integrations, so Arkos rips behave identically everywhere.
In short: Arkos lets artists work with modern ergonomics, and this workspace lets those songs run anywhere Rust does.
Quick Start
Use the Core Library
[dependencies]
# Core emulator only (minimal dependencies)
ym2149 = "0.7"

# With streaming audio output
ym2149 = { version = "0.7", features = ["streaming"] }

# YM file parsing and playback
ym2149-ym-replayer = "0.7"
use ym2149_ym_replayer::{load_song, ChiptunePlayer, ChiptunePlayerBase, PlaybackMetadata};

fn main() -> anyhow::Result<()> {
    let data = std::fs::read("song.ym")?;
    let (mut player, summary) = load_song(&data)?;

    // Use the unified ChiptunePlayerBase interface for playback
    player.play();
    let samples = player.generate_samples(summary.samples_per_frame as usize);

    // Access metadata via ChiptunePlayer trait (extends ChiptunePlayerBase)
    let meta = player.metadata();
    println!("{} by {} • {} frames", meta.title(), meta.author(), summary.frame_count);
    Ok(())
}
Run the CLI Player
# Real-time playback with scope overlay
cargo run -p ym2149-replayer-cli -- examples/ym/ND-Toxygene.ym
# Play SNDH files from the Atari ST demoscene
cargo run -p ym2149-replayer-cli -- examples/sndh/Mad_Max/Buzzer.sndh
# Play GIST sound effects (.snd)
cargo run -p ym2149-gist-replayer --example player -- examples/gist/alien.snd
# Interactive demo with Bevy visualization
cargo run -p bevy_ym2149_examples --example basic_example
Export to Audio Files
use ym2149_ym_replayer::{load_song, export::export_to_wav_default, export::ExportConfig};

fn main() -> anyhow::Result<()> {
    let data = std::fs::read("song.ym")?;
    let (mut player, info) = load_song(&data)?;

    // Export to WAV (feature: export-wav)
    export_to_wav_default(&mut player, info, "output.wav")?;
    Ok(())
}
Note: MP3 export was removed because the system-dependent LAME/Autotools toolchain proved too brittle. Export WAV instead and transcode externally (e.g.
ffmpeg -i output.wav -b:a 192k output.mp3
).
Add the Bevy Plugin
use bevy::prelude::*;
use bevy_ym2149::{Ym2149Playback, Ym2149Plugin};
use bevy_ym2149_viz::Ym2149VizPlugin;

fn main() {
    App::new()
        .add_plugins((DefaultPlugins, Ym2149Plugin::default(), Ym2149VizPlugin::default()))
        .add_systems(Startup, |mut commands: Commands| {
            commands.spawn(Camera2d);
            commands
                .spawn(Ym2149Playback::new("assets/music/song.ym"))
                .insert(Name::new("Tracker"));
        })
        .run();
}
examples/ – curated list of .ym, .aks, .ay, and .sndh files for regression tests and the wasm demo
Need to refresh the wasm demo bundle? Run scripts/build-wasm-examples.sh from the repo root to rebuild via wasm-pack and copy the output into crates/ym2149-wasm/examples/pkg/.
Testing
# Entire workspace
cargo test --workspace
# Focus a crate
cargo test -p ym2149
cargo test -p bevy_ym2149
# Feature-specific tests
cargo test -p ym2149 --features streaming
Development Prerequisites
Rust 1.83+ (Rust 2024 edition) with cargo and rustfmt
Audio backend libraries for CPAL/Rodio (ALSA/PulseAudio, CoreAudio, WASAPI, etc.) when testing real-time playback
AY playback: ZX-only, firmware calls are unsupported (CPC/ROM-heavy AY files will be rejected)
Leonard/Oxygene (Arnaud Carré)
– YM format specification, ST-Sound reference material, and the
AtariAudio
C++ implementation that forms the basis of our YM2149 core emulation
Atari ST + demoscene community
– for the original tunes, SNDH archive, and documentation
Rust audio and Bevy ecosystems
– rodio/cpal, Bevy ECS, and community inspiration
When Oliver Sacks arrived in New York City, in September, 1965, he wore a butter-colored suit that reminded him of the sun. He had just spent a romantic week in Europe travelling with a man named Jenö Vincze, and he found himself walking too fast, fizzing with happiness. “My blood is champagne,” he wrote. He kept a letter Vincze had written him in his pocket all day, feeling as if its pages were glowing. Sacks had moved to New York to work as a fellow in neuropathology at the Albert Einstein College of Medicine, in the Bronx, and a colleague observed that he was “walking on air.” Every morning, he carefully polished his shoes and shaved. He adored his bosses. “I smile like a lighthouse in all directions,” he wrote Vincze.
Sacks was thirty-two, and he told Vincze that this was his first romantic relationship that was both physical and reciprocal. He felt he was part of a “two man universe,” seeing the world for the first time—“seeing it clear, and seeing it whole.” He wandered along the shipping piers on the Hudson River, where gay men cruised, with a notebook that he treated as a diary and as an endless letter to Vincze. “To watch life with the eyes of a homosexual is the greatest thing in the world,” Vincze had once told Sacks.
Sacks’s mother, a surgeon in London, had suspected that her son was gay when he was a teen-ager. She declared that homosexuality was an “abomination,” using the phrase “filth of the bowel” and telling him that she wished he’d never been born. They didn’t speak of the subject again. Sacks had moved to America—first to California and then, after five years, to New York—because, he wrote in his journal, “I wanted a
sexual and moral freedom
I felt I could never have in England.” That fall, during Yom Kippur, he decided that, rather than going to synagogue to confess “to the total range of human sin,” a ritual he’d grown up with, he’d spend the night at a bar, enjoying a couple of beers. “What I suppose I am saying, Jenö, is that I now feel differently about myself, and therefore about homosexuality as a whole,” he wrote. “I am through with cringing, and apologies, and pious wishes that I might have been ‘normal.’ ” (The Oliver Sacks Foundation shared with me his correspondence and other records, as well as four decades’ worth of journals—many of which had not been read since he wrote them.)
In early October, Sacks sent two letters to Vincze, but a week passed without a reply. Sacks asked his colleagues to search their mailboxes, in case the letter had been put in the wrong slot. Within a few days, however, he had given up on innocent explanations. He began dressing sloppily. He stopped coming to work on time. He had sex with a series of men who disgusted him.
After two weeks, Vincze, who was living in Berlin, sent a letter apologizing for his delayed reply and reiterating his love. He explained that he was so preoccupied by thoughts of Sacks that he felt as if he were living in a “Klaudur,” a German word that Vincze defined as a “spiritual cell.” He seems to have misspelled
Klausur
, which refers to an enclosed area in a monastery, but Sacks kept using the misspelled word, becoming obsessed with it. “It ramifies in horrible associations,” he wrote Vincze. “The closing of a door. Klaudur, claustrophobia, the sense of being shut in.” Sacks had long felt as if he were living in a cell, incapable of human contact, and this word appeared to be all he needed to confirm that the condition was terminal. The meaning of the word began morphing from “spiritual cell” to “psychotic cage.”
“He just got back from his poker game.”
Cartoon by Liana Finck
The intimacy Sacks had rejoiced in now seemed phony, a “folie à deux”—a two-person delusion. His doubts intensified for a month, then he cut off the relationship. “I must tear you out of my system,
because I dare not be involved
,” he told Vincze, explaining that he barely remembered how he looked, or the sound of his voice. “I hope I will not be taken in like this again, and that—conversely—I will have the strength and clarity of mind to perceive any future such relationships as morbid at their inception, and to abort the folly of their further growth.”
Two months later, Sacks felt himself “slipping down the greased path of withdrawal, discontent, inability to make friends, inability to have sex, etc. etc. towards suicide in a New York apartment at the age of 32.” He took enormous amounts of amphetamines, to the point of hallucinating. A family friend, a psychiatrist who worked with Anna Freud, urged him to find a psychoanalyst. She wrote him that his homosexuality was “a very ‘secondary phenomenon’ ”: he was attracted to men as “a substitute for veering uncertainties of what/whom you could love other than as ‘idealizations’ of yourself.” A few weeks later, he started therapy with Leonard Shengold, a young psychiatrist who was deeply immersed in Manhattan’s psychoanalytic culture. “I think he is very good, and he has at least a very considerable local reputation,” Sacks wrote his parents, who helped to pay for the sessions, three times a week.
Sacks had elevated yet hazy ambitions at the time: he wanted to be a novelist, but he also wanted to become the “Galileo of the inward,” he told a mentor, and to write the neurological equivalent of Sigmund Freud’s “
Interpretation of Dreams
.” He worked in wards with chronically ill and elderly patients who had been warehoused and neglected, and his prospects within academic medicine looked dim. “Have you published anything lately?” his father wrote him, in 1968. “Or have you found yourself temperamentally incapacitated from doing so?”
When Sacks began therapy, “my initial and ultimate complaint was of
fixity
—a feeling of
not-going
,” he wrote in his journal. He regarded Shengold as “a sort of analytic machine.” But gradually Sacks came to feel that “I love him, and need him; that I need him—and
love
him.” He had planned to stay in New York City only for a few years, but he kept delaying his return to England so that he could reach “a terminable point in my analysis.” Shengold, who would eventually publish ten books about psychoanalysis, wrote that therapy requires a “long period of working through”—a term he defined as the “need to repeat emotional conflicts over and over in life” until the patient has the “freedom to own what is there to be felt.”
Sacks saw Shengold for half a century. In that time, Sacks became one of the world’s most prominent neurologists and a kind of founding father of medical humanities—a discipline that coalesced in the seventies, linking healing with storytelling. But the freedom that Shengold’s analysis promised was elusive. After Vincze, Sacks did not have another relationship for forty-four years. He seemed to be doing the “working through” at a remove—again and again, his psychic conflicts were displaced onto the lives of his patients. He gave them “some of
my own powers
, and some of
my
phantasies too,” he wrote in his journal. “I write out symbolic versions of myself.”
During Sacks’s neurology internship, in San Francisco, his childhood friend Eric Korn warned him that the residents at his hospital could sense he was gay. “For God’s sake, exercise what seems to you immoderate caution,” Korn wrote, in 1961. “Compartmentalize your life. Cover your tracks. Don’t bring in the wrong sort of guests to the hospital, or sign your name and address to the wrong sort of register.” He encouraged Sacks to read “Homosexuality: Disease or Way of Life?,” a best-selling book by Edmund Bergler, who argued that homosexuality was an “illness as painful, as unpleasant and as disabling as any other serious affliction,” but one that psychoanalysis could cure. “The book is full of interest,” Korn wrote. “He claims a potential 100% ‘cures’ (a term he chooses to employ because he knows it teases) which is worth investigating perhaps.”
Freud characterized homosexuality as a relatively normal variant of human behavior, but when psychoanalysis came to the United States, in the postwar years, homophobia took on new life. The historian Dagmar Herzog has described how, in the U.S., “reinventing psychoanalysis and reinventing homophobia went hand in hand.” Faced with men who persisted in their love for other men, American analysts commonly proposed celibacy as a stopgap solution. In the historian Martin Duberman’s memoir “
Cures
,” he writes that his psychoanalyst instructed him to “take the veil”—live celibately—so that he could be cured of his desire for men. Duberman agreed to these terms. The best he could get, he thought, was sublimation: instead of enjoying an “affective life,” he would make “some contribution to the general culture from which I was effectively barred.” Sacks, who was closeted until he was eighty, also followed this course.
Shengold had portraits of Charles Dickens, William Shakespeare, and Sigmund Freud in his office, on the Upper East Side. Like Sacks, he came from a literary Jewish family. He seemed deeply attuned to Sacks’s creative life, which took the form of ecstatic surges of literary inspiration followed by months of sterility and depression. “Do your best to enjoy and to work—it is the power of your mind that is
crucial
,” Shengold wrote when Sacks was on a visit with his family in England. Sacks wrote in his journal that he’d dreamed he overheard Shengold telling someone, “Oliver is lacking in proper self-respect; he has never really appreciated himself, or appreciated others’ appreciation of him. And yet, in his way, he is not less gifted than Auden was.” Sacks woke up flushed with embarrassment and pleasure.
Sacks in 1987. He became the modern master of the case study. “I write out symbolic versions of myself,” he wrote.
Photograph by Lowell Handler
Unlike many of his contemporaries, Shengold was not a doctrinaire thinker, but he was still susceptible to psychoanalytic fashions. Reflecting on how he might have viewed living openly as a gay man at that time, Shengold’s daughter, Nina, told me, “I don’t know that was a door that Dad necessarily had wide open.” In several books and papers, Shengold, a prolific reader of Western literature, tried to understand the process by which troubled people sublimate their conflicts into art. In his 1988 book, “
Halo in the Sky: Observations on Anality and Defense
,” Shengold wrote about the importance of transforming “anal-sadistic drives”—he used the anus as a metaphor for primitive, dangerous impulses—into “adaptive and creative ‘making.’ ” When Sacks read the book, he wrote in his journal that it “made me feel I was ‘lost in anality’ (whatever this means).”
Before Vincze, Sacks had been in love with a man named Mel Erpelding, who once told him, Sacks wrote, that he “oozed sexuality, that it poured out through every pore, that I was alive and vibrant with sexuality (a positive-admiring way of putting things), but also that I was reeking and toxic with it.” (Erpelding, who ended up marrying a woman, never allowed his relationship with Sacks to become sexual.) In his early years of therapy, in the late sixties, Sacks resolved that he would give up both drugs and sex. It’s doubtful that Shengold encouraged his celibacy, but he may have accepted that sexual abstinence could be productive, at least for a time. Richard Isay, the first openly gay member of the American Psychoanalytic Association, said that, in the seventies, he’d “rationalized that maturity and mental health demanded the sublimation of sexual excitement in work.” Sacks told a friend, “Shengold is fond of quoting Flaubert’s words ‘the mind has its erections too.’ ”
For Sacks, writing seemed almost physiological, like sweating—an involuntary response to stimuli. He routinely filled a whole journal in two days. “Should I then
put down my pen
, my interminable Journal (for this is but a fragment of the journal I have kept all my life),” he asked, “and ‘start living’ instead?” The answer was almost always no. Sometimes Sacks, who would eventually publish sixteen books, wrote continuously in his journal for six hours. Even when he was driving his car, he was still writing—he set up a tape recorder so that he could keep developing his thoughts, which were regularly interrupted by traffic or a wrong turn. Driving through Manhattan one day in 1975, he reflected on the fact that his closets, stuffed with pages of writing, resembled a “grave bursting open.”
By the late sixties, Sacks had become, he wrote, “almost a monk in my asceticism and devotion to work.” He estimated that he produced a million and a half words a year. When he woke up in the middle of the night with an erection, he would cool his penis by putting it in orange jello. He told Erpelding, “I partly accept myself as a celibate and a cripple, but partly—and this is . . . the wonder of sublimation—am able to
transform
my erotic feelings into other sorts of love—love for my patients, my work, art, thought.” He explained, “I keep my distance from people, am always courteous, never close. For me (as perhaps for you) there is almost no room, no moral room.”
“I have some hard ‘confessing’ to do—if not in public, at least to Shengold—and myself,” Sacks wrote in his journal, in 1985. By then, he had published four books—“
Migraine
,” “
Awakenings
,” “
A Leg to Stand On
,” and “
The Man Who Mistook His Wife for a Hat
”—establishing his reputation as “our modern master of the case study,” as the
Times
put it. He rejected what he called “pallid, abstract knowing,” and pushed medicine to engage more deeply with patients’ interiority and how it interacted with their diseases. Medical schools began creating programs in medical humanities and “narrative medicine,” and a new belief took hold: that an ill person has lost narrative coherence, and that doctors, if they attend to their patients’ private struggles, could help them reconstruct a new story of their lives. At Harvard Medical School, for a time, students were assigned to write a “book” about a patient. Stories of illness written by physicians (and by patients) began proliferating, to the point that the medical sociologist Arthur Frank noted, “ ‘Oliver Sacks’ now designates not only a specific physician author but also a . . . genre—a distinctively recognizable form of storytelling.”
But, in his journal, Sacks wrote that “a sense of hideous criminality remains (psychologically) attached” to his work: he had given his patients “powers (starting with powers of speech) which they do not have.” Some details, he recognized, were “pure fabrications.” He tried to reassure himself that the exaggerations did not come from a shallow place, such as a desire for fame or attention. “The impulse is both ‘purer’—and deeper,” he wrote. “It is not merely or wholly a
projection
—nor (as I have sometimes, ingeniously-disingenuously, maintained) a mere ‘sensitization’ of what I know so well in myself. But (if you will) a
sort of autobiography
.” He called it “
symbolic
‘exo-graphy.’ ”
Sacks had “misstepped in this regard, many many times, in ‘Awakenings,’ ” he wrote in another journal entry, describing it as a “source of severe, long-lasting, self-recrimination.” In the
book
, published in 1973, he startled readers with the depth of his compassion for some eighty patients at Beth Abraham Hospital, in the Bronx, who had survived an epidemic of encephalitis lethargica, a mysterious, often fatal virus that appeared around the time of the First World War. The patients had been institutionalized for decades, in nearly catatonic states. At the time, the book was met with silence or skepticism by other neurologists—Sacks had presented his findings in a form that could not be readily replicated, or extrapolated from—but, to nonspecialists, it was a masterpiece of medical witnessing. The
Guardian
would
name
it the twelfth-best nonfiction book of all time.
“My handwriting is better than your finger-writing.”
Cartoon by William Haefeli
Sacks spent up to fifteen hours a day with his patients, one of the largest groups of post-encephalitic survivors in the world. They were “mummified,” like “living statues,” he observed. A medicine called L-dopa, which elevates the brain’s dopamine levels, was just starting to be used for Parkinson’s disease, on an experimental basis, and Sacks reasoned that his patients, whose symptoms resembled those of Parkinson’s, could benefit from the drug. In 1969, within days of giving his patients the medication, they suddenly “woke up,” their old personalities intact. Other doctors had dismissed these patients as hopeless, but Sacks had sensed that they still had life in them—a recognition that he understood was possible because he, too, felt as if he were “buried alive.”
In “
Awakenings
,” Sacks writes about his encounters with a man he calls Leonard L. “What’s it like being the way you are?” Sacks asks him the first time they meet. “Caged,” Leonard replies, by pointing to letters of the alphabet on a board. “Deprived. Like Rilke’s ‘Panther’ ”—a reference to a poem by Rainer Maria Rilke about a panther pacing repetitively in cramped circles “around a center / in which a mighty will stands paralyzed.”
When Sacks was struggling to write his first book, “
Migraine
,” he told a friend that he felt like “Rilke’s image of the caged panther, stupefied, dying, behind bars.” In a letter to Shengold, he repeated this image. When Sacks met Leonard, he jotted down elegant observations in his chart (“Quick and darting eye movements are at odds with his general petrified immobility”), but there is no mention of Leonard invoking the Rilke poem.
In the preface to “
Awakenings
,” Sacks acknowledges that he changed circumstantial details to protect his patients’ privacy but preserved “what is important and essential—the real and full presence of the patients themselves.” Sacks characterizes Leonard as a solitary figure even before his illness: he was “continually buried in books, and had few or no friends, and indulged in none of the sexual, social, or other activities common to boys of his age.” But, in an autobiography that Leonard wrote after taking L-dopa, he never mentions reading or writing or being alone in those years. In fact, he notes that he spent all his time with his two best friends—“We were inseparable,” he writes. He also recalls raping several people. “We placed our cousin over a chair, pulled down her pants and inserted our penises into the crack,” he writes on the third page, in the tone of an aging man reminiscing on better days. By page 10, he is describing how, when he babysat two girls, he made one of them strip and then “leaped on her. I tossed her on her belly and pulled out my penis and placed it between her buttocks and started to screw her.”
Leonard Shengold, Sacks’s psychoanalyst.
Photograph courtesy Nina Shengold
In “
Awakenings
,” Sacks has cleansed his patient’s history of sexuality. He depicts him as a man of “most unusual intelligence, cultivation, and sophistication”—the “ ‘ideal’ patient.” L-dopa may have made Leonard remember his childhood in a heightened sexual register—his niece and nephew, who visited him at the hospital until his death, in 1981, told me that the drug had made him very sexual. But they said that he had been a normal child and adolescent, not a recluse who renounced human entanglement for a life of the mind.
Sacks finished writing “Awakenings” rapidly in the weeks after burying his mother, who’d died suddenly, at the age of seventy-seven. He felt “a great open torrent—and
release
,” he wrote in his journal. “It seems to be surely significant that ‘Awakenings’ finally came forth from me like a cry after the death of my own mother.” He referred to the writing of the book as his “Great Awakening,” the moment he “came out.” He doesn’t mention another event of significance: his patients had awakened during the summer of the Stonewall riots, the beginning of the gay-rights movement.
Shengold once told Sacks that he had “never met anyone less affected by gay liberation.” (Shengold supported his own son when he came out as gay, in the eighties.) Sacks agreed with the characterization. “I remain resolutely locked in my cell despite the dancing at the prison gates,” he said, in 1984.
In “Awakenings,” his patients are at first overjoyed by their freedom; then their new vitality becomes unbearable. As they continue taking L-dopa, many of them are consumed by insatiable desires. “L-DOPA is wanton, egotistical power,” Leonard says in the book. He injures his penis twice and tries to suffocate himself with a pillow. Another patient is so aroused and euphoric that she tells Sacks, “My blood is champagne”—the phrase Sacks used to describe himself when he was in love with Vincze. Sacks begins tapering his patients’ L-dopa, and taking some of them off of it completely. The book becomes a kind of drama about dosage: an examination of how much aliveness is tolerable, and at what cost. Some side effects of L-dopa, like involuntary movements and overactivity, have been well documented, but it’s hard not to wonder if “Awakenings” exaggerates the psychological fallout—Leonard becomes so unmanageable that the hospital moves him into a “punishment cell”—as if Sacks is reassuring himself that free rein of the libido cannot be sustained without grim consequence.
After “Awakenings,” Sacks intended his next book to be about his work with young people in a psychiatric ward at Bronx State Hospital who had been institutionalized since they were children. The environment reminded Sacks of a boarding school where he had been sent, between the ages of six and nine, during the Second World War. He was one of four hundred thousand children evacuated from London without their parents, and he felt abandoned. He was beaten by the headmaster and bullied by the other boys. The ward at Bronx State “exerted a sort of spell on me,” Sacks wrote in his journal, in 1974. “I lost my footing of proper sympathy and got sucked, so to speak, into an improper ‘perilous condition’ of identification to the patients.”
Shengold wrote several papers and books about a concept he called “soul murder”—a category of childhood trauma that induces “a hypnotic living-deadness, a state of existing ‘as if’ one were there.” Sacks planned to turn his work at Bronx State into a book about “ ‘SOUL MURDER’ and ‘SOUL SURVIVAL,’ ” he wrote. He was especially invested in two young men on the ward whom he thought he was curing. “The miracle-of-recovery started to occur in and through their relation to me (our relation and feelings
to each other
, of course),” he wrote in his journal. “We had to meet in a passionate subjectivity, a sort of collaboration or communication which transcended the Socratic relation of teacher-and-pupil.”
In a spontaneous creative burst lasting three weeks, Sacks wrote twenty-four essays about his work at Bronx State which he believed had the “beauty, the intensity, of Revelation . . . as if I was coming to know, once again, what I knew as a child, that sense of Dearness and Trust I had lost for so long.” But in the ward he sensed a “dreadful silent tension.” His colleagues didn’t understand the attention he was lavishing on his patients—he got a piano and a Ping-Pong table for them and took one patient to the botanical garden. Their suspicion, he wrote in his journal, “centred on the unbearability of my uncategorizability.” As a middle-aged man living alone—he had a huge beard and dressed eccentrically, sometimes wearing a black leather shirt—Sacks was particularly vulnerable to baseless innuendo. In April, 1974, he was fired. There had been rumors that he was molesting some of the boys.
That night, Sacks tore up his essays and then burned them. “Spite! Hate! Hateful spite!” he wrote in his journal shortly after. “And now I am empty—empty handed, empty hearted, desolate.”
The series of events was so distressing that even writing about it in his journal made Sacks feel that he was about to die. He knew that he should shrug off the false accusations as “vile idle gossip thrown by tiddlers and piddlers,” he wrote. But he couldn’t, because of “the
parental
accusation which I have borne—a Kafka-esque cross, guilt without crime, since my earliest days.”
The historian of medicine Henri Ellenberger observed that psychiatry owes its development to two intertwined dynamics: the neuroses of its founders—in trying to master their own conflicts, they came to new insights and forms of therapy—and the prolonged, ambiguous relationships they had with their patients. The case studies of these relationships, Ellenberger wrote, tended to have a distinct arc: psychiatrists had to unravel their patients’ “pathogenic secret,” a hidden source of hopelessness, in order to heal them.
Sacks’s early case studies also tended to revolve around secrets, but wonderful ones. Through his care, his patients realized that they had hidden gifts—for music, painting, writing—that could restore to them a sense of wholeness. The critic Anatole Broyard,
recounting
his cancer treatment in the
Times Magazine
in 1990, wrote that he longed for a charismatic, passionate physician, skilled in “empathetic witnessing.” In short, he wrote, a doctor who “would resemble Oliver Sacks.” He added, “He would see the genius of my illness.”
It speaks to the power of the fantasy of the magical healer that readers and publishers accepted Sacks’s stories as literal truth. In a letter to one of his three brothers, Marcus, Sacks enclosed a copy of “
The Man Who Mistook His Wife for a Hat
,” which was published in 1985, calling it a book of “fairy tales.” He explained that “these odd Narratives—half-report, half-imagined, half-science, half-fable, but with a fidelity of their own—are what
I
do, basically, to keep MY demons of boredom and loneliness and despair away.” He added that Marcus would likely call them “confabulations”—a phenomenon Sacks explores in a chapter about a patient who could retain memories for only a few seconds and must “
make
meaning, in a desperate way, continually inventing, throwing bridges of meaning over abysses,” but the “bridges, the patches, for all their brilliance . . . cannot do service for reality.”
Sacks was startled by the success of the book, which he had dedicated to Shengold, “my own mentor and physician.” It became an international best-seller, routinely assigned in medical schools. Sacks wrote in his journal,
Guilt has been
much
greater since ‘Hat’ because of (among other things)
My lies,
falsification
He pondered the phrase “art is the lie that tells the truth,” often attributed to Picasso, but he seemed unconvinced. “I think I have to thrash this out with Shengold—it is killing me, soul-killing me,” he wrote. “My ‘cast of characters’ (for this is what they become) take on an almost
Dickensian
quality.”
Sacks once told a reporter that he hoped to be remembered as someone who “bore witness”—a term often used within medicine to describe the act of accompanying patients in their most vulnerable moments, rather than turning away. To bear witness is to recognize and respond to suffering that would otherwise go unseen. But perhaps bearing witness is incompatible with writing a story about it. In his journal, after a session with a patient with Tourette’s syndrome, Sacks describes the miracle of being “enabled to ‘feel’—that is, to imagine, with all the powers of my head and heart—how it felt to be another human being.” Empathy tends to be held up as a moral end point, as if it exists as its own little island of good work. And yet it is part of a longer transaction, and it is, fundamentally, a projection. A writer who imagines what it’s like to exist as another person must then translate that into his own idiom—a process that Sacks makes particularly literal.
“I’ll tell you what you are saying,” Sacks told a woman with an I.Q. of around 60 whose grandmother had just died. “You want to go down below and join your dead grandparents down in the Kingdom of Death.” In the conversation, which Sacks recorded, the patient becomes more expressive under the rare glow of her doctor’s sustained attention, and it’s clear that she is fond of him. But he is so excited about her words (“One feels that she is voicing universal symbols,” he says in a recording, “symbols which are infinite in meaning”) that he usurps her experience.
“I know, in a way, you don’t feel like living,” Sacks tells her, in another recorded session. “Part of one feels dead inside, I know, I know that. . . . One feels that one wants to die, one wants to end it, and what’s the use of going on?”
“I don’t mean it in that way,” she responds.
“I know, but you do, partly,” Sacks tells her. “I know you have been lonely all your life.”
Cartoon by Michael Maslin
The woman’s story is told, with details altered, in a chapter in “Hat” titled “Rebecca.” In the essay, Rebecca is transformed by grief for her grandmother. She reminds Sacks of Chekhov’s Nina, in “The Seagull,” who longs to be an actress. Though Nina’s life is painful and disappointing, at the end of the play her suffering gives her depth and strength. Rebecca, too, ends the story in full flower. “Rather suddenly, after her grandmother’s death,” Sacks writes, she becomes decisive, joining a theatre group and appearing to him as “a complete person, poised, fluent,” a “natural poet.” The case study is presented as an ode to the power of understanding a patient’s life as a narrative, not as a collection of symptoms. But in the transcripts of their conversations—at least the ones saved from the year that followed, as well as Sacks’s journals from that period—Rebecca never joins a theatre group or emerges from her despair. She complains that it’s “better that I shouldn’t have been born,” that she is “useless,” “good for nothing,” and Sacks vehemently tries to convince her that she’s not. Instead of bearing witness to her reality, he reshapes it so that she, too, awakens.
Some of the most prominent nonfiction writers of Sacks’s era (
Joseph Mitchell
,
A. J. Liebling
,
Ryszard Kapuściński
) also took liberties with the truth, believing that they had a higher purpose: to illuminate the human condition. Sacks was writing in that spirit, too, but in a discipline that depends on reproducible findings. The “most flagrant example” of his distortions, Sacks wrote in his journal, was in one of the last chapters of “Hat,” titled “The Twins,” about twenty-six-year-old twins with autism who had been institutionalized since they were seven. They spend their days reciting numbers, which they “savored, shared” while “closeted in their numerical communion.” Sacks lingers near them, jotting down the numbers, and eventually realizes that they are all prime. As a child, Sacks used to spend hours alone, trying to come up with a formula for prime numbers, but, he wrote, “I never found any Law or Pattern for them—and this gave me an intense feeling of Terror, Pleasure, and—Mystery.” Delighted by the twins’ pastime, Sacks comes to the ward with a book of prime numbers which he’d loved as a child. After offering his own prime number, “they drew apart slightly, making room for me, a new number playmate, a third in their world.” Having apparently uncovered the impossible algorithm that Sacks had once wished for, the twins continue sharing primes until they’re exchanging ones with twenty digits. The scene reads like a kind of dream: he has discovered that human intimacy has a decipherable structure, and identified a hidden pattern that will allow him to finally join in.
Before Sacks met them, the twins had been extensively studied because of their capacity to determine the day of the week on which any date in the calendar fell. In the sixties, two papers in the
American Journal of Psychiatry
provided detailed accounts of the extent of their abilities. Neither paper mentioned a gift for prime numbers or math. When Sacks wrote Alexander Luria, a Russian neuropsychologist, about his work with the twins, in 1973, he also did not mention any special mathematical skills. In 2007, a psychologist with a background in learning theory published a short article in the
Journal of Autism and Developmental Disorders
, challenging Sacks’s assertion that these twins could spontaneously generate large prime numbers. Because this is not something that humans can reliably do, Sacks’s finding had been widely cited, and was theoretically “important for not only psychologists but also for all scientists and mathematicians,” the psychologist wrote. (The psychologist had contacted Sacks to ask for the title of his childhood book of prime numbers, because he couldn’t find a book of that description, but Sacks said that it had been lost.) Without pointing to new evidence, another scientist wrote in Sacks’s defense, describing his case study as “the most compelling account of savant numerosity skills” and arguing, “This is an example of science at the frontier, requiring daring to advance new interpretations of partial data.”
After the publication of “Hat,” when Sacks was fifty-two years old, he wrote his friend Robert Rodman, a psychoanalyst, that “Shengold suggested, with some hesitancy, some months ago, that I should consider going
deeper
with him.” He added, “He also observes that I don’t complain, say, of sexual deprivation—though this is absolute.” At first, Sacks was worried that Shengold was preparing to dismiss him from treatment: “I’ve done all I can for you—now manage on your own!” Then he felt hopeful that he didn’t need to assume that “boredom-depression-loneliness-cutoffness” would define the rest of his life. He was also moved that, after twenty years, Shengold still considered him “worth extra work.”
But Sacks was shaken by the idea that they’d only been skimming the surface. He looked back through his notebooks and noticed “a perceptible decline in concern and passion,” which he felt had also dulled the quality of his thought. “Is the superficiality of my work, then, due to superficiality of relationships—to running away from whatever has deeper feeling and meaning?” he asked Rodman. “Is this perhaps spoken of, in a camouflaged way, when I describe the ‘superficialization’ of various patients?” As an example, he referenced an essay in “Hat” about a woman with a cerebral tumor. She was intelligent and amusing but seemed not to care about anyone. “Was this the ‘cover’ of some unbearable emotion?” he writes in the essay.
Sacks felt that Shengold was the reason he was still alive, and that he should go further with him. “What have I to lose?” he asked Rodman. But, he wrote, “what one has to lose, of course, may be just that quasi-stable if fragile ‘functioning’ . . . so there is reason to hesitate.” Going deeper would also mean more fully submitting to someone else’s interpretation, experiencing what he asked of his own patients; Rodman proposed that Sacks was “afraid of the enclosure of analysis, of being reduced and fixed with a formulated phrase.”
Sacks and his partner, Bill Hayes.
Photograph courtesy Oliver Sacks Foundation
In the early eighties, Lawrence Weschler, then a writer for
The New Yorker
, began working on a biography of Sacks. Weschler came to feel that Sacks’s homosexuality was integral to his work, but Sacks didn’t want his sexuality mentioned at all, and eventually asked him to stop the project. “I have lived a life wrapped in concealment and wracked by inhibition, and I can’t see that changing now,” he told Weschler. In his journal, Sacks jotted down thoughts to share with Weschler on the subject: “My ‘sex life’ (or lack of it) is, in a sense
irrelevant
to the . . . sweep of my
mind
.” In another entry, he wrote that the Freudian term “sublimation” diminished the process he’d undergone. When he was still having sex, as a young man in California, he used to sheath his body in leather gear, so he was “totally encased, enclosed,” his real self sealed in a kind of “black box.” He wrote, “I have,
in a sense
, ‘outgrown’ these extraordinary, almost
convulsive
compulsions—but this detachment has been made possible by
incorporating
them into a vast and comprehending view of the world.” (Weschler became close friends with Sacks, and, after Sacks died, published a “biographical memoir” titled “And How Are
You
, Dr. Sacks?”)
It’s unclear whether Sacks did “go deeper” with Shengold. In the late eighties, Sacks wrote in his journal that he was “scared, horrified (but, in an awful way, accepting or complaisant) about my non-life.” He likened himself to a “pithed and gutted creature.” Rather than living, he was managing a kind of “homeostasis.”
In 1987, Sacks had an intense friendship with a psychiatrist named Jonathan Mueller, with whom he briefly fell in love. Mueller, who was married to a woman, told me that he did not realize Sacks had romantic feelings for him. Sacks eventually moved on. But he felt that the experience had altered him. “I can read ‘love stories’ with empathy and understanding—I can ‘
enter into them
’ in a way which was impossible before,” he wrote in his journal. He perceived, in a new light, what it meant for his patients in “Awakenings” to glimpse the possibility of “liberation”: like him, he wrote, they were seeking “not merely a cure but an indemnification for the loss of their lives.”
By the nineties, Sacks seemed to ask less of himself, emotionally, in relation to his patients. He had started working with Kate Edgar, who’d begun as his assistant but eventually edited his writing, organized his daily life, and became a close friend. (Shengold had encouraged Sacks to find someone to assist with his work. “The secretary is certainly an important ‘ego-auxiliary,’ ” he wrote him in a letter.) Edgar was wary about the way Sacks quoted his patients—they were suspiciously literary, she thought—and she checked to make sure he wasn’t getting carried away. She spent hours with some of his patients, and, she told me, “I never caught him in anything like that, which actually surprises me.”
Weschler told me that Sacks used to express anxiety about whether he’d distorted the truth. Weschler would assure him that good writing is not a strict account of reality; there has to be space for the writer’s imagination. He said he told Sacks, “Come on, you’re extravagantly romanticizing how bad you are—just as much as you were extravagantly romanticizing what the patient said. Your mother’s accusing voice has taken over.” Weschler had gone to Beth Abraham Hospital to meet some of the patients from “Awakenings” and had been shaken by their condition. “There’s a lot of people shitting in their pants, drooling—the sedimentation of thirty years living in a warehouse,” he said. “His genius was to see past that, to the dignity of the person. He would talk to them for an hour, and maybe their eyes would brighten only once—the rest of the time their eyes were cloudy—but he would glom onto that and keep talking.”
After “Hat,” Sacks’s relationship with his subjects became more mediated. Most of them were not his patients; many wrote to him after reading his work, recognizing themselves in his books. There was a different power dynamic, because these people already believed that they had stories to tell. Perhaps the guilt over liberties he had taken in “Hat” caused him to curb the impulse to exaggerate. His expressions of remorse over “making up, ‘enhancing,’ etc,” which had appeared in his journals throughout the seventies and eighties, stopped. In his case studies, he used fewer and shorter quotes. His patients were far more likely to say ordinary, banal things, and they rarely quoted literature. They still had secret gifts, but they weren’t redeemed by them; they were just trying to cope.
In “
An Anthropologist on Mars
,” from 1992, a book of
case studies
about people compensating for, and adapting to, neurological conditions, some of the richest passages are the ones in which Sacks allows his incomprehension to become part of the portrait. In a chapter called “Prodigies,” he wants badly to connect with a thirteen-year-old boy named Stephen, who is autistic and has an extraordinary ability to draw, but Stephen resists Sacks’s attempts at intimacy. He will not allow himself to be romanticized, a refusal that Sacks ultimately accepts: “Is Stephen, or his autism, changed by his art? Here, I think, the answer is no.” In this new mode, Sacks is less inclined to replace Stephen’s unknowable experience with his own fantasy of it. He is open about the discomfort, and even embarrassment, of his multiple failures to reach him: “I had hoped, perhaps sentimentally, for some depth of feeling from him; my heart had leapt at the first ‘Hullo, Oliver!’ but there had been no follow-up.”
Mort Doran, a surgeon with
Tourette’s
syndrome whom Sacks profiled in “Anthropologist,” told me that he was happy with the way Sacks had rendered his life. He said that only one detail was inaccurate—Sacks had written that the brick wall of Doran’s kitchen was marked from Doran hitting it during Tourette’s episodes. “I thought, Why would he embellish that? And then I thought, Maybe that’s just what writers do.” Doran never mentioned the error to Sacks. He was grateful that Sacks “had the gravitas to put it out there to the rest of the world and say, ‘These people aren’t all nuts or deluded. They’re real people.’ ”
The wife in the title story of “Hat” had privately disagreed with Sacks about the portrayal of her husband, but for the most part Sacks appeared to have had remarkable relationships with his patients, corresponding with them for years. A patient called Ray, the subject of a 1981 piece about Tourette’s syndrome, told me that Sacks came to his son’s wedding years after his formal treatment had ended. Recalling Sacks’s death, he found himself suddenly crying. “Part of me left,” he said. “Part of my self was gone.”
A year after “Awakenings” was published, Sacks broke his leg in Norway, and Leonard L. and his mother wrote him a get-well letter. Thirty-two patients added their names, their signatures wavering. “Everybody had been counting the days for your return, so you can imagine the turmoil when they heard the news,” Leonard’s mother wrote. She explained that “most of the patients are not doing so well without your help and interest.” She added that Leonard “isn’t doing too well either.” When Leonard learned that Sacks wouldn’t be back, she said, “he shed enough tears to fill a bucket.”
Sacks spoke of “animating” his patients, as if lending them some of his narrative energy. After living in the forgotten wards of hospitals, in a kind of narrative void, perhaps his patients felt that some inaccuracies were part of the exchange. Or maybe they thought, That’s just what writers do. Sacks established empathy as a quality every good doctor should possess, enshrining the ideal through his stories. But his case studies, and the genre they helped inspire, were never clear about what they exposed: the ease with which empathy can slide into something too creative, or invasive, or possessive. Therapists—and writers—inevitably see their subjects through the lens of their own lives, in ways that can be both generative and misleading.
In his journal, reflecting on his work with Tourette’s patients, Sacks described his desire to help their illness “reach fruition,” so that they would become floridly symptomatic. “With my help and almost my collusion, they can extract the maximum possible from their sickness—maximum of knowledge, insight, courage,” he wrote. “Thus I will FIRST help them to get ill, to
experience
their illness with maximum intensity; and then,
only then
, will I help them get well!” On the next line, he wrote, “IS THIS MONSTROUS?” The practice came from a sense of awe, not opportunism, but he recognized that it made him complicit, as if their illness had become a collaboration. “An impulse both neurotic and intellectual (artistic) makes me
get the most out of suffering
,” he wrote. His approach set the template for a branch of writing and thinking that made it seem as if the natural arc of illness involved insight and revelation, and even some poetry, too.
In his journals, Sacks repeatedly complained that his life story was over. He had the “feeling that I have stopped doing, that doing has stopped, that life itself has stopped, that it is petering out in a sort of twilight of half-being,” he wrote, in 1987. His journals convey a sense of tangible boredom. He transcribed long passages from philosophers and theologists (Simone Weil, Søren Kierkegaard, Gottfried Wilhelm Leibniz, Dietrich Bonhoeffer) and embarked on disquisitions on the best definition of reality, the “metabolism of grace,” the “deep mystery of incubation.” His thoughts cast outward in many directions—notes for a thousand lectures—then tunnelled inward to the point of non-meaning. “Where Life is Free, Immaterial, full of Art,” he wrote, “the laws of life, of Grace, are those of
Fitness
.”
Sacks proposed various theories for why he had undergone what he called “psychic death.” He wondered if he had become too popular, merely a fuzzy symbol of compassionate care. “Good old Sacks—the House Humanist,” he wrote, mocking himself. He also considered the idea that his four decades of analysis were to blame. Was it possible, he wrote, that a “vivisection of inner life, however conceived, however subtle and delicate, may in fact destroy the very thing it examines?” His treatment with Shengold seemed to align with a life of “homeostasis”—intimacy managed through more and more language, in a contained, sterile setting, on Monday and Wednesday mornings, from 6:00 to 6:45
A
.
M
. They still referred to each other as “Dr. Sacks” and “Dr. Shengold.” Once, they ran into each other at a chamber concert. They were a few rows apart, but they didn’t interact. Occasionally, Shengold told his children that he “heard from the couch” about a good movie or play, but he never shared what happened in his sessions. They inferred that Sacks was their father’s patient after reading the dedication to him in “Hat.”
As Sacks aged, he felt as if he were gazing at people from the outside. But he also noticed a new kind of affection for humans—“homo sap.” “They’re quite complex (little) creatures (I say to myself),” he wrote in his journal. “They suffer, authentically, a good deal. Gifted, too. Brave, resourceful, challenging.”
Perhaps because love no longer appeared to be a realistic risk—he had now entered a “geriatric situation”—Sacks could finally confess that he craved it. “I keep being
stabbed
by love,” he wrote in his journal. “A look. A glance. An expression. A posture.” He guessed that he had at least five, possibly ten, more years to live. “I want to, I want to ••• I dare not say. At least not in writing.”
In 2008, Sacks had lunch with Bill Hayes, a forty-seven-year-old writer from San Francisco who was visiting New York. Hayes had never considered Sacks’s sexuality, but, as soon as they began talking, he thought, “Oh, my God, he’s gay,” he told me. They lingered at the table for much of the afternoon, connecting over their insomnia, among other subjects. After the meal, Sacks wrote Hayes a letter (which he never sent) explaining that relationships had been “a ‘forbidden’ area for me—although I am entirely sympathetic to
(indeed wistful and perhaps envious about)
other people’s relationships.”
A year later, Hayes, whose partner of seventeen years had died of a heart attack, moved to New York. He and Sacks began spending time together. At Sacks’s recommendation, Hayes started keeping a journal, too. He often wrote down his exchanges with Sacks, some of which he later published in a memoir, “Insomniac City.”
“It’s really a question of mutuality, isn’t it?” Sacks asked him, two weeks after they had declared their feelings for each other.
“Love?” Hayes responded. “Are you talking about love?”
“Yes,” Sacks replied.
Sacks began taking Hayes to dinner parties, although he introduced him as “my friend Billy.” He did not allow physical affection in public. “Sometimes this issue of not being out became very difficult,” Hayes told me. “We’d have arguments, and I’d say things like ‘Do you and Shengold ever talk about why you can’t come out? Or is all you ever talk about your dreams?’ ” Sacks wrote down stray phrases from his dreams on a whiteboard in his kitchen so that he could report on them at his sessions, but he didn’t share what happened in therapy.
Kate Edgar, who worked for Sacks for three decades, had two brothers who were gay, and for years she had advocated for gay civil rights, organizing Pride marches for her son’s school. She intentionally found an office for Sacks in the West Village so that he would be surrounded by gay men living openly and could see how normal it had become. She tended to hire gay assistants for him, for the same reason. “So I was sort of plotting on that level for some years,” she told me.
In 2013, after being in a relationship with Hayes for four years—they lived in separate apartments in the same building—Sacks began writing a memoir, “On the Move,” in which he divulged his sexuality for the first time. He recounts his mother’s curses upon learning that he was gay, and his decades of celibacy—a fact he mentions casually, without explanation. Edgar wondered why, after so many years of analysis, coming out took him so long, but, she said, “Oliver did not regard his relationship with Shengold as a failure of therapy.” She said that she’d guessed Shengold had thought, “This is something Oliver has to do in his own way, on his own time.” Shengold’s daughter, Nina, said that, “for my dad to have a patient he loved and respected finally find comfort in identifying who he’d been all his life—that’s growth for both of them.”
A few weeks after finishing the manuscript, Sacks, who’d had melanoma of the eye in 2005, learned that the cancer had come back, spreading to his liver, and that he had only months to live. He had tended toward hypochondria all his life, and Edgar thought that the diagnosis might induce a state of chronic panic. Since he was a child, Sacks had had a horror of losing things, even irrelevant objects. He would be overcome by the “feeling that
there was a hole in the world
,” he wrote in his journal, and the fear that “I might somehow fall through that hole-in-the-world, and be absolutely, inconceivably lost.” Edgar had dealt for decades with his distress over lost objects, but she noticed that now, when he misplaced things, he didn’t get upset. He had an uncharacteristic ease of being.
In the summer of 2015, before Shengold went on his annual summer break, Sacks said to Edgar, “If I’m alive in September when Shengold returns, I’m not sure I need to go back to my sessions.” They had been seeing each other for forty-nine years. Sacks was eighty-two; Shengold was eighty-nine.
When Sacks was struggling with his third book, “
A Leg to Stand On
,” which was about breaking his leg and his frustration that his doctors wouldn’t listen to him, he wrote in his journal that Shengold had suggested (while apologizing for the corniness of the phrase) that the book should be “a message of love”—a form of protest against the indifference that so many patients find in their doctors. Shengold may have been giving Sacks permission to see their own relationship—the one place in which Sacks felt an enduring sense of recognition and care—as a hidden subject of the book. Extending Shengold’s idea, Sacks wrote, of his book, “The ‘moral’ center has to do with . . . the irreducible ultimate in doctor-patient relations.”
In August, two weeks before Sacks died, he and Shengold spoke on the phone. Shengold was with his family at a cottage in the Finger Lakes region of central New York, where he spent every summer. Nina told me, “We all gathered in the living room of that little cottage and put my father on speakerphone. Oliver Sacks was clearly on his deathbed—he was not able to articulate very well. Sometimes his diction was just gone. Dad kept shaking his head. He said, ‘I can’t understand you. I’m so sorry, I can’t understand you.’ ” At the end of the call, Shengold told Sacks, “It’s been the honor of my life to work with you,” and said, “Goodbye, Oliver.” Sacks responded, “Goodbye, Leonard.” It was the first time they had ever used each other’s first names. When they hung up, Shengold was crying.
After Sacks died, Shengold started closing down his practice. “It was the beginning of the end for him,” his son David told me. “He had lost most of his colleagues. He was really the last of his generation.” Nina said, “I do think part of why my father lived so long and was able to work so long was because of that relationship. That feeling of affection and kindred spirit was lifesaving.”
In “Awakenings,” when describing how Leonard L.—his “ ‘ideal’ patient”—initially responded to L-dopa, Sacks characterizes him as “a man released from entombment” whose “predominant feelings at this time were feelings of freedom, openness, and exchange with the world.” He quotes Leonard saying, “I have been hungry and yearning all my life . . . and now I am full.” He also says, “I feel saved. . . . I feel like a man in love. I have broken through the barriers which cut me off from love.’ ”
For years, Sacks had tested the possibility of awakenings in others, as if rehearsing, or outsourcing, the cure he had longed to achieve with Shengold. But at the end of his life, like an inside-out case study, he inhabited the story he’d imagined for his patients. “All of us entertain the idea of
another
sort of medicine . . . which will restore us to our lost health and wholeness,” he wrote, in “Awakenings.” “We spend our lives searching for what we have lost; and one day, perhaps, we will suddenly find it.” ♦
A Lisp Interpreter Implemented in Conway's Game of Life (2021)
Lisp in Life is a Lisp interpreter implemented in Conway’s Game of Life.
The entire pattern is viewable in the browser
here
.
To the best of my knowledge, this is the first time a high-level programming language was interpreted in Conway’s Game of Life.
Running Lisp on the Game of Life
Lisp is a language with a simple and elegant design, having an extensive ability to express sophisticated ideas as simple programs. Notably, the powerful feature of
macros
could be used to modify the language’s syntax to write programs in a highly flexible way. For example, macros can be used to introduce new programming paradigms to the language, as demonstrated in
object-oriented-like.lisp
(which can actually be evaluated by the interpreter, although complex programs take quite a long time to finish running), where a structure and syntax similar to classes in Object-Oriented Programming is constructed. Despite this expressiveness, Lisp is
the world’s second oldest high-level programming language
introduced in 1958, preceded only by Fortran.
Conway’s Game of Life is a cellular automaton proposed in 1970. Despite it having a very simple set of rules, it is known to be Turing Complete. Lisp in Life demonstrates this fact in a rather straightforward way.
How can simple systems allow human thought to be articulated and expanded? With the expressiveness of Lisp built on the foundation of Conway’s Game of Life, Lisp in Life offers one answer to this question.
Input and Output
The Lisp program is provided by editing certain cells within the pattern to represent the ASCII encoding of the program. The pattern reads this text directly and evaluates it. You can also load your own Lisp program into the pattern and run it.
The standard output is written at the bottom end of the RAM module, which can be easily located and directly examined in a Game of Life viewer.
The Lisp implementation supports lexical closures and macros, allowing you to write programs in a familiar Lisp style, as far as the memory limit allows.
The
Lisp interpreter
is written in C. Using the build system for this project, you can also compile your own C11-compatible C code and run it on Conway’s Game of Life.
Previous Work
As previously mentioned,
to the best of my knowledge, this is the first time a high-level programming language was interpreted in Conway’s Game of Life.
The entry featuring
Universal Computers
in LifeWiki has a list of computers created in the Game of Life.
Two important instances not mentioned in this entry are the
Quest For Tetris
(QFT) Project
created by the authors of the QFT project, and
APGsembly
created by Adam P. Goucher.
All of these works are designed to run an assembly language, not to interpret a high-level language per se.
An example of a compiled high-level language targeting the Game of Life is Cogol, by the QFT project.
Cogol is compiled to the assembly language QFTASM and then run on the QFT architecture; the Cogol source itself is never evaluated directly by the pattern.
In Lisp in Life, a modified version of the QFT architecture is first created to improve the pattern’s runtime.
Modifications include introducing a new cascaded storage architecture for the ROM, adding new opcodes, extending the ROM and RAM address space, etc.
The Lisp source code is then written into the computer’s RAM module as its raw binary ASCII format.
The Conway’s Game of Life pattern directly reads, parses, and evaluates this Lisp source code to produce its output.
Having a Conway’s Game of Life pattern directly evaluate a high-level programming language expressed as a string of text is, to the best of my knowledge, new to this project.
Video
Here is a YouTube video showing Lisp in Life in action:
Screenshots
An overview of the entire architecture.
An overview of the CPU and its surrounding modules. On the top are the ROM modules, with the lookup module on the right, and the value modules on the left. On the bottom left is the CPU. On the bottom right is the RAM module.
This pattern is the VarLife version of the architecture. VarLife is an 8-state cellular automaton defined in the
Quest For Tetris
(QFT) Project, which is used as an intermediate layer to create the final Conway’s Game of Life pattern. The colors of the cells indicate the 8 distinct states of the VarLife rule.
The architecture is based on
Tetris8.mc
in the
original QFT repository
. Various modifications were made to make the pattern compact, such as introducing a new lookup table architecture for the ROM, removing and adding new opcodes, expanding the ROM and RAM address space, etc.
The Conway’s Game of Life version of the architecture, converted from the VarLife pattern.
What appears to be a single cell in this image is actually an
OTCA metapixel
shown zoomed out by a factor of 2048.
A close-up view of a part of the ROM module in the Conway’s Game of Life version.
Each pixel in the previous image is actually this square-shaped structure shown in this image.
These structures are
OTCA metapixels
, which can be seen to be in the On and Off meta-states in this image.
The OTCA Metapixel is a special Conway’s Game of Life pattern that can emulate cellular automata with customized rules.
The original VarLife pattern is simulated this way so that it can run in Conway’s Game of Life.
A video of the RAM module in the VarLife rule in action.
The computer showing the results of the following Lisp program:
(define mult (lambda (m n) (* m n))) (print (mult 3 14))
The result is
42
, shown in binary ASCII format (
0b110100
,
0b110010
), read in bottom-to-top order.
As shown in this image, the standard output of the Lisp program gets written at the bottom end of the RAM module, and can be directly viewed in a Game of Life viewer.
This repository also contains scripts that run on Golly to decode and view the contents of the output as strings.
How is it Done?
The
Lisp interpreter
, written in C, is compiled to an assembly language for a CPU architecture implemented in the Game of Life, which is a modification of the computer used in the
Quest For Tetris
(QFT) project. The compilation is done using an extended version of
ELVM
(the Esoteric Language Virtual Machine). The Game of Life backend for ELVM was implemented by myself.
Generating a small enough pattern that runs in a reasonable amount of time required a lot of effort.
This required optimizations and improvements in every layer of the project; a brief summary would be:
The C Compiler layer - adding the
computed goto
feature to the C compiler, preserving variable symbols to be used after compilation, etc.
The C layer (the
Lisp interpreter
) - using a string hashtable and binary search for Lisp symbol lookup, minimization of stack region usage with union memory structures, careful memory region map design, etc.
The QFTASM layer - writing a
compiler optimizer
to optimize the length of the assembly code
The VarLife layer (the CPU architecture) - creating a lookup table architecture for faster ROM access, expanding the size and length of the RAM module, adding new opcodes, etc.
The Game of Life layer -
Hashlife
-specific optimization
A more detailed description of the optimizations done in this project is available in the
Implementation Details
section.
Conversion from VarLife to Conway’s Game of Life
VarLife is an 8-state cellular automaton defined in the
Quest For Tetris
(QFT) Project.
It is used as an intermediate layer to generate the final Conway’s Game of Life pattern; the computer is first created in VarLife, and then converted to a Game of Life pattern.
When converting VarLife to Conway’s Game of Life, each VarLife cell is mapped to an
OTCA Metapixel
(OTCAMP). The conversion from VarLife to the Game of Life is done in a way so that the behavior of the states of the VarLife pattern matches exactly with the meta-states of the OTCA Metapixels in the converted Game of Life pattern.
Therefore, it is enough to verify the behavior of the VarLife pattern to verify the behavior of the Game of Life pattern.
Due to the use of OTCA Metapixels, each VarLife cell becomes extended to a 2048x2048 Game of Life cell, and 1 VarLife generation requires 35328 Game of Life generations. Therefore, the VarLife patterns run significantly faster than the Game of Life (GoL) version.
Additional details on VarLife are available in the
Miscellaneous
section.
Pattern files preloaded with various Lisp programs are available here.
Detailed statistics such as the running time and the memory consumption are available in the
Running Times and Statistics
section.
The patterns can be simulated on the Game of Life simulator
Golly
.
object-oriented-like.lisp
:
This example creates a structure similar to classes in Object-Oriented Programming, using closures.
The class has methods and field variables, where each instance carries distinct and persistent memory locations of its own.
The example instantiates two counters and concurrently modifies the value held by each instance.
New syntaxes for instantiation and method access,
(new classname)
and
(. instance methodname)
, are introduced using macros and functions.
The Lisp interpreter’s variable scoping and macro features are powerful enough to support this kind of memory management,
and even to provide a new syntax for the target paradigm.
printquote.lisp
: A simple demonstration of macros.
factorial.lisp
: A simple demonstration of recursion with the factorial function.
backquote-splice.lisp
:
Implements the
backquote macro
used commonly in Lisp to construct macros.
It also supports the unquote and unquote-splice operations, each written as
~
and
~@
.
primes.lisp
: Prints a list of prime numbers up to 20. This example highlights the use of the
while
syntax.
The content of print.lisp is quite straightforward - it calculates and prints the result of
3 * 14
.
backquote.lisp and primes-print.lisp are similar to backquote-splice.lisp and primes.lisp, mainly included for performance comparisons.
backquote.lisp doesn’t implement the unquote-splice operation, and demonstrates some more examples.
primes-print.lisp reduces the number of list operations to save memory usage.
Details of the Lisp Interpreter
Special Forms and Builtin Functions
define
if
quote
car, cdr
cons
list
atom
print
progn
while
lambda, macro
eval
eq
+, -, *, /, mod, <, >
Lexical Closures
This Lisp interpreter supports lexical closures.
The implementation of lexical closures is powerful enough to write an object-oriented-like code as shown in
object-oriented-like.lisp
,
where classes are represented as lexical closures over the field variables and the class methods.
Macros
This Lisp interpreter supports macros. A Lisp macro can be thought of as a function that receives code and returns code.
Following this design, macros are treated exactly the same as lambdas, except that a macro takes its arguments as raw S-expressions,
and evaluates the result twice (the first time to build the expression, and the second time to actually evaluate the built expression).
The running times for each program are shown above. The
Hashlife
algorithm used for the simulation requires a lot of memory in exchange for its speedups.
The simulations were run on a 32GB-RAM computer, with Golly’s memory usage limit set to 28000 MB, and the default base step to 2 (configurable from the preferences).
The memory usage was measured by Ubuntu’s activity monitor. “(max.)” shows where the maximum permitted memory was used.
The number of CPU cycles and the QFT memory usage were obtained by running the QFTASM interpreter on the host PC.
The QFT memory usage shows the number of RAM addresses that were written at least once.
The memory usage is measured in words, which is 16 bits in this architecture.
All of the VarLife patterns can actually be run on a computer. The shortest running time is about 1 minute for
print.lisp
.
A sophisticated program such as
object-oriented-like.lisp
can even run in about 22 minutes.
On the other hand, the Game of Life patterns take significantly more time than the VarLife patterns, but short programs can still be run in a moderately reasonable amount of time.
For example,
print.lisp
finishes running in about 6 hours in the Game of Life pattern.
As mentioned in the “Conversion from VarLife to Conway’s Game of Life” section, since the Game of Life pattern emulates the behavior of the VarLife pattern using OTCA Metapixels,
the behavior of the Game of Life patterns can be verified by running the VarLife patterns.
Tests
There are tests to check the behavior of the Lisp interpreter.
There is a test for checking the QFTASM-compiled Lisp interpreter using the QFTASM interpreter, and a test for checking the GCC-compiled Lisp interpreter on the host PC.
To run these tests, use the following commands:
git submodule update --init --recursive   # Required for building the source
make test              # Run the tests for the QFTASM-compiled Lisp interpreter, using the QFTASM interpreter
make test_executable   # Run the tests for the executable compiled by GCC
Running
make test
requires
Hy
, a Clojure-like Lisp implemented in Python available via
pip install hy
.
Some of the tests compare the output results of Hy and the output of the QFTASM Lisp interpreter.
The tests were run on Ubuntu and Mac.
Building from Source
This section explains how to load the Lisp interpreter (written in C) into the Game of Life pattern, and also how to load a custom Lisp program into the pattern to run it on the Game of Life.
Implementation Details
This section describes the implementation details of the various optimizations applied to the QFT assembly and the resulting Game of Life pattern.
The C Compiler layer
Added the computed goto feature to ELVM
This was merged into the original ELVM project.
Modified the compiler to preserve and output memory address symbols and program address symbols, for their usage in the compiler optimization tool in the QFTASM layer
This makes it possible to use
memheader.eir
, so that symbols used in the C source can be referenced in the ELVM assembly layer using the same variable symbols.
The ELVM Assembly layer
Wrote the QFTASM backend for ELVM
This was merged into the original ELVM project.
Added further improvements to the QFTASM backend:
Let the ELVM assembly’s memory address space match QFT’s native memory address space
Originally, the ELVM assembly had to convert its memory addresses every time a memory access occurred.
Support new opcodes added in the improved QFT architecture
The C layer (the implementation of the Lisp interpreter)
Usage of binary search and hashtables for string representations and comparisons
Profiling the GCC-compiled version of the Lisp interpreter showed that the string table lookup process was a large performance bottleneck, which left a lot of room for optimization.
The optimized string lookup process is as follows.
First, when the Lisp parser accepts a symbol token, it creates a 4-bit hash of the string from a checksum of its ASCII representation. The hash indexes a hashtable that holds the root of a binary search tree used for string comparison. Each node in the tree holds the symbol token’s string and pointers to two nodes that come before and after it in alphabetical order. When a query symbol token arrives during the parsing phase, the node with the matching token is returned, or a new node for the token is added to the tree if it does not exist yet. This gives each distinct symbol in the S-expression a distinct memory address.
In the interpretation phase, since each distinct symbol has a distinct memory address and every string required by the Lisp program has already been parsed, strings can be compared by simply comparing the memory addresses of the tokens. Because the interpreter only ever needs string equality (not ordering), checking for integer equality suffices, which speeds up the interpretation phase. Since the hash key is 4 bits long, the symbols are spread across 16 trees, saving about 4 comparison steps per lookup compared to using a single binary tree.
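Below is a minimal, self-contained C sketch of this interning scheme. It is illustrative only: the names (`intern`, `Node`, `hash4`) and the exact layout are not taken from the interpreter's source, which works on QFT memory addresses rather than host pointers, but the structure (a 4-bit checksum hash selecting one of 16 binary search trees, with later comparisons reduced to pointer equality) is the same idea.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    char *sym;                  /* the symbol's string */
    struct Node *left, *right;  /* alphabetical neighbors */
} Node;

static Node *buckets[16];       /* one BST root per 4-bit hash key */

/* 4-bit hash: a checksum of the ASCII representation, folded to 4 bits */
static unsigned hash4(const char *s) {
    unsigned sum = 0;
    while (*s) sum += (unsigned char)*s++;
    return sum & 0xF;
}

/* Return the unique node for this symbol, creating it on first sight. */
static Node *intern(const char *s) {
    Node **slot = &buckets[hash4(s)];
    while (*slot) {
        int cmp = strcmp(s, (*slot)->sym);
        if (cmp == 0) return *slot;
        slot = (cmp < 0) ? &(*slot)->left : &(*slot)->right;
    }
    Node *n = malloc(sizeof *n);
    n->sym = malloc(strlen(s) + 1);
    strcpy(n->sym, s);
    n->left = n->right = NULL;
    return *slot = n;
}

int main(void) {
    /* After parsing, equal symbols share one node, so comparison in the
       evaluator is a single pointer (integer) equality test. */
    Node *a = intern("lambda"), *b = intern("lambda"), *c = intern("define");
    printf("%d %d\n", a == b, a == c);   /* prints: 1 0 */
    return 0;
}
```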
Usage of jump hash tables for the special form evaluation procedure searches
There are 17 distinct procedures for evaluating the special forms in the Lisp interpreter,
define
,
if
,
quote
,
car
,
cdr
,
cons
,
atom
,
print
,
progn
,
while
, {
lambda
,
macro
},
eval
,
eq
, {
+
,
-
,
*
,
/
,
mod
}, {
<
,
>
},
list
, and lambda/macro invocations (used when the token is not a special form). Using an
if
statement to find the corresponding procedure for a given token amounts to a linear search over token comparisons. To speed up this search, a hash table is created for jumping to the corresponding procedures. Since the memory addresses of the special forms can be determined before parsing the Lisp program, every special-form symbol has a fixed memory address. Therefore, the hash key can be created by subtracting an offset from the symbol’s memory address, pointing into a hashtable that is placed near the register locations. This hashtable is provided in
memheader.eir
. When the hash key falls outside the region of this hashtable, the symbol is not a special form, so the evaluation jumps to the lambda/macro invocation procedure.
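The dispatch can be sketched as follows in standalone C, using the computed-goto extension mentioned in the C compiler section. The symbol IDs, the offset, and the three-entry table are hypothetical: the real interpreter keys the table off fixed QFT memory addresses and covers all 17 procedures.

```c
#include <stdio.h>

/* Reserved special-form symbols receive fixed, consecutive IDs up front, so
 * (symbol id - OFFSET) can index a jump table directly. Anything outside the
 * table is treated as an ordinary lambda/macro invocation. */
enum { OFFSET = 100, SYM_DEFINE = 100, SYM_IF = 101, SYM_QUOTE = 102, N_FORMS = 3 };

static const char *eval_head(int sym)
{
    /* GCC/Clang computed goto: a table of label addresses. */
    void *jump[N_FORMS] = { &&do_define, &&do_if, &&do_quote };

    int key = sym - OFFSET;
    if (key < 0 || key >= N_FORMS) goto do_apply;   /* not a special form */
    goto *jump[key];

do_define: return "define";
do_if:     return "if";
do_quote:  return "quote";
do_apply:  return "lambda/macro invocation";
}

int main(void)
{
    printf("%s\n", eval_head(SYM_IF));   /* if */
    printf("%s\n", eval_head(999));      /* lambda/macro invocation */
    return 0;
}
```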
The Lisp implementation has 3 distinct value types,
ATOM
,
INT
, and
LAMBDA
. Each value only consumes one QFT byte of memory; the
ATOM
value holds the pointer to the symbol’s string hashtable, the
INT
value holds the signed integer value, and
LAMBDA
holds a pointer to the
Lambda
struct, as well as its subtype information, of either
LAMBDA
,
MACRO
,
TEMPLAMBDA
and
TEMPMACRO
. (The
TEMPLAMBDA
and
TEMPMACRO
subtypes are lambda and macro types that recycle their argument value memory space every time they are called, but they are unused in the final Lisp programs.) Since the RAM’s address space is only 10 bits, values that hold pointers have 6 free bits. Therefore, the value type and subtype information is held in these free bits. This makes the integer in the Lisp implementation a 14-bit signed integer, ranging from -8192 to 8191.
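One plausible packing that matches these numbers (a 2-bit primary tag in the top bits, 14-bit signed integers, 10-bit pointers) can be sketched in C as below; the interpreter's actual bit layout and subtype encoding may differ in detail.

```c
#include <stdint.h>
#include <stdio.h>

enum { TAG_ATOM = 0, TAG_INT = 1, TAG_LAMBDA = 2 };   /* 2-bit primary tag */

typedef uint16_t Value;   /* one 16-bit QFT word per value */

static Value make_int(int n)          { return (Value)(((unsigned)TAG_INT << 14) | ((unsigned)n & 0x3FFFu)); }
static Value make_ptr(int tag, int a) { return (Value)(((unsigned)tag << 14) | ((unsigned)a & 0x3FFu)); }

static int value_tag(Value v) { return v >> 14; }
static int value_ptr(Value v) { return v & 0x3FF; }          /* 10-bit RAM address */
static int value_int(Value v) {                              /* sign-extend 14 bits */
    int n = v & 0x3FFF;
    return (n & 0x2000) ? n - 0x4000 : n;
}

int main(void) {
    Value a = make_int(-8192), b = make_int(8191), c = make_ptr(TAG_LAMBDA, 42);
    printf("%d %d\n", value_int(a), value_int(b));   /* -8192 8191 */
    printf("%d %d\n", value_tag(c), value_ptr(c));   /* 2 42 */
    return 0;
}
```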
Minimization of Stack Region Usage
Since the C compiler used in this project does not have memory optimization features, this has to be done manually within the C source code. This is the main reason the interpreter’s source code looks somewhat obfuscated.
One of the largest bottlenecks for memory access was stack region usage. Every time a stack region memory access occurs, the assembly code performs memory address offset operations to access the stack region. This does not happen when accessing the heap memory, since there is only one heap region used in the entire program, so the pointers for global variables can be hard-coded by the assembler. Therefore, it is favorable optimization-wise to use the heap memory as much as possible.
One way to make use of this fact is to use as many global variables as possible. Since registers and common RAM memory share the same memory space, global variables can be accessed at a speed comparable to registers. (However, since the physical location of a RAM memory slot within the pattern affects the I/O signal arrival time, and the registers have the smallest RAM addresses, i.e. they are the closest to the CPU unit, the registers still have the fastest memory access time.)
Another method of saving memory was to use union memory structures to minimize stack region usage. In the C compiler used in this project, every time a new variable is introduced in a function, the function’s stack region usage (used per call) is increased to fit all of the variables. This happens even when two variables never appear at the same time. Therefore, using the fact that some variables never appear simultaneously, unions are used for every occurrence of such variables so that they can share a region within the stack space, minimizing the stack region usage. Since the stack region is only 233 hextets (1 byte in the QFT RAM is 16 bits) large, this made it possible to increase the number of nested function calls, especially the nested calls of
eval
which evaluates the S-expressions. Since the S-expressions have a list structure, and
eval
becomes nested when lambdas are called in the Lisp program, this optimization was significant for allowing more sophisticated Lisp programs to be run in the architecture.
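Reduced to a toy example, the idea looks like the sketch below (the real interpreter applies this to the locals of eval and other large functions, not to these made-up variables): with a compiler that reserves one stack slot per declared local and performs no liveness analysis, two locals that are never live at the same time can share a slot through a union.

```c
#include <stdio.h>

struct Pair { int car, cdr; };

static int eval_example(int use_first_branch)
{
    union {
        struct Pair parsed;   /* only used in the first branch */
        int         counter;  /* only used in the second branch */
    } u;                      /* one shared stack slot instead of two */

    if (use_first_branch) {
        u.parsed.car = 3;
        u.parsed.cdr = 14;
        return u.parsed.car * u.parsed.cdr;
    } else {
        u.counter = 0;
        while (u.counter < 42) u.counter++;
        return u.counter;
    }
}

int main(void)
{
    printf("%d %d\n", eval_example(1), eval_example(0));   /* 42 42 */
    return 0;
}
```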
The QFTASM layer
The QFT assembly generated by the C compiler has a lot of room for optimization. I therefore created a compiler optimization tool to reduce the QFTASM assembly size.
Constant folding
Immediate constant expressions such as
ADD 1 2 destination
are folded into a
MOV
operation.
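A sketch of such a pass is shown below; the instruction representation is hypothetical and far simpler than real QFTASM, but the transformation is the one described: an ADD of two immediates becomes a MOV of their sum.

```c
#include <stdio.h>

typedef enum { OP_MOV, OP_ADD } Op;
typedef struct { Op op; int a, b, dest; int a_imm, b_imm; } Ins;   /* *_imm: operand is an immediate */

static void fold_constants(Ins *prog, int n)
{
    for (int i = 0; i < n; i++)
        if (prog[i].op == OP_ADD && prog[i].a_imm && prog[i].b_imm) {
            prog[i].op = OP_MOV;            /* ADD 1 2 dest  ->  MOV 3 dest */
            prog[i].a  = prog[i].a + prog[i].b;
        }
}

int main(void)
{
    Ins prog[] = { { OP_ADD, 1, 2, 7, 1, 1 } };
    fold_constants(prog, 1);
    printf("op=%s value=%d dest=%d\n",
           prog[0].op == OP_MOV ? "MOV" : "ADD", prog[0].a, prog[0].dest);
    return 0;
}
```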
MOV
folding
The QFT assembly code can be split into subregions by jump operations, such that:
Each subregion doesn’t contain any jump operations
Each subregion ends with a jump operation
Every jump operation in the assembly is guaranteed to jump to the beginning of a subregion, and never to the middle of any subregion
The last guarantee, that jumps never target the middle of a subregion, is provided by the C compiler. The ELVM assembly’s program counter is designed so that it increases only when a jump instruction appears. This makes an ELVM program counter point to a sequence of multiple instructions instead of a single instruction. Since the ELVM assembly uses the ELVM program counter for its jump instructions, the jump instructions in the QFT assembly are guaranteed to never jump into the middle of any subregion, and always jump to the beginning of one.
In each subregion, a dependency graph for the memory addresses is created. If a memory address is written but later overwritten without being used anywhere in that subregion, the instruction that writes to that address is removed. Since jump operations are guaranteed never to jump into the middle of any subregion, the overwritten values can be safely removed without affecting the outcome of the program. The
MOV
folding optimization makes use of this fact to remove unnecessary instructions.
This folding process is also done with dereferences; if a dereferenced memory address is written, then overwritten without ever being used, and the dereference source is not overwritten at any point during this process, the instruction writing to the dereferenced memory address is removed.
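The dead-store part of this folding can be sketched as follows for plain MOVs inside a single jump-free subregion (the dereference handling and the real instruction encoding are omitted; this is not the optimizer's actual code):

```c
#include <stdio.h>

typedef struct { int src, dest; int live; } Mov;

/* Drop a write whose destination is overwritten before it is ever read. */
static void fold_movs(Mov *block, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++) {
            if (block[j].src == block[i].dest) break;    /* value is read: keep it */
            if (block[j].dest == block[i].dest) {        /* overwritten first: dead */
                block[i].live = 0;
                break;
            }
        }
}

int main(void)
{
    Mov block[] = { {1, 10, 1}, {2, 10, 1}, {10, 11, 1} };   /* first write to 10 is dead */
    fold_movs(block, 3);
    for (int i = 0; i < 3; i++)
        printf("MOV %d %d%s\n", block[i].src, block[i].dest,
               block[i].live ? "" : "   ; removed");
    return 0;
}
```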
Jump folding
If the destination of a conditional or fixed-destination jump instruction points to another jump instruction with a fixed destination, the jump destination is folded to the latter jump instruction’s destination.
A similar folding is done when a fixed jump instruction points to a conditional jump instruction, where the fixed jump instruction is replaced by the latter conditional jump instruction.
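For the unconditional case, the folding amounts to following chains of fixed jumps to their final destination, as in this small sketch (hypothetical encoding; the conditional-jump replacement described above is not shown):

```c
#include <stdio.h>

typedef struct { int is_jump, is_conditional, target; } Ins;

static void fold_jumps(Ins *prog, int n)
{
    for (int i = 0; i < n; i++) {
        if (!prog[i].is_jump) continue;
        int t = prog[i].target, hops = 0;
        /* follow chains of unconditional jumps (bounded to avoid cycles) */
        while (t >= 0 && t < n && prog[t].is_jump && !prog[t].is_conditional && hops++ < n)
            t = prog[t].target;
        prog[i].target = t;
    }
}

int main(void)
{
    /* instruction 0 jumps to 2, which jumps to 5  ==>  0 jumps straight to 5 */
    Ins prog[6] = { {1, 0, 2}, {0}, {1, 0, 5}, {0}, {0}, {0} };
    fold_jumps(prog, 6);
    printf("instruction 0 now jumps to %d\n", prog[0].target);   /* 5 */
    return 0;
}
```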
The Varlife layer (the computer architecture)
Created a lookup table structure for the ROM module
In
this image
of the CPU and its surrounding modules, the two modules on the top are the ROM modules. The original ROM module had one table, with the memory address as the key and the instruction as the value. I recreated the ROM module to add a lookup table layer, where each distinct instruction (not just the opcode, but the entire instruction including the values used within it) is assigned a distinct serial integer key. The ROM module on the right accepts a program counter address and returns the instruction key for that program counter. The module on the left accepts the instruction key and returns the actual bits of the instruction as the output. This allows dictionary compression to be performed on the ROM data, saving a lot of space. Since the instructions are 45 bits and the instruction keys are only 10 bits, the instruction key table is about 1/4 the size of the original ROM module. Although the ROM is 3223 words long for the entire Lisp interpreter, it contains only 616 distinct instructions, so the instruction value table only needs to be 616 ROM units high, reducing the overall size of the ROM modules.
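Conceptually, the fetch becomes two table lookups instead of one wide lookup, as in this sketch (sizes follow the numbers above; the table contents here are dummy values):

```c
#include <stdint.h>
#include <stdio.h>

#define ROM_SIZE   3223    /* program addresses                 */
#define N_DISTINCT  616    /* distinct 45-bit instruction words */

static uint16_t pc_to_key[ROM_SIZE];        /* lookup module: 10-bit keys     */
static uint64_t key_to_instr[N_DISTINCT];   /* value module: instruction bits */

static uint64_t fetch(int pc)
{
    return key_to_instr[pc_to_key[pc]];
}

int main(void)
{
    key_to_instr[5] = 0x123456789ABULL;     /* some 45-bit instruction */
    pc_to_key[100]  = 5;
    pc_to_key[200]  = 5;                    /* identical instructions share one key */
    printf("%llx %llx\n", (unsigned long long)fetch(100), (unsigned long long)fetch(200));
    return 0;
}
```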
The ROM module features another form of compression, where absence of cells are used to represent 0-valued bits within the instruction. Below is a close-up look of the ROM value module:
Notice that some cells on the left are absent, even though the table would be expected to be rectangular. This is because absent cells do not emit any signals, hence effectively emitting 0-valued bits as the output. To use this fact, all of the instructions are first ordered alphabetically at table creation time, so that instructions that start with zeroes become located higher in the table (further from the signal source). This allows a maximum number of cells to be replaced with absent units to represent 0-valued bits. In fact, the instruction for no-ops is represented as all zeroes, so all of its units in the value module are replaced by absent cells. The no-op instruction appears many times, typically immediately after a jump operation, because the QFT architecture has a branch delay when invoking a jump instruction, requiring a no-op instruction to compensate for the delay.
Added new optimized instructions to the ALU, and removed unused ones
I removed the
AND
,
OR
,
SL
(shift left),
SRL
(shift right logical), and the
SRA
(shift right arithmetical) opcodes, and added the
SRU
(shift right unit) and
SRE
(shift right eight) opcodes to the architecture. Since there already were opcodes for
XOR
(bitwise-xor) and
ANT
(bitwise-and-not),
AND
and
OR
, which were not used much in the interpreter, could be replaced by these opcodes. The bitshift operations had significantly larger patterns than the other opcodes, more than 10 times larger. These were reduced to fixed-amount shift operations, which could be implemented in the same size as the other opcodes. Since the shift-left opcode can be replaced by repeatedly adding a value to itself, effectively multiplying by powers of 2, that opcode was safely removed. The main reason the original bitshift units were large was that the shift amounts depended on values in the RAM; converting a binary value to a physical (in-pattern) shift amount required a large pattern. Shifting by a fixed amount, on the other hand, could be implemented by a significantly simpler pattern. The shift-right-eight instruction is mainly used for reading out the standard input, where the ASCII characters of the input string are packed two per 16-bit RAM memory address.
This resulted in a total of exactly 8 opcodes,
ANT
,
XOR
,
SRE
,
SRU
,
SUB
,
ADD
,
MLZ
, and
MNZ
. Since this fits in 3 bits, the opcode region of the instruction word was reduced by 1 bit. Since the RAM address space is 10 bits, the third value of an instruction is always the RAM write destination, and the first value can also be arranged to always be the RAM read source address, an additional 6*2=12 bits could be removed from the instruction length. Altogether, this reduced the ROM word size from 58 to 45 bits, cutting nearly 1/4 of the original instruction size.
Extended the ROM and RAM address spaces from 9 and 7 bits to 12 and 10 bits
The original QFT architecture had ROM and RAM address spaces of 9 and 7 bits. I extended the ROM and RAM address spaces to 12 and 10 bits, respectively. This was not as straightforward a task as it first seemed, since the signal arrival timings between the modules had to be carefully adjusted for the signals to line up correctly. This involved reverse-engineering and experimenting with undocumented VarLife pattern units used in the original QFT architecture. The same held when redesigning other parts of the architecture.
Reducing the Standard Input Size
Since each byte of the RAM module can be ordered arbitrarily in the CPU’s architecture, the RAM is arranged so that the standard output is written at the very bottom of the RAM module, and proceeds upwards. Therefore, the contents of the RAM can easily be observed in a Game of Life viewer by directly examining the bottom of the RAM module.
Since the RAM has 16 bits of memory per address, two ASCII-encoded characters fit into one address. Therefore, the standard input is read out two characters per address. For the standard output, one character is written per address for aesthetic reasons, so that the characters can be observed more easily when viewing the pattern in a Game of Life viewer. Also, for the standard output to proceed upwards within the RAM module pattern, the memory pointer for the standard output moves backwards in the memory space, while the pointer for the standard input moves forwards.
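The packing and unpacking can be sketched in C as below; the byte order within a word is chosen arbitrarily here, and in the actual architecture the second character is extracted with the SRE (shift right eight) opcode plus a mask.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const char *input = "(print 42)";
    uint16_t ram[16] = {0};

    /* pack: two ASCII characters per 16-bit RAM word */
    for (int i = 0; input[i]; i += 2)
        ram[i / 2] = (uint16_t)((unsigned char)input[i] |
                     (input[i + 1] ? (unsigned char)input[i + 1] << 8 : 0));

    /* unpack: low byte first, then a shift right by eight for the second char */
    for (int i = 0; ram[i]; i++) {
        putchar(ram[i] & 0xFF);
        if (ram[i] >> 8) putchar(ram[i] >> 8);
    }
    putchar('\n');   /* prints: (print 42) */
    return 0;
}
```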
The Game of Life layer
Optimizing the Game of Life layer mainly revolved around understanding the
Macrocell
format for representing and saving Game of Life patterns, and the
Hashlife
algorithm. The Macrocell format uses quadtrees and memoization to compress repeated patterns. Since the final Game of Life pattern is an array of OTCA metapixels, each 2048x2048 cells in size, and it even contains repeated patterns at the VarLife layer (meaning that there are repeated configurations of OTCA metapixels), this compression reduces the file size of the QFT pattern significantly. The example that best helped me understand the Macrocell format was one provided by Adam P. Goucher in
this thread
in Golly’s mailing list.
The Hashlife algorithm also uses quadtrees and memoization to speed up the Game of Life simulations. This algorithm makes use of the fact that the same pattern in a same time frame influences only a fixed extent of its surrounding regions, hence allowing for memoization.
As for optimization, I first noticed that the QFT pattern had a 1-pixel-high pattern concatenated to the entire pattern. The original QFT pattern in the original QFT repository was carefully designed so that it is composed of 8x8-sized pattern units. Therefore, most of the pattern can be represented by 8x8 tiles. However, since the 1-pixel-high pattern at the top creates an offset that shifts the pattern away from this 8x8 grid, the pattern has fewer repeated tiles when interpreted from the corner of its bounding box, causing the memoization to work inefficiently. I therefore tried adding a redundant cell (which does not interfere with the rest of the pattern) to realign the entire pattern to its 8x8 grid, which actually slightly reduced the resulting Macrocell file size compared to the original. Although I didn't compare the running times, since the Hashlife algorithm uses memoization over repeated patterns as well, I expect this optimization to at least slightly contribute to the performance of the simulation.
Another optimization was improving the metafier script used to convert VarLife patterns to Game of Life (
MetafierV3.py
). The original script used a square region that fits the entire pattern to create the quadtree representation. However, since the Lisp in Life VarLife pattern is 968 pixels wide but 42354 pixels high, it tried to allocate a 65536x65536-sized integer array, which was prohibitively large to run. I modified the script so that it uses a rectangular region, where absent regions of the quadtree are represented as absent cells. Although this is very straightforward with knowledge of the Macrocell format, it was difficult at first, before I became familiar with the algorithms surrounding the Game of Life.
Memory Region Map and the Phases of Operation
The memory region map is carefully designed to save space. This is best described with the operation phases of the interpreter.
Phase 0: Precalculations
Various precalculations are done after the interpreter starts running. The construction of the string interning hashtable for reserved atoms such as
define
,
quote
, etc. are done in this phase. For the GCC-compiled interpreter, some variables that are defined in the QFT memory header are defined in the C source.
Since the outcome of these precalculations is always the same for any incoming Lisp program, this phase is done on the host PC, and the results are saved as ramdump.csv at QFTASM compile time. The results are then pre-loaded into the RAM when the VarLife and Game of Life patterns are created. This saves some CPU cycles when running the interpreter.
As explained earlier, the QFT architecture holds register values in the RAM. There are 11 registers, which are placed in the addresses from 0 to 10.
The reserved values in the image include strings such as reserved atoms and the destinations of the jump hashtable used for evaluation. The rest of the region is used for storing global variables in the interpreter’s C source code.
Phase 1: Parsing
The Lisp program provided from the standard input is parsed into S-expressions, which is written into the heap region.
Notice that the string interning hashtables are created at the later end of the stack region. This is because these hashtables are only used during the parsing phase, and can be overwritten during the evaluation phase. For most Lisp programs, including the ones in this repository, the stack region does not grow far enough to overwrite these values. This makes it possible to have three growing memory regions during the parsing phase: the stack region used for nested S-expressions, the heap region which stores the parsed S-expressions, and the string interning hashtables, which grow as new strings are detected in the Lisp program. Newly detected strings such as variable names in the Lisp program are also written into the heap region.
The heap region is also designed so that it overwrites the standard input as the program is parsed. Since older parts of the program can be discarded once they are parsed, this naturally frees the standard input region, which saves a lot of space after parsing. The standard input also gets overwritten by the standard output if the output is long enough. However, due to this design, long programs may have trouble parsing, since the input may be overwritten too far ahead and get deleted before it is parsed. A workaround is to add indentation, which places the program further ahead in memory and prevents it from being overwritten by the growing heap region. For all of the programs included in this repository, this is not an issue and the programs are parsed successfully.
Phase 2: Evaluation
By this time, all of the contents of the stack region, as well as everything ahead of the head of the heap region, can be overwritten in later steps. Note that an issue similar to the one with the standard input happens with the standard output - when too many Lisp objects are created at runtime, the heap may overwrite the existing standard output, or may simply exceed the heap region and proceed into the stack region. Since the heap region is connected to the later end of the stack region, this may be safe if the standard output is carefully handled, but the interpreter will eventually start overwriting values in the stack region if the heap continues to grow.
Miscellaneous
How can a 2-state OTCA Metapixel emulate the behavior of an 8-state VarLife pattern?
This is one of the most interesting ideas in the original QFT project to make the QFT architecture possible. As explained in
the original QFT post
, the 8 states of VarLife are actually a mixture of 4 different birth/survival rules with binary states. This means that each VarLife cell can only transition between two fixed states, and the birth/survival rule for that cell does not change at any point in time. Moreover, the OTCA Metapixel is designed so that each metapixel can carry its own birth/survival rules. Therefore, each VarLife cell can be encoded into an OTCA Metapixel by specifying its birth/survival rule and its binary state. This means that the array of OTCA Metapixels in the metafied pattern is actually a mixture of metapixels with different birth/survival rules, arranged in a way that makes the computation possible.
Halting Time
After the program counter is set to 65535 and the program exits, no more ROM and RAM I/O signals appear anywhere in the module.
This makes the VarLife pattern completely stationary: every generation from then on is identical.
Defining this as the halting time for the calculation, the pattern for
print.lisp
halts at exactly 105,413,068 VarLife generations.
The halting time for the Game of Life patterns is defined similarly, in terms of the meta-states of the OTCA Metapixels.
Since OTCA Metapixels never become stationary, the Game of Life states do not become stationary after the halting time,
but the meta-states of the OTCA Metapixels will become stationary after the halting time.
For the VarLife pattern of
print.lisp
, by generation 105,387,540, the value 65535 gets written to the program counter. At generation 105,413,067, the last signal becomes just one step from disappearing, and at generation 105,413,068 and onwards, the pattern becomes completely stationary and every pattern becomes identical to each other.
In the Game of Life version, since the OTCA Metapixels continue running indefinitely, the pattern does not become completely stationary, but the meta-states of the OTCA Metapixels will become completely stationary, since the pattern is an emulation of the VarLife pattern.
Note that the halting times given for programs other than print.lisp are just sufficient numbers of generations, not exact values.
The required number of generations per CPU cycle depends on many factors, such as the ROM and RAM addresses and the types of opcodes, since the arrival times of the I/O signals depend on these factors as well. This makes the number of generations required to halt differ from program to program.
For example, print.lisp has a rate of 23822.16 generations per CPU cycle (GpC), but z-combinator.lisp has a rate of 28870.81 GpC, and primes-print.lisp has 31502.43 GpC. 23822.16 GpC is in fact insufficient for z-combinator.lisp to finish running, and 28870.81 is also insufficient for primes-print.lisp to finish running.
Miscellaneous Screenshots
The ALU unit in the CPU. From the left are the modules for the
ANT
,
XOR
,
SRE
,
SRU
,
SUB
,
ADD
,
MLZ
, and the
MNZ
opcodes.
The
SRE
and the
SRU
opcodes were newly added for this project.
Credits
The CPU architecture used in this project was originally created by
the members of the
Quest For Tetris
(QFT) project,
and was later optimized and modified by
Hikaru Ikuta
for the Lisp in Life project.
The VarLife cellular automaton rule was also defined by the members of the QFT project.
The metafier for converting VarLife patterns to Conway’s Game of Life patterns was written by the members of the QFT project,
and was later modified by Hikaru Ikuta to support the pattern size of the Lisp in Life architecture.
The assembly language for the QFT architecture, QFTASM, was also originally designed by the members of the QFT project,
and was later modified by Hikaru Ikuta for this project to achieve a feasible running time.
The Lisp interpreter was written by Hikaru Ikuta.
The compilation of the interpreter’s C source code to the ELVM assembly is done using an extended version of
8cc
written by Rui Ueyama from Google.
The compilation from the ELVM assembly to QFTASM is done by an extended version of
ELVM
(the Esoteric Language Virtual Machine),
a project by Shinichiro Hamaji from Preferred Networks, Inc.
The Game of Life backend for ELVM was written by Hikaru Ikuta, and was later further extended by Hikaru for the Lisp in Life project.
In late 2022, I had a conversation with a senior engineer on the coming problem
of “what to do when AI is writing most of the code”. His opinion, which I found
striking at the time, was that engineers would transition from writing mostly
“implementation” code, to mostly writing tests and specifications.
I remember thinking at the time that this was prescient. With three years of
hindsight, it seems like things are trending in a different direction. I
thought
that the reason that testing and specifications would be useful was
that AI agents would be struggling to “grok” coding for quite some time, and
that you’d need to have robust specifications such that they could stumble
toward correctness.
In reality, AI-written tests were one of the
first
tasks I felt comfortable
delegating. Unit tests are squarely in-distribution for what the models have
seen on all public open source code. There’s a lot of unit tests in open source
code, and they follow predictable patterns. I’d expect that the variance of
implementation code – and the requirement for out-of-distribution patterns –
is much higher than testing code. The result is that models are now quite good
at translating English descriptions into quite crisp test cases.
1
System Design
There exists a higher level problem of holistic system behavior verification,
though. Let’s take a quick diversion into systems design to see why.
System design happens on multiple scales. You want systems to be robust – both
in their runtime, and their ability to iteratively evolve. This nudges towards
decomposing systems into distinct components, each of which can be
internally
complicated but exposes a firm interface boundary that allows you to abstract
over this internal complexity.
If we design things well, we can swap out parts of our system without disrupting
other parts or harming the top-level description of what the system does. We can
also perform top-down changes iteratively – adding new components, and retiring
old ones, at each level of description of the system.
This all requires careful thinking of how to build these interfaces and
component boundaries in such a way that (1) there is a clean boundary between
components and (2) that stringing all the components together actually produces
the desired top-level behavior.
To do this effectively, we require maps of various levels of description of the
system’s
territory
. My
conjecture is that
code is not a good map for this territory
.
To be clear, I’ve found a lot of value in throwing out system diagram maps and
looking directly at the code territory when debugging issues. However,
code-level reasoning is often not the best level of abstraction to use for
reasoning about systems. This is for a similar reason that “modeling all the
individual molecules of a car” is not a great way to estimate that car’s braking
distance.
LLMs have increasingly longer context windows, so one could naively say “just
throw all the code in the context and have it work it out”. Perhaps. But this is
still just clearly not the most efficient way to reason about large-scale
systems.
Formal Verification
The promise of formal verification is that we can construct provably composable
maps which still match the ground-level territory. Formal verification of code
allows you to specify a system using mathematical proofs, and then exhaustively
prove
that a system is correct. As an analogy: unit tests are like running an
experiment. Each passing test is an assertion that, for the conditions checked,
the code is correct. There could still exist some
untested
input that would
demonstrate incorrect behavior. You only need one negative test to show the code
is incorrect, but only a provably exhaustive set of inputs would be sufficient
to show the code is fully correct. Writing a formal verification of a program is
more like writing a proof. Writing a self-consistent proof is sufficient to show
that the properties you’ve proven always hold.
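To make the analogy concrete, here is a toy comparison (written in Lean 4 purely as an illustration; the theorem name is made up and this is not tied to any of the tools discussed here):
-- a "unit test": checks one concrete input, like a single experiment
example : 2 + 3 = 3 + 2 := rfl
-- a proof: quantifies over every pair of natural numbers, so no input is left untested
theorem my_add_comm (a b : Nat) : a + b = b + a := Nat.add_comm a b
Real systems are, of course, vastly harder to specify and verify than toy arithmetic, which is exactly what the numbers below illustrate.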
For example, as of 2009, the formally verified seL4 microkernel consisted of
8,700 lines of C code, but proving it correct required 20 person-years and
200,000 lines of Isabelle code – or 23 lines of proof and half a person-day
for every single line of implementation. Moreover, there are maybe a few
hundred people in the world (wild guess) who know how to write such proofs,
since it requires a lot of arcane knowledge about the proof system.
…
If formal verification becomes vastly cheaper, then we can afford to verify
much more software. But on top of that, AI also creates a need to formally
verify more software: rather than having humans review AI-generated code, I’d
much rather have the AI prove to me that the code it has generated is correct.
If it can do that, I’ll take AI-generated code over handcrafted code (with all
its artisanal bugs) any day!
I’ve long been interested in formal verification tools like TLA+ and Rocq (née
Coq). I haven’t (yet) been able to justify to myself spending all that much time
on them. I think that’s changing: the cost of writing code is
coming down dramatically
.
The cost of
reviewing
and maintaining it is also coming down, but at a slower rate. I agree with
Kleppmann that we need systematic tooling for dealing with this mismatch.
Wishcasting
a future world, I
would be excited to see something like:
One starts with a high-level system specification, in English.
This specification is spun out into multiple TLA+ models at various levels
of component specificity.
These models would allow us to determine the components that are
load-bearing for system correctness.
The most critical set of load-bearing components are implemented with a
corresponding formal verification proof, in something like
Rocq
.
The rest of the system components are still audited by an LLM to ensure they
correctly match the behavior of their associated component in the TLA+ spec.
My biggest concern related to formal verification is captured by the following two
excerpts, first from Kleppmann, and then from Hillel Wayne, a notable proponent
of TLA+:
There are maybe a few hundred people in the world (wild guess) who know how to
write such proofs, since it requires a lot of arcane knowledge about the proof
system. –
Martin Kleppmann
TLA+ is one of the more popular formal specification languages and you can
probably fit every TLA+ expert in the world in a large schoolbus. –
Hillel Wayne
For formal verification to be useful in practice, at least some of the arcane
knowledge of its internals will need to be broadly disseminated. Reviewing an
AI-generated formal spec of a problem won’t be useful if you don’t have enough
knowledge of the proof system to poke holes in what the AI came up with.
I’d argue that undergraduate Computer Science programs should allocate some of
their curriculum to formal verification. After all, students should have more
time on their hands as they delegate implementation of their homework to AI
agents.
The Paris Climate Treaty Changed the World. Here’s How
Portside
portside.org
2025-12-13 03:00:21
Today marks the 10th anniversary of the Paris climate treaty, one of the landmark days in climate-action history. Attending the conference as a journalist, I watched and listened and wondered whether 194 countries could ever agree on anything at all, and the night before they did, people who I thought were more sophisticated than me assured me they couldn’t. Then they did. There are a lot of ways to tell the story of what it means and where we are now, but any version of it needs respect for the complexities, because there are a lot of latitudes between the poles of total victory and total defeat.
I had been dreading the treaty anniversary as an occasion to note that we have not done nearly enough, but in July I thought we might be able to celebrate it. Because, on 23 July, the international court of justice handed down an epochal ruling that gives that treaty enforceable consequences it never had before. It declares that all nations have a legal obligation to act in response to the climate crisis, and, as Greenpeace International
put
it, “obligates states to regulate businesses on the harm caused by their emissions regardless of where the harm takes place. Significantly, the court found that the right to a clean, healthy and sustainable environment is fundamental for all other human rights, and that intergenerational equity should guide the interpretation of all climate obligations.” The Paris treaty was
cited
repeatedly as groundwork for this decision.
Ralph Regenvanu, Vanuatu’s special envoy for climate, said of the decision: “I choose my words carefully when I say that this may well be the most consequential case in the history of humanity.” Costa Rica’s Christiana Figueres, who presided over the negotiations that created that Paris climate treaty
declared
, with jubilation, on her podcast: “The reason why I am truly tearful is this is without a doubt, the most far-reaching, the most comprehensive and the most consequential legal opinion we’ve ever had.”
This case that ended in the world’s highest court began with 27 law students at the University of the South Pacific who, in 2019, asked themselves what they could do about climate – and it’s not hard to imagine a “what can we do, we’re only students” or “what can we do, we’re from tiny remote nations” stance. Instead, they set out to take a case all the way to the international court of justice in The Hague, unimpeded by the conventional wisdom that they were nobody from nowhere. They needed a law firm, and they chose
Blue Ocean Law
firm, sticking with the Pacific island nations, with indigenous leadership, with the impacted global south. And they needed a country to be plaintiff and the island nation of Vanuatu stepped up. The unanimous court decision in favor of the litigants matters most of all in how it is implemented, either through direct cases or through its impact on nations that take notice and reduce their climate devastation before they’re brought to court.
It’s not widely known that most countries and negotiators went into the conference expecting to set a “reasonable” two-degree threshold global temperature rise we should not cross. As my friend Renato Redentor Constantino, a climate organizer in the Philippines, wrote:“The powerful exerted tremendous effort to keep a tiny number, 1.5, out of United Nations documents. 1.5 degrees centigrade represents what science advises as the maximum allowable rise in average global temperature relative to preindustrial temperature levels. It was the representatives of the mostly global-south nations of the Climate Vulnerable Forum who fought to change the threshold from 2 degrees to 1.5.”
I remember them chanting “1.5 to stay alive”, because two degrees was a death sentence for too many places and people. The officially powerless swayed the officially powerful, and 1.5 degrees was written into the treaty and has become a familiar number in climate conversations ever since. Even though we’ve crashed into that 1.5 threshold, far better that it be set there than at 2 degrees, in which case we might well be complacent in the face of even more destructive temperature rise.
It takes far more than storytelling to get where we need to go, but how we tell the stories is crucial. I asked the climate policy expert Leah Stokes of UC Santa Barbara about the impact of Paris and she told me: “When small island nations pushed for 1.5 degrees as the target, they also requested the IPCC [intergovernmental panel on climate change] write a special report on what policy would be required to get there. That report came out in October 2018, and rocked around the world with headlines like ‘we have 12 years’. It changed the entire policy conversation to be focused on cutting pollution in half by 2030. Then, when it came time to design a climate package, Biden made it clear that his plan was to try to meet that target. You can draw a line between small islands’ fierce advocacy through to the passage of the largest climate law in American history.”
That’s how change often works, how an achievement ripples outward, how the indirect consequences matter as well as the direct ones. The Biden administration tried to meet the 1.5 degree target with the most ambitious US climate legislation ever, the Build Back Better Act that passed Congress, after much pressure and conflict, as the Inflation Reduction Act. Rumors of the Inflation Reduction Act’s death are exaggerated; some pieces of its funding and implementation are still in effect, and it prompted other nations to pursue more ambitious legislation. In the US, state and local climate efforts have not been stopped by the Trump administration. Globally, not nearly enough has been done to stop deforestation, slash fossil-fuel subsidies, and redesign how we live, move, and consume.
The renewables revolution is a bright spot. It’s often overlooked because it’s incremental, technical, economic, and dispersed, and even its major milestones don’t receive nearly the recognition they should. When the Paris treaty was signed, renewables were overall more expensive than fossil fuel, and were not widely implemented. But the drop in cost and spread of solar has outstripped virtually all predictions. The energy-policy group Ember
reports
: “Record solar power growth and stagnating fossil fuels in 2025 show how clean power has become the driving force in the power sector. Historically a growth segment, fossil power now appears to be entering a period of stagnation and managed decline.” The International Energy Agency
notes
another 2025 landmark: “The electricity sector is now the largest energy employer, surpassing fuel supply for the first time, as the age of electricity gathers pace.”
Anyone who in 2015 accurately prophesied what the energy landscape would look like in 2025 would have been thought to be ridiculous, delusional, or crazy (just like anyone who said in, say, 1995 that the UK would close its last coal-fired plant in 2024 would have been). 2025 is the year that renewables
outstripped
coal as an energy source. Ancillary developments like battery storage technology and design improvements and innovations have led to widespread renewables adoption from Denmark (which gets only 10% of its electricity from fossil fuels) to Texas to Pakistan (where small-scale solar panels from China have led something of an energy revolution). Solar power is now
so cheap and abundant
in Australia that electricity is going to be free for three hours in the middle of the day.
Problems that the enemies of climate action liked to cite, such as the intermittency of sun and wind, have been addressed with battery storage. California now often produces more than 100% of its electricity needs through renewables, led by solar, in the daytime. The excess goes into the batteries so that the state literally runs on sunshine at night. California uses 44% less natural gas to produce electricity than it did two years ago. China is reducing its emissions because it’s speedily transitioning to renewables; earlier this fall, in the United Nations, for the first time it made an actual commitment to reduction targets; and for the last eighteen months its CO2 emissions have been flat or falling.
Is this good enough? Far from it, but we are, as they say, “bending the curve”: before Paris the world was headed for 4 degrees of warming; it’s now headed for 2.5 degrees, which should only be acceptable as a sign that we have bent it and must bend more and faster. In the best-case scenario, the world’s leaders and powers would have taken the early warnings about climate change seriously and we’d be on the far side of a global energy transition, redesign of how we live, and protection of oceans, rainforests, and other crucial climate ecosystems. But thanks to valiant efforts by the climate movement and individual leaders and nations, we’re not in the worst-case scenario either. Landmarks like the Paris treaty and the Vanuatu victory matter, as do the energy milestones, and there’s plenty left to fight for. For decades and maybe centuries it has been too late to save everything, but it will never be too late to save anything.
Rebecca Solnit is a Guardian US columnist. She is the author of Orwell’s Roses and co-editor with Thelma Young Lutunatabua of the climate anthology Not Too Late: Changing the Climate Story from Despair to Possibility.
1300 Still Images from the Animated Films of Hayao Miyazaki's Studio Ghibli
No major American city has ever built a universal child care system. That means that
nearly three-quarters
of American parents who are looking for a way to take care of their children are struggling to find it. At the same time, costs have exploded: Day care now runs
more than twice
what it did just before the pandemic.
Most politicians don’t even try to enact universal systems — the cost and complexity are daunting, and child care has long been seen as a private family problem, not a public responsibility. But
Zohran Mamdani ran on
such a plan — and New Yorkers made him their next mayor.
Many parts of Mr. Mamdani’s agenda have been dismissed as unrealistic, and his child care program often tops that list. He has promised free care for every child from 6 weeks to 5 years old and pledged to offer child care workers wages
“at parity,” in the campaign’s words, with public school teachers
. Critics say it will cost too much and prove impossible to build at scale. A poll from this fall captured the skepticism:
71 percent of likely New York City voters
supported his pitch for universal child care, but only about 50 percent of those surveyed thought he could actually deliver it.
Having reported on child care policy around the country over the past 10 years, I think many people are looking at Mr. Mamdani’s plan all wrong. It will not be easy to implement, but if he learns from the mistakes that have derailed past efforts, he could pull off something remarkable. He has the opportunity to change the lives of hundreds of thousands of New Yorkers with young children, many of whom pay over
$20,000
a year to send them to day care and preschool. More than that, he could offer a powerful example to leaders all over the country.
Universal child care need not be a pipe dream in America — something we envy the Danes and the Swedes for but never imagine having for ourselves. It can and should be as fundamental to a city’s infrastructure as transit or housing, as essential for
attracting workers and residents
as any investment a mayor can make. After all, for most parents of young children, child care isn’t optional — it’s what makes holding down a job possible.
Mr. Mamdani’s child care bid comes at a moment of unusual political openness to the idea. When the pandemic shut down child care options for millions of Americans, it stranded parents and employers alike. In the years since, a growing coalition of
economists
and
business leaders
has come to see child care as an integral part of economic growth — not a handout, but a way to
keep workers in the labor force
and
families in cities
.
That openness crosses party lines.
Polling this year
found that a majority of Republicans now say the federal government spends too little on programs that benefit children — a notable change for a party long skeptical of new social spending. Candidates for governor in
Georgia
and
Wisconsin
are running on universal child care plans, and affording child care is now a question that presidential candidates
from both parties
get asked about
in debates.
The first thing Mr. Mamdani will need to get right: Any new system has to include the full range of child care options families rely on. That means not just day care centers and public school classrooms but also home-based businesses — small day care operations run out of private residences — and more informal arrangements with family and neighbors.
When people imagine universal child care, they often picture massive new facilities going up across a city. That’s not sufficient. “Child care infrastructure exists, and it exists in the neighborhoods that need it most,” said Jessica Sager,
the co-founder of All Our Kin
, a nonprofit that supports home-based child care providers.
Most New York City home-based providers don’t receive city subsidies or support, she said — and bringing them into the system could unlock thousands of slots for families who need them.
New York has learned the importance of such investments the hard way. When Mayor Bill de Blasio started rolling out universal pre-K for 4-year-olds in 2014, his administration funneled nearly all the new money to
child care centers
. The result was that home-based providers, who relied on revenue from preschool-age children to stay afloat,
lost a critical part of their
enrollment. Many child care businesses collapsed, and providers
quit the field
. The United Federation of Teachers chapter that represents home-based providers shrank from 28,000 providers in 2007 to just 12,000 today.
It’s a hopeful sign that Mr. Mamdani’s
pick for first deputy mayor
was the budget director under Mr. de Blasio, who helped secure funding for universal pre-K, and that one of his
transition co-chairs
played a critical role in expanding that program; they should know what worked and what didn’t. Mr. Mamdani seems keen on a mixed-delivery system, having said during his campaign that he envisions subsidizing “families who prefer to have a trusted neighbor or relative take care of their child.”
Getting the design of such a program right is only half the battle. To deliver on all this, Mr. Mamdani will have to move boldly — but not too
fast. In 1997, Quebec tried to implement universal child care in three years. The rush led the province to cut
corners on quality
,
and the fallout has given skeptics ammunition ever since. Vice President JD Vance
has cited Quebec’s rocky rollout as evidence
that universal child care isn’t worth pursuing. Just last month, The Economist cited Quebec in a
misleading
piece
on the “harm” of universal care. Early stumbles cast a long shadow.
There’s a danger in the other direction, too. In the few states that have made real investments in child care, leaders have been too quick to claim
“mission accomplished”
when plenty of families still don’t have good care options for their kids. In a country like the United States, where caregiving has long been devalued, no child care system can survive without sustained attention and investment, year after year.
You can see this problem play out in wages for child care workers. Child care is one of the country’s
lowest-paid jobs
, though Washington, D.C., has tried to change that locally. A few years ago, the city established what it called a
pay equity fund
to bring child care workers’ salaries closer to those of public school teachers, supplementing their wages via a new tax on the city’s highest earners.
By many measures, D.C.’s program has been a success: Child care workers saw
significant pay increases
,
funded by the city, that enabled day care centers to
hire more staff members and care for more children
.
But when budget pressures hit, the supposedly dedicated funding
became a political football
. The funding has not kept up with the program, which has created uncertainty about its future. For workers who had finally started to feel fairly compensated, the whiplash has been demoralizing and destabilizing.
New Mexico is perhaps the most instructive example of a premature victory lap. The state
has
earned
glowing
national praise
for its governor’s
commitment
to make all families eligible for state child care subsidies. But eligibility is not care. Even in 2023, before this latest expansion,
only a quarter of eligible
children under 6 were receiving aid — and while enrollment had surged among middle-income families, it had fallen among families below the poverty line. Moreover, this spring, legislators
quietly diverted some child care money
to a behavioral health program,
illustrating the competing budget pressures politicians face.
In New York, much
media
coverage
has focused
on how Mr. Mamdani will pay for his child care plan. And it’s an important question, since the plan will cost an estimated
$6 billion
or more, and
federal Medicaid cuts
are threatening to blow a hole in the state budget.
For decades, the United States has told families that child care is their burden to navigate. Mr. Mamdani made a bet that New Yorkers were ready for a different answer. If he can deliver, he’ll give the rest of the country a much-needed blueprint to follow, too.
Rachel Cohen Booth is a senior policy correspondent for Vox. She is working on a book, forthcoming from Harmony, about individual agency and social change.
Operation Condor: A Network of Transnational Repression 50 Years Later
Published
Washington, D.C., November 26, 2025
- On General Augusto Pinochet’s 60th birthday, November 25, 1975, four delegations of Southern Cone secret police chieftains gathered in Santiago, Chile, at the invitation of the Chilean intelligence service, DINA. Their meeting—held at the War College building on la Alameda, Santiago’s downtown thoroughfare—was called “to establish something similar to INTERPOL,” according to the confidential meeting agenda, “but dedicated to Subversion.” During the three-day meeting, the military officials from Argentina, Bolivia, Chile, Paraguay and Uruguay agreed to form “a system of collaboration” to identify, track, capture and eliminate leftist opponents of their regimes. As the conference concluded on November 28, a member of the Uruguayan delegation rose to toast the Chileans for convening the meeting and proposed naming the new organization after the host country’s national bird, the condor. According to secret minutes of the meeting, there was “unanimous approval.”
Chilean records refer to Condor as “Sistema Condor.” CIA intelligence reports called it Operation Condor. It was, as John Dinges writes in his comprehensive history,
The Condor Years
, an agency of “cross-border repression, teams went far beyond the frontiers of the member countries to launch assassination missions and other criminal operations in the United States, Mexico and Europe.” His investigation documented 654 victims of kidnapping, torture and disappearance during Condor’s active operational period in the Southern Cone between 1976 and 1980. A subdivision of Condor codenamed “Teseo”—for Theseus, the heroic warrior king of Greek mythology—established an international death squad unit based in Buenos Aires that launched 21 operations in Europe and elsewhere against opponents of the military regimes.
On the 50th anniversary of the secret inauguration of Operation Condor, the National Security Archive is posting a selection of documents that record the dark history of transnational repression under the Condor system. The selected records include:
The only known DINA document on the inaugural meeting—the “Closing Statement of the First Inter-American Meeting of National Intelligence”—which summarized the agreement between the original five Condor nations.
The first declassified CIA document to name “CONDOR” as a “cooperative arrangement” against subversion. The heavily censored CIA document, dated June 25, 1976, provides initial intelligence on the 2nd Condor meeting held from May 31 to June 2 in Santiago. It was the first in a flurry of CIA intelligence cables in the summer of 1976 on Condor’s evolution from an intelligence sharing collaboration to a transnational system of disappearance and assassination. “The subjects covered at the meeting,” this CIA report noted, “were more sweeping than just the exchange of information on terrorism and subversion.”
A CIA translation of the “Teseo” agreement—an extraordinary document that bureaucratically records the procedures, budgets, working hours, and operational rules for selecting, organizing and dispatching death squads to eliminate targeted enemies of the Southern Cone regimes. The “Teseo” operations base would be located “at Condor 1 (Argentina).” Each member country was expected to donate $10,000 to offset operational costs, and dues of $200 would be paid “prior to the 30th of each month” for maintenance expenses of the operations center. Expenses for agents on assassination missions abroad were estimated at $3,500 per person for ten days “with an additional $1000 first time out for clothing allowance.”
A CIA report on how the Teseo unit will select targets “to liquidate” in Europe and who will know about these missions. The source of the CIA intelligence suggests that “in Chile, for instance, Juan Manuel Contreras Sepulveda, chief of the Directorate of National Intelligence (DINA) the man who originated the entire Condor concept and has been the catalyst in bringing it into being, will coordinate details and target lists with Chilean President Augusto Pinochet Ugarte.”
The first briefing paper for Secretary of State Henry Kissinger alerting him to the existence of Operation Condor and the political ramifications for the United States. In a lengthy August 3, 1976, report from his deputy Harry Shlaudeman, Kissinger is informed that the security forces of the Southern Cone “have established
Operation Condor
to find and kill terrorists…in their own countries and in Europe. Brazil is cooperating short of murder operations."
CIA memoranda, written by the chief of the Western Hemisphere division, Ray Warren, sounding the alarm on Condor’s planned missions in Europe, and expressing concern that the CIA will be blamed for Condor’s assassinations abroad. One memo indicates that the CIA has taken steps to preempt the missions by alerting French counterparts that Condor operatives planned to murder specific individuals living in Paris.
The completely unredacted FBI “Chilbom” report, written by FBI attaché Robert Scherrer one week after the car bomb assassination of former Chilean Ambassador Orlando Letelier and Ronni Moffitt in downtown Washington, D.C. It was this FBI report that resulted in the revelation of the existence of the Condor system in 1979, when Scherrer testified at a trial of several Cuban exiles who assisted the Chilean secret police in assassinating Letelier and Moffitt.
The first Senate investigative report on Condor based on CIA documents and briefings written in early 1979 by Michael Glennon, a staff member of the Senate Foreign Relations Subcommittee on International Operations. The draft report was never officially published but was leaked to columnist Jack Anderson; a copy was eventually obtained by John Dinges and Saul Landau and used in their book,
Assassination on Embassy Row
. A declassified copy was released as part of the Obama-authorized Argentina Declassification Project in 2019.
“These documents record the dark history of multilateral repression and state-sponsored terrorism in the Southern Cone—a history that defined those violent regimes of the past,” notes Peter Kornbluh, author of
The Pinochet File: A Declassified Dossier on Atrocity and Accountability
. “Fifty years after Condor’s inauguration, these documents provide factual evidence of coordinated human rights atrocities that can never be denied, whitewashed or justified.”
After many years of investigations and resulting trials, it is now clear that Condor may have backfired on its perpetrators, according to John Dinges, whose updated and expanded edition of
The Condor Years
was published in Spanish in 2021 as
Los Años del Condor: Operaciones Internacionales de asesinato en el Cono Sur
. “It is a kind of historic irony,” Dinges notes, “that the international crimes of the dictatorships spawned investigations, including one resulting in Pinochet’s arrest in London, that would eventually bring hundreds of the military perpetrators to justice. Moreover, because Condor’s most notorious crime was in Washington, D.C., the United States government unleashed the FBI to prosecute DINA and the Chilean regime.”
Other documents on Condor discovered in the archives of member states such as Uruguay can be found on this special website—
https://plancondor.org/
—established to record the history of Condor’s human rights atrocities and hold those who committed them accountable for their crimes.
Special thanks to Carlos Osorio whose years of work documenting Operation Condor made this posting possible.
Founded in 1985 by journalists and scholars to check rising government secrecy, the
National Security Archive
combines a unique range of functions: investigative journalism center, research institute on international affairs, library and archive of declassified U.S. documents ("the world's largest nongovernmental collection" according to the Los Angeles Times), leading non-profit user of the U.S. Freedom of Information Act, public interest law firm defending and expanding public access to government information, global advocate of open government, and indexer and publisher of former secrets.
There is a common misconception that trips up most developers using PostgreSQL:
tune VACUUM or just run VACUUM, and your database will stay healthy. Dead tuples will
get cleaned up. Transaction IDs recycled. Space reclaimed. Your database will
live happily ever after.
But there are a couple of dirty "secrets" people are not aware of. The first of them:
VACUUM is lying to you about your indexes.
When you delete a row in PostgreSQL, it is just marked as a 'dead tuple':
invisible to new transactions, but still physically present. Only once every
transaction that could still see the row has finished can VACUUM come along and
actually remove it, reclaiming the space in the heap (the table's storage).
To understand why this matters differently for tables versus indexes, you need
to picture how PostgreSQL actually stores your data.
Your table data lives in the heap - a collection of 8 KB pages where rows are
stored wherever they fit. There's no inherent order. When you INSERT a row,
PostgreSQL finds a page with enough free space and slots the row in. Delete a
row, and there's a gap. Insert another, and it might fill that gap - or it
might land somewhere else entirely.
This is why SELECT * FROM users without an ORDER BY can return rows in insertion
order at first, then in seemingly random order after some updates - and that
order can keep changing over time. The heap is like Tetris: rows drop into
whatever space is available, leaving gaps when they are deleted.
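You can see this for yourself with a throwaway table (a minimal sketch; the table name is invented here, and ctid - the row's current physical location - is an implementation detail that can change at any time):
CREATE TABLE tetris_demo (id integer, v text);
INSERT INTO tetris_demo SELECT g, 'x' FROM generate_series(1, 3) g;
SELECT ctid, id FROM tetris_demo;              -- (0,1), (0,2), (0,3): insertion order
UPDATE tetris_demo SET v = 'y' WHERE id = 2;   -- the new row version lands wherever there is free space
SELECT ctid, id FROM tetris_demo;              -- id = 2 now typically shows up last, at a new ctid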
When VACUUM runs, it removes those dead tuples and compacts the remaining
rows within each page. If an entire page becomes empty, PostgreSQL can reclaim
it entirely.
And while indexes are, on the surface, the same collection of 8 KB pages, they are
different. A B-tree index must maintain sorted order - that's the whole point of
its existence and the reason why WHERE id = 12345 is so fast: PostgreSQL can
binary-search down the tree instead of scanning every possible row. You can learn
more about the fundamentals of B-Tree Indexes and what makes them fast.
But if the design of indexes is what makes them fast, it's also their biggest
constraint. While PostgreSQL can drop heap rows into whatever space is available,
it cannot freely shuffle index entries between pages to pack them as tightly as possible.
When VACUUM processes the heap, it can compact rows within a page and reclaim
empty pages - the heap has no ordering constraint, so rows can live anywhere. But
B-tree pages? They're locked into a structure. VACUUM can remove dead index
entries from them, yes.
Many developers assume VACUUM treats all pages the same, whether they are heap or
index pages. VACUUM is supposed to remove the dead entries, right?
Yes. But here's what it doesn't do: it doesn't restructure the B-tree.
What VACUUM actually does:
Removes dead tuple pointers from index pages
Marks completely empty pages as reusable
Updates the free space map
What VACUUM cannot do:
Merge sparse pages together (it can do that only for completely empty pages)
Reduce tree depth
Deallocate empty-but-still-linked pages
Change the physical structure of the B-tree
Your heap is Tetris, gaps can get filled. Your B-tree is a sorted bookshelf.
VACUUM can pull books out, but can't slide the remaining ones together. You're
left walking past empty slots every time you scan.
Let's get hands-on and create a table, fill it, delete most of it and watch what happens.
CREATE EXTENSION IF NOT EXISTS pgstattuple;
CREATE TABLE demo (id integer PRIMARY KEY, data text);
-- insert 100,000 rows
INSERT INTO demo (id, data)
SELECT g, 'Row number ' || g || ' with some extra data'
FROM generate_series(1, 100000) g;
ANALYZE demo;
At this point, our index is healthy. Let's capture the baseline:
SELECT
relname,
pg_size_pretty(pg_relation_size(oid)) as file_size,
pg_size_pretty((pgstattuple(oid)).tuple_len) as actual_data
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
Now remove some data, 80% to be precise - somewhere in the middle:
DELETE FROM demo WHERE id BETWEEN 10001 AND 90000;
The goal is to simulate a common real-world pattern: data retention policies,
bulk cleanup operations, or the aftermath of a data migration gone wrong.
VACUUM demo;
SELECT
relname,
pg_size_pretty(pg_relation_size(oid)) as file_size,
pg_size_pretty((pgstattuple(oid)).tuple_len) as actual_data
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
The table shrank significantly, while the index remained unchanged. You now have
20,000 rows indexed by a structure built to handle 100,000. Notice also that
file_size remains unchanged: VACUUM doesn't return space to the OS, it only
marks pages as reusable within PostgreSQL.
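If you want to see the reusable space VACUUM has recorded, the pg_freespacemap extension can read the free space map directly (a small sketch; the exact numbers will vary on your system):
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;
-- free space inside the table's existing pages that future inserts can reuse
SELECT count(*) AS pages, pg_size_pretty(sum(avail)::bigint) AS reusable_space
FROM pg_freespace('demo');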
This experiment is admittedly an extreme case, but it demonstrates the problem. Index leaf pages generally end up in one of four states:
Full page (>80% density): the page contains many index entries and uses its space
efficiently. Each 8 KB page read returns substantial useful data. This is the
optimal state.
Partial page (40-80% density): some wasted space, but still reasonably efficient.
Common at tree edges or after light churn. Nothing to worry about.
Sparse page (<40% density): mostly empty. You're reading an 8 KB page to find a
handful of entries. The I/O cost is the same as for a full page, but you get far
less value.
Empty page (0% density): zero live entries, but the page still exists in the tree
structure. Pure overhead. You might read this page during a range scan and find
absolutely nothing useful.
You might be wondering: how can fillfactor help with this? It's a setting you can
apply to both heap and index leaf pages, and it controls how full PostgreSQL packs
pages when storing data. The default value for B-tree indexes is 90%, which
leaves 10% of free space on each leaf page for future insertions.
CREATE INDEX demo_index ON demo(id) WITH (fillfactor = 70);
A lower fillfactor (like 70%) leaves more room, which can reduce page splits
when you're inserting into the middle of an index - useful for tables with random
inserts into the index column, or with heavily updated index columns.
But if you followed the anatomy-of-storage section above carefully, you'll see it
doesn't help with the bloat problem. Quite the opposite: if you set a lower
fillfactor and then delete the majority of your rows, you start out with more
pages and a bigger chance of ending up with sparse pages instead of partial ones.
Leaf-page fillfactor is about optimizing for inserts and updates. It's not a
solution for bloat caused by deletions or index-column updates.
PostgreSQL's query planner estimates costs based on physical
statistics, including the number of pages in an index.
EXPLAIN ANALYZE SELECT * FROM demo WHERE id BETWEEN 10001 AND 90000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Index Scan using demo_pkey on demo (cost=0.29..29.29 rows=200 width=41) (actual time=0.111..0.112 rows=0 loops=1)
Index Cond: ((id >= 10001) AND (id <= 90000))
Planning Time: 1.701 ms
Execution Time: 0.240 ms
(4 rows)
While the execution is almost instant, you need to look behind the scenes. The
planner estimated 200 rows and got zero. It traversed the B-tree structure
expecting data that doesn't exist. On a single query with warm cache, this is
trivial. Under production load with thousands of queries and cold pages,
you're paying I/O cost for nothing. Again and again.
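To see that cost directly, the BUFFERS option of EXPLAIN reports how many 8 KB pages were touched just to return zero rows (a sketch; your buffer counts will differ):
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM demo WHERE id BETWEEN 10001 AND 90000;
-- look for the "Buffers: shared hit=... read=..." line under the Index Scan node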
If you dig further, you discover a much bigger problem.
SELECT relname, reltuples::bigint as row_estimate, relpages as page_estimate
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
The
relpages
value comes from the physical file size divided by the 8 KB page
size. PostgreSQL updates it during VACUUM and ANALYZE, but it reflects the
actual file on disk - not how much useful data is inside. Our index file is still
2.2 MB (276 pages × 8 KB), even though most pages are empty.
The planner sees 276 pages for 20,000 rows and calculates a very low
rows-per-page ratio. This is where the planner can come to a conclusion: this
index is very sparse - let's do a sequential scan instead. Oops.
"But wait," you say, "doesn't ANALYZE fix statistics?"
Yes and no. ANALYZE updates the row count estimate: it will no longer think you
have 100,000 rows, but 20,000. It does not shrink relpages, though, because that
reflects the physical file size on disk - ANALYZE can't change that.
The planner now has accurate row estimates but wildly inaccurate page estimates.
The useful data would fit into roughly 57 pages worth of entries, but the planner
doesn't know that.
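The pgstattuple extension we installed earlier also provides pgstatindex(), which reports the internal shape of a B-tree; for our index, a query along these lines shows the leaf density alongside the deleted page count:
SELECT index_size, tree_level, leaf_pages, deleted_pages, avg_leaf_density
FROM pgstatindex('demo_pkey');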
Wait, what? The avg_leaf_density is 86%, which looks perfectly healthy. That's a
trap. Because the index is hollow (we removed 80% of the rows right in the middle),
we have 57 well-packed leaf pages, but the index still contains 217 deleted pages.
This is why
avg_leaf_density
alone is misleading. The density of used pages
looks great, but 79% of your index file is dead weight.
The simplest way to spot index bloat is comparing actual size to expected size.
SELECT
c.relname as index_name,
pg_size_pretty(pg_relation_size(c.oid)) as actual_size,
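-- expected_size assumes roughly 40 bytes per index entry, a crude heuristic; adjust for wider keys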
pg_size_pretty((c.reltuples * 40)::bigint) as expected_size,
round((pg_relation_size(c.oid) / nullif(c.reltuples * 40, 0))::numeric, 1) as bloat_ratio
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relkind = 'i'
AND c.reltuples > 0
AND c.relname NOT LIKE 'pg_%'
AND pg_relation_size(c.oid) > 1024 * 1024 -- only indexes > 1 MB
ORDER BY bloat_ratio DESC NULLS LAST;
A
bloat_ratio
of 2.8 means the index is nearly 3x larger than expected. Anything
above 1.8 - 2.0 deserves investigation.
We filter to indexes over 1 MB - bloat on tiny indexes doesn't matter much.
Adjust the threshold to your environment; for large databases, you might only
care about indexes over 100 MB.
But here comes a BIG WARNING: the pgstatindex() we used earlier physically reads
the entire index. On a 10 GB index, that's 10 GB of I/O. Don't run it against
all indexes on a production server - unless you know what you are doing!
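The actual fix is to rebuild the bloated index; for our demo that is presumably the standard command (shown as a sketch):
REINDEX INDEX CONCURRENTLY demo_pkey;
Checking the sizes again afterwards: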
SELECT
relname,
pg_size_pretty(pg_relation_size(oid)) as file_size,
pg_size_pretty((pgstattuple(oid)).tuple_len) as actual_data
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
Our index shrank from 2.2 MB to 456 KB - a 79% reduction (not a big surprise,
though).
As you might have noticed, we used CONCURRENTLY to avoid taking an ACCESS
EXCLUSIVE lock. This has been available since PostgreSQL 12, and while there's an
option to omit it, pretty much the only reason to do so is during planned
maintenance, to speed up the index rebuild.
If you look above at the file_size of our relations, we have managed to reclaim
the disk space for the affected index (it was a REINDEX, after all), but the
table's space was not returned to the operating system.
That's where
pg_squeeze
shines. Unlike trigger-based alternatives, pg_squeeze uses logical decoding,
resulting in lower impact on your running system. It rebuilds both the table and
all its indexes online, with minimal locking:
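As a rough sketch only: pg_squeeze requires wal_level = logical and its worker in shared_preload_libraries, and the exact function signature varies between versions, so treat the call below as an assumption and check the extension's README.
CREATE EXTENSION IF NOT EXISTS pg_squeeze;
-- ad-hoc, one-off rebuild of a single table and its indexes
SELECT squeeze.squeeze_table('public', 'demo');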
The exclusive lock is only needed during the final swap phase, and its duration
can be configured. Even better, pg_squeeze is designed for regular automated
processing - you can register tables and let it handle maintenance whenever bloat
thresholds are met.
pg_squeeze makes sense when both table and indexes are bloated, or when you want
automated management. REINDEX CONCURRENTLY is simpler when only indexes need
work.
VACUUM FULL rewrites the entire table and all its indexes. While it fixes
everything, it comes with a big "but": it requires an ACCESS EXCLUSIVE
lock, completely blocking all reads and writes for the entire duration. For a
large table, this could mean hours of downtime.
Generally avoid this in production
. Use pg_squeeze instead for the same
result without the downtime.
Before you now go and
REINDEX
everything in sight, let's talk about when index
bloat actually matters.
B-trees expand and contract with your data. With random insertions into index
columns - UUIDs, hash keys, etc. - page splits happen constantly. Index density
takes an occasional hit and tends to settle around 70-80% over the natural cycles
of your system's usage. That's not bloat. That's the tree finding its natural
shape for your data.
The bloat we demonstrated - 57 useful pages drowning in 217 deleted ones - is
extreme. It came from deleting 80% of contiguous data. You won't see this
from normal day to day operations.
When do you need to act immediately:
after a massive DELETE (retention policy, GDPR purge, failed migration cleanup)
bloat_ratio
exceeds 2.0 and keeps climbing
query plans suddenly prefer sequential scans on indexed columns
index size is wildly disproportionate to row count
But in most cases you don't have to panic. Monitor weekly, and when an index's
bloat ratio continuously grows above warning levels, schedule a REINDEX CONCURRENTLY
during a low-traffic period.
Index bloat isn't an emergency until it is. Know the signs, have the tools
ready, and don't let VACUUM's silence fool you into thinking everything's fine.
VACUUM is essential for PostgreSQL. Run it. Let autovacuum do its job. But
understand its limitations: it cleans up dead tuples, not index structure.
The truth about PostgreSQL maintenance is that VACUUM handles heap bloat
reasonably well, but index bloat requires explicit intervention. Know when your
indexes are actually sick versus just breathing normally - and when to reach for
REINDEX.
VACUUM handles heap bloat. Index bloat is your problem. Know the difference.
Junichi Uekawa: I was wondering if there was some debian thread and noticed maybe something is broken in my mail setup.
PlanetDebian
www.netfort.gr.jp
2025-12-13 01:41:02
I was wondering if there was some debian thread and noticed maybe something is broken in my mail setup. The amount of emails I am receiving seems to be very small.
In Cover-Up, Laura Poitras Investigates Seymour Hersh
Portside
portside.org
2025-12-13 01:33:05
Laura Poitras, the journalist and documentary filmmaker, has the rare distinction of having won a Pulitzer Prize (for her reporting on the National Security Agency and Edward Snowden), an Academy Award (for
Citizenfour
, about Snowden), and the Venice Film Festival’s Golden Lion (for
All the Beauty and the Bloodshed
, about the artist and activist Nan Goldin and the opioid crisis). Her new film,
Cover-Up
, examines the career of Seymour M. Hersh—known for his investigative journalism on My Lai and Abu Ghraib, as well as other, more contested scoops—using archival material and exclusive access to Hersh’s notes to probe how accountability journalism is made.
Cover-Up
, which opens Friday in theaters in select cities ahead of its Netflix premiere on December 26, returns Poitras to the familiar intersection of government secrecy, source protection, and the journalistic imperative to reveal what powerful institutions want hidden. Codirected with Mark Obenhaus, the film is a political thriller that offers a candid and illuminating look at Hersh’s career and methods, what makes him tick, and why investigative journalism matters today. Our conversation, which took place at the New York office of Poitras’s production company Praxis Films, has been edited for length and clarity.
AJG: Let me start by observing that
Cover-Up
is a valuable addition to the small canon of journalism movies. Yet I see it as more closely in conversation with fiction films than with other documentaries.
LP: It’s interesting that you reference that, because we definitely were talking about fiction films when we began—we being myself, Amy Foote, and Peter Bowman, the editors—in terms of how we were going to approach its feel and aesthetics. And the films that we were referring to were the 1970s paranoia thrillers, brilliant films that were very skeptical and critical of the state and state power. Alan J. Pakula was at the top of the list.
All the President’s Men
and
The
Parallax View
were the ones that we constantly referred to. I make nonfiction, but I’m often making films that deal with threats and dangers of the state. So it’s not like I’m stealing from the genre of fiction, but rather, I think fiction steals from real life.
This is a long-gestating project. You’ve said that you first thought about training your camera on Seymour Hersh after reading his Abu Ghraib coverage in 2004. Tell me more about that.
Twenty years ago, when I was preparing to go to Iraq, where I spent eight months documenting the US occupation and the war, I felt very strongly that we were living in a landscape where legacy journalism was failing the public in terms of its coverage of the lead-up to this war, its coverage of the Bush era, the “war on terror,” and how it was reporting on Guantánamo Bay prison and torture. All this nightmarish stuff was happening that we knew was happening, and by and large, legacy media was copying the government’s press releases, even to the point of respected news organizations having editorial guidelines not to use the word “torture” to describe the CIA’s torture. It was kind of staggering, and Sy was doing something very different. He was reporting at
The New Yorker
and asking: What is actually going on? Why are we going to Iraq? And he was saying there was no connection between the 9/11 attacks and Al Qaeda and Iraq, but he
was
drawing the connection between Dick Cheney and Halliburton and all the money that was there and using the sort of emergency laws that emerged after 9/11 to get in all these policies that the right had been dreaming of for years.
The Abu Ghraib story broke in April 2004, and I traveled to Iraq about a month later. I had already made the decision to go, but then when I saw those photographs, it was just a level of horror that I could not imagine existed. The film that I ended up making there,
My Country, My Country,
sort of began at Abu Ghraib because I managed to talk my way into the prison that summer when Iraqis were inspecting it because it was such an international scandal.
When I came back, I reached out to Sy and we met. I sort of laugh about entering his office. There should have been
Twilight Zone
music playing, because it really was like going back in time, with all the yellow notepads that you see in the film. It was like time had stopped in the 1970s in that office.
What was your idea for
Cover-Up
in 2005?
Back then, I was proposing to make a film that would follow him in real time, so more observational: Sy meeting sources or in editorial meetings at
The
New Yorker,
throwing things at the editors and threatening to quit, which Amy Davidson Sorkin joked he would do on a regular basis.
He entertained the idea, but after I left, he called me, and he was like:
No way. I can’t risk my sources.
You know:
My sources are too sensitive, and there’s no way a camera can be around.
So it was a hard no, but a very gracious hard no. But we stayed in touch.
And nearly twenty years later, Sy tells you he’s ready for his close-up. Why do you think that was?
He certainly was aware of the reporting I did with Edward Snowden. I think he felt that I was doing something that he felt some kind of kinship to in terms of being a bit of an outsider, a bit of a thorn in the side of the government.
I know that he and his wife, Liz, saw
All the Beauty and the Bloodshed
. I think what maybe resonated with them—without speaking for them, which makes me a little bit nervous—is that it’s a portrait of Nan but also a larger critique of social structures and systems. And this is what I’m trying to do in all my films. I’m not interested in making biopics, but I do tell stories about individuals who are confronting power structures.
Did you go into the film with a clear sense of the story you wanted to tell?
From the very beginning, we were interested in Hersh’s reporting, but also in the patterns we could see across half a century, and particularly around atrocities, cover-ups, impunity, and the role of investigative journalism in circumventing that cycle.
I think one of the reasons why we made this film was because we felt there’s a crisis in investigative journalism, because it’s hard, it’s costly, often comes with legal threats, and often takes a lot of time, and that’s harder to do if you don’t have the architecture to support it.
Did Sy have veto power?
Sy didn’t ask for editorial control, but we did update him to make sure that we weren’t missing things. But he wasn’t difficult. Once he had finally relented to be part of this project, he was just all in.
How did you decide when to get personal?
I was guided by a few things. One was what informed his reporting and what was his motivation, and that’s what we definitely felt about his growing up in Chicago, with his parents being immigrants coming from Eastern Europe, the silence in the house, his dad dies young, and he’s asked to take over the family’s dry-cleaning shop, and nobody was giving him opportunities. And then he sort of stumbled into journalism and found his passion and his love for truth-telling. And so of course, that needed to be in the film. But as I mentioned, I’m definitely very anti-biopic, and I’m also very anti-experts. The people we talked to had to have direct knowledge of the reporting, like Amy Davidson Sorkin, who was his editor at
The
New Yorker
on Abu Ghraib.
There’s a lot of talk these days about bias in the media and how journalists need to be neutral. Can you do investigative journalism from a position of neutrality?
I absolutely believe in certain principles in journalism, like that it should be fact-driven and it should be interrogating power, disclosing any conflicts of interest, if they exist. But I also think that we need to use words to describe what’s happening, and so going back to the Bush era, when news organizations didn’t use the word torture to describe torture, that is lying. It’s not a neutral position, it’s a position that is aligning with the nation-state and asking the press to capitulate to whatever that agenda is. And I think that that’s really dangerous, because you lose trust. So if we’re talking about what’s happening in Gaza, I think we have to use the word “genocide,” because if you look at the evidence of what’s there, that’s what we’re seeing. I don’t think that that’s biased. That is looking at two years of dropping American-taxpayer-funded bombs on a population.
What’s interesting about Sy’s body of work and his career is that his big stories are evidentiary. They present evidence that shows atrocities. We’re talking about the My Lai massacre or Abu Ghraib torture, CIA surveillance on protest movements or involvement in Chile and coups all over the world. In his best stories, he delivers the facts. But he’s never been quiet about his worldview and saying that he was against the Vietnam War and that it was a catastrophe.
I’m sure some people could watch
Cover-Up
and say, clearly, you’re not neutral. While not a hagiography, the film clearly celebrates Hersh’s achievements. I can imagine someone else doing a film about Sy where he’s this muckraker who’s out to make America look bad.
One of the things that speaks most highly about Sy’s body of work is that regardless of what administration has been in power, he’s gotten under their skin. He went after JFK, he went after Johnson, he went after Nixon, he went after Reagan, Carter, Obama, Biden, and now Trump. I believe in that kind of equal-opportunity adversarial journalism.
And about the hagiography thing you raised: it was important in the film to also include times when Sy got it wrong. Because we always felt like it was our job to talk about the times when he got it wrong or got played or got too close to power. And those mistakes happen in the field of journalism. Probably Mark and I had more of an obligation than most to ask about some of the stories where he made mistakes, because we knew him well and felt close to his body of work.
Was Sy reluctant to discuss his mistakes?
He didn’t exactly welcome it, but he was fine. I mean, ultimately, he would have had zero respect for us if we didn’t, which doesn’t mean that those were his favorite days.
Do you think it’s become more difficult to get the truth out in this age of extreme polarization? Part of me even wonders whether a My Lai–style report published today would have the sort of impact it did fifty-five years ago.
I refuse the notion that we’re in a post-fact world. I believe that people are very aware of if they can’t pay rent or afford healthcare or education for their kids. Those are facts that people understand. Yes, some trust has been eroded. And I think it’s been eroded by the public being lied to by our governments and by the press sometimes. But I’m not willing to concede that we shouldn’t care about what’s happening in the world, or that people don’t care. I mean, journalists in Gaza are dying every day to get out information about what’s happening. I think they are reaching the public. Whether or not they’re actually causing governments to change is the real problem.
Do you consider Cover-Up a hopeful film?
I don’t know if “hope” is the right word. You know, all of my films have protagonists that are really getting under the skin of power, whether that’s government or corporate power. And that offers the idea that it’s possible that an individual or small group of people can change how we understand the world. That’s a powerful message when people are feeling a lot of despair.
[$] The state of the kernel Rust experiment
Linux Weekly News
lwn.net
2025-12-13 01:19:08
The ability to write kernel code in Rust was explicitly added as an
experiment — if things did not go well, Rust would be removed again. At
the 2025 Maintainers Summit, a session was held to evaluate the state of
that experiment, and to decide whether the time had come to declare the
result to be a...
In 2019, hunters Brad Cape and Phil Yeomans were scouting for elk in southeast Wyoming when they came across a rocky peak that seemed perfect for elk hunting, a suspicion only heightened by its name: Elk Mountain. But finding a way onto Elk Mountain would turn out to be extremely difficult, and whether Brad and Phil succeeded would have lasting consequences for the future of land use everywhere in the U.S. because the single largest obstacle preventing the hunters from making it onto the mountain wasn’t the elevation or the topography. It was that the mountain was on a special type of land known as “the checkerboard”.
The checkerboard is a pattern of land ownership, unique to the American West, found in huge areas from New Mexico all the way up to Washington. On a map, these particular areas resemble a checkerboard, but instead of alternating black and white squares, checkerboarded land alternates between single square-mile parcels of public land and square-mile parcels of private land.
In Railroaded: The Transcontinentals and the Making of Modern America, the Stanford historian Richard White explains that the checkerboard was created at the tail-end of the Civil War, when the U.S. government gave the railroad companies long corridors of land—up to eighty miles wide—on which to build new rail lines and encourage westward migration.
But almost all of this land was given away in alternating, one-square-mile sections. This checkerboard pattern allowed the government to keep all the undeveloped sections in between and wait for them to go up in value before turning around and selling them to developers. Most checkerboarded land today, regardless of who owns the private squares now, is descended from these initial railroad grants.
But the checkerboard would pose a problem for Brad and Phil. You can’t pass through private property without the landowner’s permission, so the public squares in the checkerboard are often very difficult to access, and Elk Mountain was no different. The private half of the checkerboard belonged to a ranch, and the ranch’s owner, a billionaire pharmaceutical executive, wasn’t allowing strangers to cross his land. So when they came back to hunt in the area in 2020, Brad and Phil and some other hunting buddies decided to try something called corner crossing.
To understand corner crossing, think about a literal checkerboard. In a checkers game, a piece that starts on black needs to stay on black. So the pieces only ever make diagonal movements, crossing from the corner of one square to another.
Moving through checkerboarded land works in the same way. To avoid the ranch’s property, all Brad and Phil and the others had to do was move around like a checkers piece. They’d start on public land and then make sure to stay on public land, by crossing into new squares diagonally, at the corners where all those public squares touch.
The hunters hiked from a public road towards the checkerboard’s nearest approachable corner, where they found two no-trespassing signs, along with a couple of posts with a chain strung between them, obstructing the one spot where they could legally cross. So they grabbed hold of the top of the posts and swung their feet around, making absolutely sure they didn’t touch private property.
From that point on, they stayed entirely on public land inside the checkerboard, corner crossing from one public square to another as they hunted for elk on Elk Mountain.
But in the middle of their hunt, a manager for the ranch approached them and insisted that touching the ranch’s posts counted as trespassing. So when they came back to hunt Elk Mountain the next year, Brad brought a ladder that unfolded to a specific height, length and width, allowing the hunters to go right over the t-posts and across the corner, all without ever touching the ranch’s property.
But this didn’t placate the ranch’s owner. He had the ranch’s manager keep contacting the authorities until eventually the county attorney charged the hunters with criminal trespass.
The chance of jail time was slim, so the hunters could have ended things there by paying a small fine and promising to stay away from Elk Mountain and go hunt elk somewhere else. But the hunters believed the public should have the right to access public land—including in the checkerboard. So instead of paying the fine, the hunters decided to fight the case.
The resulting five-year legal battle, which grew to include two criminal charges and a multimillion-dollar civil case, revolved around the central question of whether corner crossing is or should be legal, and with it, effectively who really controlled millions of acres of public land. Along the way, the stakes attracted private landowners, public land users, lobbying groups on both sides of the divide, and the national media. Eventually the case landed before the U.S. Tenth Circuit Court of Appeals.
The court ruled in favor of the hunters, saying that the public was owed its half of the deal that the government had struck with the railroads a century and a half earlier.
The Tenth Circuit’s decision won’t bring total closure: it affects only six western states, and the U.S. Supreme Court refused to take up the case, which means that, for now, the status of corner crossing and public land access in the other 44 states remains murky. It’s unlikely Brad and Phil will be involved in whatever comes next. One thing is for sure, though: they’re eager to go back to hunt Elk Mountain.
OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI
Simon Willison
simonwillison.net
2025-12-12 23:29:51
One of the things that most excited me about Anthropic's new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just a folder with a Markdown file and some optional extra resources and scripts, so any LLM tool with the ability to navigate and read fro...
One of the things that most excited me about Anthropic’s new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just a folder with a Markdown file and some optional extra resources and scripts, so any LLM tool with the ability to navigate and read from a filesystem should be capable of using them. It turns out OpenAI are doing exactly that, with skills support quietly showing up in both their Codex CLI tool and now also in ChatGPT itself.
Skills in ChatGPT
I learned about this from Elias Judin this morning. It turns out the Code Interpreter feature of ChatGPT now has a new /home/oai/skills folder which you can access simply by prompting:
So far they cover spreadsheets, docx and PDFs. Interestingly their chosen approach for PDFs and documents is to convert them to rendered per-page PNGs and then pass those through their vision-enabled GPT models, presumably to maintain information from layout and graphics that would be lost if they just ran text extraction.
The reason it took so long is that it was fastidious about looking at and tweaking its own work. I appreciated that at one point it tried rendering the PDF and noticed that the macrons in kākāpō were not supported by the chosen font, so it switched to something else:
You have to run Codex with the --enable skills option. I ran this:
cd /tmp
mkdir datasette-cowsay
cd datasette-cowsay
codex --enable skills -m gpt-5.2
Then prompted:
list skills
And Codex replied:
- datasette-plugins — Writing Datasette plugins using Python + pluggy (file: /Users/simon/.codex/skills/datasette-plugin/SKILL.md)
- Discovery — How to find/identify available skills (no SKILL.md path provided in the list)
Then I said:
Write a Datasette plugin in this folder adding a /-/cowsay?text=hello page that displays a pre with cowsay from PyPI saying that text
When I first wrote about skills in October I said Claude Skills are awesome, maybe a bigger deal than MCP. The fact that it’s just turned December and OpenAI have already leaned into them in a big way reinforces to me that I called that one correctly.
Skills are based on a very light specification, if you could even call it that, but I still think it would be good for these to be formally documented somewhere. This could be a good initiative for the new Agentic AI Foundation (previously) to take on.
Apple fixes two zero-day flaws exploited in 'sophisticated' attacks
Bleeping Computer
www.bleepingcomputer.com
2025-12-12 23:23:25
Apple has released emergency updates to patch two zero-day vulnerabilities that were exploited in an "extremely sophisticated attack" targeting specific individuals. [...]...
Apple has released emergency updates to patch two zero-day vulnerabilities that were exploited in an “extremely sophisticated attack” targeting specific individuals.
The zero-days are tracked as CVE-2025-43529 and CVE-2025-14174 and were both issued in response to the same reported exploitation.
"Apple is aware of a report that this issue may have been exploited in an extremely sophisticated attack against specific targeted individuals on versions of iOS before iOS 26," reads
Apple's security bulletin
.
CVE-2025-43529 is a WebKit use-after-free remote code execution flaw that can be exploited by processing maliciously crafted web content. Apple says the flaw was discovered by Google’s Threat Analysis Group.
CVE-2025-14174 is a WebKit flaw that could lead to memory corruption. Apple says the flaw was discovered by both Apple and Google’s Threat Analysis Group.
Devices impacted by both flaws include:
iPhone 11 and later
iPad Pro 12.9-inch (3rd generation and later)
iPad Pro 11-inch (1st generation and later)
iPad Air (3rd generation and later)
iPad (8th generation and later)
iPad mini (5th generation and later)
On Wednesday, Google fixed a mysterious zero-day flaw in Google Chrome, initially labeling it as “[N/A][466192044] High: Under coordination.” However, Google has now updated the advisory to identify the bug as “CVE-2025-14174: Out-of-bounds memory access in ANGLE,” which is the same CVE fixed by Apple, indicating coordinated disclosure between the two companies.
Apple has not disclosed technical details about the attacks beyond saying they targeted individuals running versions of iOS before iOS 26.
As both flaws affect WebKit, which Google Chrome uses on iOS, the activity is consistent with highly targeted spyware attacks.
While these flaws were only exploited in targeted attacks, users are strongly advised to install the latest security updates promptly to reduce the risk of ongoing exploitation.
In September, Apple also backported a fix for a zero-day tracked as CVE-2025-43300 to older devices running iOS 15.8.5 / 16.7.12 and iPadOS 15.8.5 / 16.7.12.
Apple at the AWS re:Invent 2025 Keynote
Daring Fireball
www.youtube.com
2025-12-12 23:22:39
Six-minute clip from Amazon’s AWS re:Invent keynote last week:
Payam Mirrashidi, VP, Cloud Systems & Platforms, Apple, explains
how AWS Graviton helps improve developer velocity at scale. Hear
Swift’s journey from the premier programming language for the
Apple ecosystem to adoption by millions of...
The Raise the Age Law Is Not Actually Turning NYC Into a Wild Teen Hellhole, According to City Data
hellgate
hellgatenyc.com
2025-12-12 22:36:01
The next front in the war to roll back criminal justice reform was supposed to be juveniles. With a new mayor and a new study casting doubt on the premise, is it still?...
This week, the Mayor's Office of Criminal Justice quietly dropped a new report indicating, once again, that the 2018 Raise the Age Law that moved 16- and 17-year-olds out of adult court and increased the age of criminal responsibility in New York state to 18 is not creating a "consequence-free" youth crime wave, as NYPD Commissioner Jessica Tisch has previously said.
According to the report, in 2024, the youth share of citywide felony and violent felony arrests was the same as it was in 2018, and recidivism was stable or decreasing in most categories. "In short, adults, not teens, have disproportionately contributed to the post‑2018 rise in felony arrests," the report says.
The exception is gun arrests for those under the age of 18, which increased by 136 percent since 2018, to 486 arrests in 2024. "While comparable figures for adults are not available, the increase in youth-specific gun incidents suggests a rising exposure to firearms for this group," the report adds.
RFC9460, defining SVCB and HTTPS Resource Records, was published in November of 2023. Two years later, however, support for these DNS records is still far from universal. Add to that the fact that the RFC defines a number of SvcParamKeys, which browsers support to different degrees and where developers disagree about the proper behavior, and you end up with no clear picture of which browsers support these records to which end.
Unfortunately even the otherwise ever so useful https://caniuse.com/ does not provide that information, although there's a feature request. In order to quickly be able to answer the question regarding the core features, I ran a few tests to observe to what extent the three most popular browsers support these records. (Jump to the table below if all you care about is that list.)
AliasMode / TargetName
Support for this mode is important for anybody looking to implement aliasing of apex domains. In its simplest form, it looks like this:
$ host -t https https.dotwtf.wtf
https.dotwtf.wtf has HTTP service bindings 0 www.dotwtf.wtf.
$ host alias.https.dotwtf.wtf
alias.https.dotwtf.wtf has HTTP service bindings 0 www.dotwtf.wtf.
$
The first is an apex alias, the second a simple AliasMode non-apex record. Neither name has either an A or an AAAA record. The expected behavior here is that the browser follows the TargetName and connects to www.dotwtf.wtf, with an SNI of https.dotwtf.wtf or alias.https.dotwtf.wtf respectively.
ALPN
The Application-Layer Protocol Negotiation (ALPN) parameter allows clients to immediately connect to the destination server using the right protocol and avoid additional round-trips. See this explanation for more details.
$ host alpn-h3.https.dotwtf.wtf
alpn-h3.https.dotwtf.wtf has address 166.84.7.99
alpn-h3.https.dotwtf.wtf has IPv6 address 2602:f977:800:0:e276:63ff:fe72:3900
alpn-h3.https.dotwtf.wtf has HTTP service bindings 1 . alpn="h3,h2"
$
The expected behavior here is that the client will
immediately make an H3 (i.e., QUIC) connection.
ECH
This parameter is used for Encrypted Client Hello, providing the encryption public key and associated metadata needed by the client to construct the ClientHelloOuter.
The RFC defines ipv4hint and ipv6hint parameters, but their usefulness remains opaque to me. A client MAY use the hints, but still has to perform the lookup and then enter the results in the cache. That is, in effect the only time hints are used is to cut down the time to first byte, but even that is a "MAY", not even a "SHOULD".
This also leads to some confusion amongst implementers and users when the service name has no A / AAAA records.
The expectation here is that the client will use the hints to connect to the service name, although there appears to be disagreement on whether a service name has to have IP addresses outside of the hints.
There are a million other scenarios where your authority endpoint might have a different set of IPs, how to handle cache expiration, how to handle CNAMEs, conflicts if the authority endpoint itself has a different HTTPS record with different IP hints, and so on and so on.
I didn't check all permutations here, but I did
check which IPs the browsers will use if they get
conflicting results back:
$ for r in A AAAA HTTPS; do
> dig +short $r wrong-iphints.https.dotwtf.wtf
> done
198.51.100.188
2001:db8::8c93:2c23:262f:6ffb
1 . ipv4hint=127.0.0.1 ipv6hint=2001:db8::1
$
The expectation here is that the client will make a
TLS connection to port 4343. (Note: even if we
specified port 80 here, the client should still make a
TLS connection.)
uvm32 is a minimalist, dependency-free virtual machine sandbox designed for microcontrollers and other resource-constrained devices. Single C file, no dynamic memory allocations, asynchronous design, pure C99.
On an STM32L0 (ARM Cortex-M0+) the required footprint is under 4KB flash/1KB RAM.
uvm32 is a RISC-V emulator, wrapped in a management interface and provided with tools to build efficient code to run in it.
apps/selfhost-mini with embedded mandelbrot generation program, compiled as an app (VM running VM)
Quickstart (docker)
The code in uvm32 to build a VM host is very portable and requires only a C compiler. However, many of the examples provided show how to build target code with different languages and tools. A Dockerfile is provided to set up the required environment.
make dockerbuild
make dockershell
Then, from inside the docker shell
make
./hosts/host/host apps/helloworld/helloworld.bin
host is the command line test VM for running samples. Run host -h for a full list of options.
This project is licensed under the MIT License. Feel free to use in research, products and embedded devices.
Friday Squid Blogging: Giant Squid Eating a Diamondback Squid
Schneier
www.schneier.com
2025-12-12 22:00:30
I have no context for this video—it’s from Reddit—but one of the commenters adds some context:
Hey everyone, squid biologist here! Wanted to add some stuff you might find interesting.
With so many people carrying around cameras, we’re getting more videos of giant squid at the...
I have no context for this video—it’s from Reddit—but one of the commenters adds some context:
Hey everyone, squid biologist here! Wanted to add some stuff you might find interesting.
With so many people carrying around cameras, we’re getting more videos of giant squid at the surface than in previous decades. We’re also starting to notice a pattern, that around this time of year (peaking in January) we see a bunch of giant squid around Japan. We don’t know why this is happening. Maybe they gather around there to mate or something? who knows! but since so many people have cameras, those one-off monster-story encounters are now caught on video, like this one (which, btw, rips. This squid looks so healthy, it’s awesome).
When we see big (giant or colossal) healthy squid like this, it’s often because a fisher caught something else (either another squid or sometimes an antarctic toothfish). The squid is attracted to whatever was caught and they hop on the hook and go along for the ride when the target species is reeled in. There are a few colossal squid sightings similar to this from the southern ocean (but fewer people are down there, so fewer cameras, fewer videos). On the original instagram video, a bunch of people are like “Put it back! Release him!” etc, but he’s just enjoying dinner (obviously as the squid swims away at the end).
As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.
GNU Unifont is part of the GNU Project.
This page contains the latest release of
GNU Unifont, with glyphs for every printable code point
in the Unicode Basic Multilingual Plane (BMP).
The BMP occupies the first 65,536 code points of the Unicode space,
denoted as U+0000..U+FFFF. There is also growing coverage of the
Supplementary Multilingual Plane (SMP), in the range U+010000..U+01FFFF,
and of Michael Everson's ConScript Unicode Registry (CSUR) with
Rebecca Bettencourt's Under-CSUR additions.
Commercial Use
A user has asked if GNU Unifont can be used with commercial
(non-free) software. The answer is yes. The GNU Font Embedding
Exception and the SIL OFL allow for that. See the next section
for details. The main purpose of the licensing is to require
derivative fonts that others create to be released to the public
under the same licensing terms, not to prohibit the use of those
fonts with certain software. Thus, preserving the license terms
in derivative fonts provides a public benefit. The licenses also
provide acknowledgement of previous Unifont contributors for their
volunteer work.
Copyright, Derivative Works, and License
Thousands of Unifont glyphs are creations of individual Unifont
contributors; those glyphs enjoy copyright protections of various
degrees. Some of those contributions are letter forms of established
alphabets while others are icon (symbol) designs such as the many
animal icons which, as artistic designs, have even stronger
international protections. See for example this memorandum of
applicable laws of Berne Union member country Germany (where
Unifont was created):
Unifont Copyright Protections
.
Derivative variants of Unifont are permitted under the terms of
the dual license: GNU GPLv2+ with the GNU Font Embedding Exception
and the SIL Open Font License version 1.1. These are free licenses.
The remainder of this section provides details.
These font files are licensed under the GNU General Public License,
either Version 2 or (at your option) a later version, with the exception
that embedding the font in a document does not in itself constitute a
violation of the GNU GPL. The full terms of the license are in
LICENSE.txt
.
As of Unifont version 13.0.04, the fonts are dual-licensed under
the SIL Open Font License (OFL) version 1.1 and the GNU GPL 2+ with
the GNU font embedding exception. The SIL OFL is available at
OFL-1.1.txt
.
Font Downloads
The standard font build
— with and without the
ConScript Unicode Registry (CSUR) / Under-CSUR Private Use Area
(PUA) glyphs. Download in your favorite format:
PSF:
A specialized PSF 1 console frame buffer font
consisting of 512 glyphs for use with APL,
A Programming
Language,
in console mode (single-user mode on GNU/Linux,
etc.), mainly to support GNU APL:
Unifont-APL8x16-17.0.03.psf.gz
(4 kbytes)
HEX:
All the Plane 0 glyphs in Roman's .hex format, for those who
wish to experiment:
unifont-17.0.03.hex.gz
(1 Mbyte)
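The .hex format is simple enough to experiment with by hand: each line is a hexadecimal code point, a colon, and the glyph bitmap as hex digits (32 digits for an 8x16 glyph, 64 for a 16x16 glyph), one row per two or four digits from top to bottom, with the most significant bit on the left. As a rough illustration (this is not part of Unifont's own tooling, and the sample bitmap below is made up rather than copied from the font), a few lines of C can dump a .hex entry as ASCII art:

/*
 * Minimal sketch: decode one line of the .hex format described above,
 * where a line is CODEPOINT:BITMAP and BITMAP is 32 hex digits for an
 * 8x16 glyph or 64 hex digits for a 16x16 glyph, one row per 2 or 4
 * digits, top to bottom, most significant bit leftmost.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void print_hex_glyph(const char *line)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return;

    unsigned long codepoint = strtoul(line, NULL, 16);
    const char *bitmap = colon + 1;
    size_t digits = strspn(bitmap, "0123456789abcdefABCDEF");
    int width = (digits == 64) ? 16 : 8;
    int digits_per_row = width / 4;       /* 4 hex digits or 2 per row */

    printf("U+%04lX\n", codepoint);
    for (int row = 0; row < 16; row++)
    {
        char cell[5] = {0};
        memcpy(cell, bitmap + row * digits_per_row, (size_t)digits_per_row);
        unsigned long bits = strtoul(cell, NULL, 16);
        for (int col = width - 1; col >= 0; col--)
            putchar((bits >> col) & 1 ? '#' : '.');
        putchar('\n');
    }
}

int main(void)
{
    /* U+0041 'A' -- an illustrative bitmap, not copied from Unifont. */
    print_hex_glyph("0041:0000000018242442427E424242420000");
    return 0;
}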
On Windows or Mac OS X, unzip the .ttf.zip file or download the
uncompressed .ttf file and copy the font to your Fonts folder.
On Microsoft Windows, this folder is located under the Windows
folder on your main disk.
On a Mac, this is located under the Library folder on your main disk.
For best appearance on a Mac in a Terminal window, select Terminal from
the menu, then Preferences. A Settings window will appear. Make sure
that you're on the Text tab in that window. Then make sure that the
"Antialias text" box is checked. The OpenType version of the font should
then look fine at point sizes of 12pt and larger. The font won't look
very legible in a Mac Terminal window unless you select this antialias option.
Note:
BDF, PCF, and OpenType files contain dimension and spacing
information for each glyph in a font. Some font rendering engines ignore
this glyph information that the font file provides. This is especially
true of rendering engines designed to handle monospace fonts.
Unifont will not display all glyphs correctly with such software.
The BDF font follows BDF version 2.1 (not version 2.2) because the X Window System standardized on version 2.1.
The PSF 1 version of Unifont
is
a monospace font but is
limited to 512 glyphs, and is only of use with font rendering engines
that support more than 256 glyphs in a console frame buffer font.
Unifont only stores one glyph per printable Unicode code point.
This means that complex scripts with special forms for letter
combinations including consonant combinations and floating vowel
marks such as with Indic scripts (Devanagari, Bengali, Tamil, etc.)
or letters that change shape depending upon their position in a word
(Indic and Arabic scripts) will not render well in Unifont.
In those cases, Unifont is only suitable as a font of last resort.
Users wishing to properly render such complex scripts should
use full OpenType fonts that faithfully display such alternate
forms.
Drawing New Glyphs
If you would like to contribute glyphs, please email unifoundry at
gmail in advance (not spelled out because of spammers). Several
contributors are working on new glyphs, and it would be unfortunate
to have multiple persons drawing the same glyphs.
Special Note: New Plane 2 and Plane 3 CJK Glyphs
The People's Republic of China (PRC) has a set of 15-by-16 pixel
Chinese glyphs for Unicode Plane 2 and Plane 3. However,
those glyphs are copyrighted and licensed for sale by the Government
of the PRC,
and thus they cannot be used in a free font.
If you happen to have any of those copyrighted 15-by-16 pixel glyphs,
please do not send them for inclusion. Unifont includes many glyphs
in this range, drawn by Chinese and Japanese volunteers.
More are planned for the future.
The theoretical maximum number of printable glyphs in the
Unicode Plane 0 range is 65,536 code points
minus the 2,048 surrogate pair code points,
minus the 6,400 Private Use Area code points,
minus the two noncharacters (U+FFFE and U+FFFF).
This amounts to 57,086 assignable code points
apart from the Private Use Area.
The theoretical maximum number of printable glyphs in the
higher Unicode planes is 65,534; the last two code points
in each plane are reserved as noncharacters.
Unifont 17.0
1 November 2025 (Unifont 17.0.03)
晓晓Akatsuki, Boris Zhang,
Kusanagi_Sans,
and
others updated over 100 Chinese ideographs in Planes 0, 2, and 3:
Modified ideographs containing the "馬" (Horse) and "鳥"
(Bird) radicals to be more balanced
The first batch of simplified Chinese character lists
(第一批簡體字表) (August 21, 1935~February 1936)
Chinese Character Simplification Scheme (1956–1986)
The simplified character list (1986–2013), including
simplified radicals and Chinese characters used for
descriptions and annotations in official documents
The second list of the Second round of simplified Chinese
characters (not officially implemented)
Ideographs required by the Specification of Common
Modern Chinese Character Components and Components Names
(现代常用字部件及部件名称规范), published in mainland
China in 2009
Other changes; see the ChangeLog file in
the main package for details.
18 October 2025 (Unifont 17.0.02)
Plane 0:
Paul Hardy
modified U+1521, U+A93D, and U+FB30.
David Corbett
modified U+2B96, U+A7CE, U+A7CF, and U+A7D2
晓晓Akatsuki
adjusted U+4748, U+6B25, and U+6F78 per the latest
Unicode recommendations. Adjusted U+5100 to be 16 pixels tall.
Plane 1:
Paul Hardy
modified U+1E912 per Unicode 17.0.0 errata.
David Corbett
modified U+1CEDD, U+1E6DE, U+1F778,
U+1CEF0, U+1F77A, U+11DCC, U+11DCD, U+11DD6.
Adjusted base height in chess glyphs U+1FA54..U+1FA57 and eye
height in U+1FA55 and U+1FA57 to match eye height of knights.
晓晓Akatsuki
drew smaller versions of U+16FF2 and U+16FF3.
Plane 2:
For complete coverage of jf7000 0.9,
Boris Zhang
added U+217DA and U+21A4B;
湖 远星
added U+24259, U+249DF, and U+270FD.
Improved these glyphs in the first list of the second round
of simplified Chinese characters: U+0200D3 and U+0201A8
Added these glyphs in the first list of the second round
of simplified Chinese characters: U+20B15, U+20BB5,
U+20CAD, U+219F3, U+21C52, U+22342, U+22488, U+22A83,
U+2418A, U+2462F, U+26678, U+26B01, U+2A9F7, U+2BA4F,
U+2BA51, U+2BBDC, U+2BCB7, U+2BDC0, U+2BE6F, U+2D026,
U+2D64F, U+2D70C, U+2DCFF, and U+2E0B9
Fixed U+2CAD2, which wfz2020 noticed appeared
as the glyph for code point U+2CA02.
Plane 3:
Yzy32767
made these contributions:
Improved these glyphs in the first list of the second
round of simplified Chinese characters: U+030008, U+030061,
U+03006C, U+03011D, U+03014A, and U+0301E3
Modified the archaic Greek digamma glyphs,
U+03DC and U+03DD.
Modified the Korean Won currency symbol, U+20A9,
to only have one bar.
Removed Variation Selector glyphs (U+FE00..U+FE0F)
from default OpenType and TrueType font builds;
they remain in the sample and SBIT font builds.
David Corbett
Modified Arabic glyphs U+0610, U+0616, U+061E,
U+0620, and U+0626.
Redrew the yeh-based glyphs in the ranges
U+FC31..U+FDC7 (Arabic Presentation Forms-A) and
U+FE89..U+FE8C (Arabic Presentation Forms-B).
Johnnie Weaver
modified some Georgian Supplement glyphs (U+2D00..U+2D2F).
晓晓_Akatsuki (Xiao_Akatsuki)
modified U+2EB2 per Unicode updates.
Plane 1:
Paul Hardy
Updated Old Turkic glyph U+10C47.
Updated Khitan Small Script glyph U+18CCA.
Reverted several changes in Musical Symbols
(U+1D100..U+1D1FF) for better positioning with combining
characters. Thanks go out to David Corbett for requesting
the changes.
Modified mathematical bold digamma (U+1D7CA, U+1D7CB)
to match the updated digamma glyphs in Plane 0.
Paul Hardy
Removed Variation Selector glyphs (U+E0100..U+E01EF)
from default OpenType and TrueType font builds;
they remain in the sample and SBIT font builds.
Plane 15 (CSUR/UCSUR):
soweli Kape
[sic] and
NikZapp
Updated Sitelen Pona (U+F1900..U+F19FF)
Updated Sitelen Pona Radicals (U+F1C80..U+F1C9F).
Paul Hardy
Added Titi Pula (U+F1C40..U+F1C60)
Added Zbalermorna (U+F28A0..U+F28DF).
19 April 2025 (Unifont 16.0.03)
Plane 0:
David Corbett
redrew some Arabic glyphs for consistency. Most of these
are minor changes to baseline, i‘jam positioning, or making
a derived letter match its origin letter. Code points:
U+0625, U+0634, U+0673, U+06B9, U+06BC, U+0753, U+0754,
U+0757, U+075C, U+0762, U+0767, U+0769, U+076A, U+076D,
U+0770, U+0775, U+0776, U+0777, U+077D, U+077E, U+08A1,
U+08A2, U+08A3, U+08A6, U+08A8, U+08B1, U+08BB, and U+08BC.
晓晓_Akatsuki (Xiao_Akatsuki)
submitted
several CJK refinements from the team of
湖 远星:
Improved 褝 (U+891D) and 肞 (U+809E).
Updated to reflect current Unicode rendering:
㳽 (U+3CFD), 㸿 (U+3E3F), 䑮 (U+446E), 䒳 (U+44B3), 䕈 (U+4548),
and 䩶 (U+4A76).
Updated as per GB18030-2022 change: 垕 (U+5795).
Modified to comply with the GB18030-2022 standard pertaining to
character composition:
姉 (U+59C9): This character is a phono-semantic character.
Therefore, the right side should be "市" (U+5E02) instead
of "巿" (U+5DFF).
濲 (U+6FF2): This character is a variant of "瀔" (U+7014),
and the "穀" (U+7A40) on the right side of "瀔" (U+7014) is
a phono-semantic character, and its "semantic" part is "禾"
(U+79BE), not "木" (U+6728).
膥 (U+81A5): This character is a Cantonese character for "egg".
Not yet (未) Become (成) Meat (肉) → Egg, so the upper left
corner should be "未" (U+672A), not "末" (U+672B).
Modified the top serifs of two Latin fullwidth letters, U+FF44 and U+FF4B.
Plane 1:
Paul Hardy
added new glyphs in Egyptian Hieroglyph Format Controls
(U+13430..U+1345F).
Paul Hardy
and
David Corbett
made adjustments to
glyphs in the Musical Symbols block (U+1D100..U+1D1FF).
Plane 2:
晓晓_Akatsuki
modified U+25ED7 from 16 columns wide to 15 columns.
Hayden Wong
contributed U+29B00..U+29CFF.
Cod'dte
sent a corrected left-hand side of U+2EE57.
Plane 3:
Luke036
has drawn a much-improved glyph for taito (U+3106C).
1 December 2024 (Unifont 16.0.02)
Plane 0:
Johnnie Weaver
modified the
U+13C9 Cherokee and U+AB99 Cherokee Supplement glyphs.
湖 远星
modified Chinese glyphs
U+605C, U+6669, and U+6A37.
Plane 1:
Johnnie Weaver
modified several glyphs in the
ranges U+10880..U+108AF (Nabataean) and U+108E0..U+108FF
(Hatran) so these scripts are now completely half-width.
Paul Hardy
modified several Tulu-Tilagari glyphs
(U+11380..U+113FF), and modified the Kawi glyph U+11F5A
to resemble U+11F49 (per David Corbett's recommendations).
Xiao Akatsuki (晓晓 Akatsuki)
fixed a missing
vertical stroke in U+18B2D.
湖 远星
added more space between the two
halves of U+1F232.
Plane 2:
Hayden Wong
made these changes:
Modified U+20083, U+20087, U+20089, and
U+200B4 from 16 columns wide to 15 columns.
Added the missing glyphs in the range U+20000..U+299FF.
Completed U+29D00..U+29DFF.
Added U+2B64E, which is an incorrect variant
of U+513A (儺).
晓晓 Akatsuki
contributed the missing glyphs
in the range U+20700..U+207FF.
湖 远星
modified U+28A0F, U+28B4E, U+2CB5B,
and U+2CB73 from 16 columns wide to 15 columns.
Boris Zhang
noticed that U+2C7EC was the glyph
for U+2CE7C, so it was removed.
10 September 2024 (Unifont 16.0.01)
Plane 0:
David Corbett
added U+0897, ARABIC PEPET.
Paul Hardy
added the new glyphs in
Balinese (U+1B4E, U+1B4F, and U+1B7F),
Cyrillic Extended-C (U+1C89, U+1C8A), and
Latin Extended-D (U+A7CB..U+A7CD, U+A7DA..U+A7DC).
Hayden Wong
contributed the new glyphs in
CJK Unified Ideographs Extension B U+20020..U+2004F
and U+29E00..2A0FF.
twuchiutann
contributed the new glyphs in
CJK Unified Ideographs Extension B U+20050..U+2073F.
Boris Zhang
redrew CJK Unified Ideographs
Extension D glyphs U+2B75F, U+2B76B, and
Extension I glyphs U+2B7EF, U+2EC1F, U+2EC20,
U+2EC21, U+2EC2F, U+2EC6F, U+2ECBF, U+2ECEC, and U+2ED42.
湖 远星
contributed the following glyphs, which are common
in Cantonese, Hokkien, Hakka,
etc.,
from a list provided
with the Ichiten font.
CJK Unified Ideographs Extension B glyphs:
U+203B7 𠎷
U+20546 𠕆
U+20584 𠖄
U+205FB 𠗻
U+207A9 𠞩
U+207AD 𠞭
U+20803 𠠃
U+2081D 𠠝
U+20895 𠢕
U+20BD7 𠯗
U+20C41 𠱁
U+20CBF 𠲿
U+20CD4 𠳔
U+20D5D 𠵝
U+20D71 𠵱
U+20DA7 𠶧
U+20E76 𠹶
U+20E98 𠺘
U+20ED8 𠻘
U+20F3B 𠼻
U+20F7E 𠽾
U+21014 𡀔
U+210AB 𡂫
U+210F6 𡃶
U+21145 𡅅
U+2176D 𡝭
U+217D3 𡟓
U+2180D 𡠍
U+21883 𡢃
U+2197C 𡥼
U+21C2A 𡰪
U+21CA2 𡲢
U+21CDE 𡳞
U+21DD1 𡷑
U+21F0F 𡼏
U+221A1 𢆡
U+22399 𢎙
U+224DC 𢓜
U+2251B 𢔛
U+22775 𢝵
U+22AB1 𢪱
U+22AE6 𢫦
U+22BED 𢯭
U+22BFE 𢯾
U+22C4B 𢱋
U+22C62 𢱢
U+22C64 𢱤
U+22CB4 𢲴
U+22CB8 𢲸
U+22CC6 𢳆
U+22CEA 𢳪
U+22D80 𢶀
U+22F0C 𢼌
U+22F1B 𢼛
U+23073 𣁳
U+23074 𣁴
U+23350 𣍐
U+236BA 𣚺
U+236EE 𣛮
U+23B88 𣮈
U+23CA9 𣲩
U+23EF8 𣻸
U+23F0E 𣼎
U+240D2 𤃒
U+241AC 𤆬
U+24259 𤉙
U+242B6 𤊶
U+2430D 𤌍
U+24352 𤍒
U+24364 𤍤
U+24419 𤐙
U+24430 𤐰
U+24605 𤘅
U+2479A 𤞚
U+24C8D 𤲍
U+24D80 𤶀
U+24D83 𤶃
U+24E01 𤸁
U+24E31 𤸱
U+24E85 𤺅
U+24EA7 𤺧
U+24EAA 𤺪
U+25148 𥅈
U+2517E 𥅾
U+2531A 𥌚
U+25349 𥍉
U+25435 𥐵
U+2546E 𥑮
U+257C7 𥟇
U+25BDF 𥯟
U+25BE5 𥯥
U+25C14 𥰔
U+25D0A 𥴊
U+25E86 𥺆
U+2624E 𦉎
U+26293 𦊓
U+26706 𦜆
U+267EA 𦟪
U+2688A 𦢊
U+2690E 𦤎
U+26E05 𦸅
U+2725F 𧉟
U+27304 𧌄
U+27371 𧍱
U+27486 𧒆
U+277F0 𧟰
U+279A0 𧦠
U+27A63 𧩣
U+27B2A 𧬪
U+27B99 𧮙
U+27EF4 𧻴
U+27FC1 𧿁
U+27FEC 𧿬
U+27FF3 𧿳
U+280BE 𨂾
U+280BF 𨂿
U+280E9 𨃩
U+280F0 𨃰
U+28154 𨅔
U+282CD 𨋍
U+2837D 𨍽
U+2838A 𨎊
U+28487 𨒇
U+28595 𨖕
U+28891 𨢑
U+28D99 𨶙
U+28E39 𨸹
U+2945D 𩑝
U+2947E 𩑾
U+294E5 𩓥
U+296A8 𩚨
U+296E9 𩛩
U+29704 𩜄
U+29730 𩜰
U+29D71 𩵱
U+29DD3 𩷓
U+29E19 𩸙
U+29E36 𩸶
U+29EAC 𩺬
U+29F27 𩼧
U+29F30 𩼰
U+29F48 𩽈
U+29F70 𩽰
U+2A04E 𪁎
U+2A0BA 𪂺
U+2A1E1 𪇡
U+2A41E 𪐞
U+2A590 𪖐
U+2A612 𪘒
U+2A64A 𪙊
CJK Unified Ideographs Extension C glyphs:
U+2A736 𪜶, U+2AE5A 𪹚, and U+2B4A2 𫒢
CJK Unified Ideographs Extension E glyphs:
U+2B8C6 𫣆, U+2C816 𬠖, and U+2C9B0 𬦰.
Plane 3:
twuchiutann
modified U+30EDD and U+30EDE (biang),
originally drawn by Ming Fan, to differentiate between
traditional and simplified Chinese versions.
湖 远星
contributed the following glyphs, which are common in
Cantonese, Hokkien, Hakka,
etc.,
from a list given in the
Ichiten font.
CJK Unified Ideographs Extension G glyphs:
U+301DB 𰇛, U+308FB 𰣻, and U+30E6C 𰹬
CJK Unified Ideographs Extension H glyph:
U+31C7F 𱱿.
Plane 15 CSUR/UCSUR:
Rebecca Bettencourt
contributed:
U+F16B0..U+F16DF Derani
U+F2000..U+F267F Sadalian.
Paul Hardy
contributed
U+F1C80..U+F1C9C Sitelen Pona Radicals.
*New in Unicode 16.0.0.
Unifont 15.1
24 February 2024 (Unifont 15.1.05)
Plane 0:
Ho-seok Ee
redrew all Hangul glyphs not in the
Hangul Syllables range, so their style more closely
resembles the style of the Hangul Syllables range:
U+1100..U+11FF Hangul Jamo, U+3131..U+318E Hangul
Compatibility Jamo, U+A960..U+A97C Hangul Jamo Extended-A,
U+D7B0..U+D7FB Hangul Jamo Extended-B.
Hayden Wong
improved several glyphs in the range
U+2100..U+214F Letterlike Symbols.
Johnnie Weaver
redrew U+013D LATIN CAPITAL LETTER L
WITH CARON for better compatibility with other glyphs in
the Czech and Slovak alphabets.
Planes 2 and 3:
almost 600 new ideographs, including:
Boris Zhang
and
Yzy32767
contributed
U+20000..U+2001F.
Boris Zhang
and
Yzy32767
contributed
the entire CJK Unified Ideographs Extension D range,
U+2B740..U+2B81D.
湖 远星
contributed 335 glyphs across Plane 2
and Plane 3 with common Cantonese ideographs.
Other new ideographs in CJK Unified Ideographs Extension I.
Plane F: Paul Hardy modified the Sitelen Pona script, adding combining character indicators and several new glyphs since the last release. This completes the most current version of Sitelen Pona.
29 October 2023 (Unifont 15.1.04)
Default and Japanese versions have larger supersets
of Plane 2 and Plane 3 glyphs.
Johnnie Weaver
contributed updates for
U+266D..U+266F and U+26BC.
21 October 2023 (Unifont 15.1.03)
Boris Zhang
and
Yzy32767
contributed
CJK Unified Ideographs Extension I glyphs
(U+2EBF0..U+2EE5D).
湖 远星
contributed 14 glyphs to CJK
Unified Ideographs Extensions B and C
and updated U+5C81 and U+6708.
21 September 2023 (Unifont 15.1.02)
湖 远星:
Adjusted 46 glyphs in the Plane 0 Wen Quan Yi range,
U+2F00..U+9FFF.
Contributed Plane 3 CJK Unified Ideographs Extension G
glyphs in the range U+30000..U+3017F.
12 September 2023 (Unifont 15.1.01)
As mentioned during the year leading up to this release,
TrueType fonts are no longer produced by the default build;
OpenType fonts have taken their place.
This change has been
driven by the diminishing support for TrueType fonts in the Pango
font rendering engine. TrueType fonts can still be built from
the distribution tarball using the command "make truetype" in
the font directory.
Ho-Seok Ee
proposed
a new Johab encoding for
algorithmic Hangul Syllables generation.
The resulting
scheme uses 6 variations of initial consonants
(choseong), 3 of medial vowels and diphthongs
(jungseong), and 1 of final consonants (jongseong).
The image on the left is partial output from a new supporting
Unifont utility,
unijohab2html
, which gives
an overview of how the three components of a Hangul syllable
combine with each other and outputs any overlaps for a font
designer's analysis. A full discussion of this new Johab
6/3/1 encoding appears on the
Unifont Hangul Syllables Generation
web page.
Minseo Lee (이민서)
provided feedback on the
glyphs prior to their release.
Following a suggestion by Ho-Seok Ee, the hangul-base.hex file that contains the Johab 6/3/1 glyphs for Hangul syllable formation now begins at code point U+E000. This allows building a Unifont variant with that entire Hangul johab glyph set in the Unicode Plane 0 Private Use Area (PUA) using the command "make PUA=plane00/hangul/hangul-base.hex" in the font directory. Unifont builds have traditionally left the PUA available for CSUR/UCSUR glyphs, which is still the default; see below for a discussion of the CSUR/UCSUR glyphs.
Johnnie Weaver
modified "IJ" ligature glyphs U+0132 and
U+0133. He also modified U+1E9E LATIN CAPITAL LETTER SHARP S.
Paul Hardy:
Modified U+2CC2 COPTIC CAPITAL LETTER CROSSED SHEI and
U+2CC3 COPTIC SMALL LETTER CROSSED SHEI for consistency
with the redrawn U+03E2 COPTIC CAPITAL LETTER SHEI and
U+03E3 COPTIC SMALL LETTER SHEI.
Redrew Ideographic Description Characters (U+2FF0..U+2FFB)
for consistency and added new glyphs (U+2FFC..U+2FFF).
Also added CJK Strokes glyph U+31EF IDEOGRAPHIC DESCRIPTION
CHARACTER SUBTRACTION.
Modified star glyphs U+2605, U+2606, and U+2BE8 for
consistency.
Modified several Chinese ideographs and Korean ideographs in
CJK Unified Ideographs Extension A (U+3400..U+4DBF) per the
Unicode Standard version 15.1.0.
Wen Quan Yi Glyphs:
Made modifications to Korean
ideographs in CJK Unified Ideographs Extension A
(U+3400..U+4DBF) per Unicode 15.1.0 changes.
Modified CJK Unified Ideographs Extension A U+3B9D, U+454E,
U+49C8 (from 湖 远星) and U+56B8.
Modified CJK Unified Ideographs Extension U+809E and U+891D.
Modified Alchemical Symbols (U+1F700..U+1F77F) per Unicode
15.1.0 changes.
Added three hexadecimal digit notations to the Plane 0 UCSUR:
U+EBF0..U+EBFF:
Bruce Alan Martin's bit location notation.
U+ECF0..U+ECFF:
Ronald O. Whitaker's triangular notation.
Implemented other glyph changes per the Unicode Standard version
15.1.0.
Several other minor changes; see the ChangeLog file in the
main tarball for details.
Earlier Releases
See the Archive link at the top of this page for information on
earlier Unifont releases.
Unifont Glyph Tables
Unifont font files contain glyphs in several Unicode planes.
The following table provides an overview of this coverage.
GNU Unifont Font File Plane Coverage

Font Filename     Plane 0   Plane 1   Plane 2   Plane 3   Plane 14  Plane 15
unifont-*         X                   X (1,2)   X (1,2)
unifont_jp-*      X                   X (1,2)   X (1,2)
unifont_upper-*             X         X (3)     X (3)     X
unifont_csur-*    X                                                 X
Notes:
1
PCF fonts can only include glyphs in Plane 0.
2
Only a subset of Plane 2 and Plane 3 CJK glyphs plus
the Plane 1 Copyleft glyph (U+1F12F) are included, to stay within the
OpenType limit of 65,536 glyphs.
3
unifont_upper
fonts will contain a superset
of Chinese Plane 2 and Plane 3 glyphs plus JIS X 0213 glyphs
until the OpenType font nears its limit of 65,536 code points.
Click on each link in the tables below to show its corresponding
256-code point range within the respective Unicode planes.
Plane 0 Glyphs
The table below links to the glyphs in the Plane 0 (Basic
Multilingual Plane)
unifont
font files.
GNU Unifont Glyphs
Unicode Basic Multilingual Plane
This next table links to the glyphs in the Plane 0 (Basic
Multilingual Plane)
unifont_jp
Japanese variant font files.
See also the Plane 2 glyphs further down, which are only
included in the
unifont_jp
OpenType and TrueType font files.
GNU Unifont Glyphs — Japanese Version
with Page Coverage for Plane 0
(Green=100%, Red=0%)
The table below links to the Japanese glyphs in Plane 2 (Supplementary Ideographic
Plane) contained in the
unifont_jp
OpenType and TrueType font files.
Note:
These Plane 2 glyphs along with the Plane 0 glyphs in
unifont_jp
font files provide complete coverage of the JIS X 0213
standard. Only 303 glyphs appear in the files below. Files with no glyphs appear
with a gray background.
GNU Unifont Glyphs — Japanese Version
with Page Coverage for Plane 2
(Gray=0%)
This next table links to the Chinese glyphs in Plane 2 (Supplementary Ideographic
Plane) contained in
unifont
OpenType and TrueType font files.
Note:
These Plane 2 glyphs along with the default Plane 0 glyphs
in Unifont provide complete coverage of the Table of General Standard Chinese
Characters (通用规范汉字表). Only 232 glyphs appear in the files below.
Files with no glyphs appear with a gray background.
GNU Unifont Glyphs — Chinese Version
with Page Coverage for Plane 2
(Gray=0%)
Plane 3 begins with the CJK Unified Ideographs Extension G block, from U+30000 through
U+3134A. This includes the highly complex biang Chinese ideograph and
taito Japanese ideograph:
This table links to the two ranges of 256 assigned code points
in Plane 14 (Tags and Variation Selector Supplement) that appear
in the
unifont_upper
OpenType and TrueType font files.
Finally, this last glyph table shows ConScript Unicode Registry (CSUR)
and Under CSUR glyphs that appear in the
unifont_csur
OpenType
and TrueType font files. Not all of the Plane 0 CSUR and UCSUR scripts
have been drawn, but given the esoteric nature of some CSUR and UCSUR scripts
(including the unavailability of glyph samples for many of the more obscure
constructed scripts), the boxes in the table all have a green background color
even if not at 100% coverage.
GNU Unifont Glyphs
Private Use Area, Planes 0 and 15 — ConScript Unicode Registry
If you would like to contribute glyphs to the GNU Unifont effort,
you can download the associated PNG file from the tables above
(SMP and CSUR need additions). Then draw new glyphs in the 16-by-16
pixel area that is inside the inner box you see in the image on
the left.
When done, erase the surrounding inner box and ruler lines around the
inner box. You can then save the file as a monochrome bitmap image.
Then convert the .png file into a .hex file with the unipng2hex utility
in the source tarball. Or you can just email the .png file to me as
a contribution to this effort and I will do the conversion.
Q: Why is the outer grid so much larger than the 16-by-16 pixel
inner box?
A: Because in a future version, unipng2hex, unihex2png, and other
utilities should be able to handle larger glyphs.
The table below shows the current state of completion of the Supplementary
Multilingual Plane (Plane 1). Any range in the table that doesn't have
a green background has missing glyphs. To see which scripts are in a
particular range, consult the "Supplementary Multilingual Plane" list
in the Current Coverage section below. The more red a range appears
in the table below, the more glyphs are missing from that range.
Current Coverage
Links in this section reference the first block of 256 glyphs
where a script begins.
The list below shows the scripts that are in the Unicode
Basic Multilingual Plane, with coverage in this release of Unifont.
The list below shows the scripts that are in the Unicode
Supplementary Multilingual Plane, with coverage in this release of Unifont.
Scripts labeled "(Pending)" are being drawn currently.
*Note: Scripts such as Cuneiform, Egyptian
Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16
pixel grid. There are plans to draw these scripts on a 32-by-32
pixel grid in the future.
Plane 14 has two scripts, both of which Unifont covers:
The list below shows the scripts that are in Michael Everson's
ConScript Unicode Registry (CSUR) and Rebecca Bettencourt's Under-CSUR
that have coverage in this release of Unifont:
GNU Unifont Glyphs
Private Use Area, Planes 0 and 15 — ConScript Unicode Registry
Initially I just posted my additions to
Roman Czyborra's
original
unifont.hex file. Then in mid-January 2008, his website went down.
So I started posting font updates here. Roman has encouraged me to continue
with my additions.
Roman's website is now back online, and you can read his
Unifont description and motivation for its creation on his website,
along with his archive of Unifont's changes:
http://czyborra.com/unifont
.
TrueType Font Generation
Luis Alejandro González Miranda
wrote a cool combination of scripts to
convert GNU Unifont from .hex format into FontForge .sfd format, then to
have FontForge convert this to a TrueType outline font (see the Unicode
Utilities web page on this site for more information). Pixels are drawn
as outlined squares, so they scale to all point sizes. This works well with
GNOME; I haven't tried it with any other Unix windowing environment.
I've removed the OpenType SBIT font link from this page because the outline
font is much more flexible.
Luis has given me permission to modify his scripts to convert the latest
GNU Unifont versions to TrueType. I've modified his original scripts to
handle Unicode combining characters.
JIS X 0213 Kanji
Jiskan16
Unifont 12.1.02 added Japanese BDF and TrueType versions,
unifont_jp
. This replaced over 10,000 ideographs
in the default Unifont font with Japanese kanji from the 16 × 16
pixel Jiskan 16 font. The font is available in two files,
corresponding to the two planes in JIS X 0213. Both files are
in the public domain.
The comments in the BDF source font files (downloadable from the
Japanese Fonts
page) credit the following contributors (in order): Toshiyuki Imamura,
HANATAKA Shinya, Taichi Kawabata, Koichi Yasuoka, TOYOSHIMA Masayuki,
Kazuo Koike, and SATO Yasunao.
For the Unifont release, the glyphs from the two JIS X 0213 planes
were converted into Unifont .hex files and mapped to code points
in Unicode's Plane 0 and Plane 2 for Unifont. The result
provides complete representation of the kanji in JIS X 0213 in a free
Unicode font.
Izumi16
Unifont 12.1.03 replaced the Jiskan16 glyphs with the public domain
Izumi16 glyphs. These provide improvements on the earlier Jiskan16
glyphs.
Wen Quan Yi: Spring of Letters
(文泉驛 / 文泉驿)
The original Unifont CJK glyphs were replaced by new CJK glyphs from
version 1.1 of
Qianqian Fang's
Unibit font. The Unibit font
began as a combination of the original GNU Unifont glyphs and a basic
CJK bitmap font placed in the public domain by the People's Republic
of China. It adopted GNU Unifont's scheme of 8x16 and 16x16
glyphs. Qianqian Fang and many others then added about 10,000
more glyphs.
Qianqian states in the Unibit distribution:
"The entire CJK Unified Ideographics (U4E00-U9FA5) and CJK Unified
Ideographics Extension A(U3400-U4DB5) blocks were replaced by
high-quality glyphs from China National Standard GB19966-2005
(public domain)."
Wen Quan Yi volunteers then edited thousands
of these characters. Qianqian also drew the new 22 CJK ideographs
in the range U+9FA6..U+9FBB that appear in GNU Unifont.
The following code points in the latest unifont.hex file are
taken from the WQY Unibit font (with my additions to complete the
U+3000..U+33FF range, particularly the missing Hiragana, Katakana,
and Kanji), including glyphs updated by the Wen Quan Yi volunteers
and other modifications as part of the Unifont font:
U+3400..U+4DBF: CJK Unified Ideographs Extension A
U+4E00..U+9FBF: CJK Unified Ideographs
U+F900..U+FAFF: CJK Compatibility Ideographs
U+FF00..U+FF60: Fullwidth Forms of Roman Letters
Qianqian has given his okay to add these CJK glyphs from the
Wen Quan Yi project into GNU Unifont. Likewise, I've told him
to incorporate any glyphs he wants from my contributions to GNU
Unifont into his Unibit font. In October 2020, Qianqian Fang also
granted permission to apply the SIL Open Font License version 1.1
to Wen Quan Yi glyphs in Unifont as a dual license.
What's Next?
All of the glyphs in the Supplementary Multilingual Plane that could
easily be drawn in a 16-by-16 pixel grid have been drawn as of the
Unifont 9.0.01 release. There are no plans to draw Tangut.
A number of ConScript Unicode Registry (CSUR) scripts remain to be drawn.
If you are interested in contributing glyphs to this
effort, please contact me. All new contributions must be licensed under
the same license as the rest of Unifont (in a nutshell, GPL 2+ with the
GNU font embedding exception and the SIL OFL 1.1).
With the great work done by contributors in providing ConScript Unicode
Registry (CSUR) glyphs, they are available in font files that have
"_csur" in their name.
macOS 26.2 enables fast AI clusters with RDMA over Thunderbolt
When you tell everyone you’re building a secure platform, the first thing that they ask about is encryption.
And, in 2025, the hot topic in encryption is algorithms that are safe from hypothetical quantum computers that, unlike real ones, can factorise numbers bigger than 31.
These algorithms are referred to as post-quantum cryptography (PQC).
Since NIST standardised a few such algorithms, there’s been a lot more interest in seeing them in production, so I spent some time getting the implementations from the Linux Foundation’s PQ Code Package to run on CHERIoT.
A lot of companies are building hardware to accelerate these operations, so it seemed useful to have a performance baseline on the CHERIoT Ibex, as well as something that can be used in future CHERIoT-based products.
What are ML-KEM and ML-DSA for?
I am not a mathematician and so I’m not going to try to explain how these algorithms work, but I am going to explain what they’re
for
.
Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM) is, as the name suggests, an algorithm for key encapsulation.
One side holds a public key and uses it (plus some entropy source) to generate a secret in both plain and encapsulated forms.
The encapsulated secret can be sent to a remote party who holds the corresponding private key.
The receiver can then recover the unencrypted version of the secret (and detect tampering).
Now, both parties have the same secret and can use it with some key-derivation function to produce something like an AES key for future communication.
Note that this is somewhat more restrictive than traditional key-exchange protocols.
You don’t get to exchange an arbitrary value; the generation step is part of encapsulation.
This also means that it’s a fixed size, defined by the algorithm, which is why you typically feed it into a key-derivation function rather than using it directly.
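To make the data flow concrete, here is a minimal sketch of that exchange in Python. The keygen, encaps, and decaps callables are placeholders for whatever ML-KEM binding you actually use (the CHERIoT port itself is C), and the key-derivation label is illustrative; the point is only who holds which key and what crosses the wire.

import hashlib

def establish_session_key(keygen, encaps, decaps):
    # Receiver: generate a keypair and publish the public key.
    public_key, private_key = keygen()

    # Sender: encapsulate against the public key. This yields the shared secret in
    # plain form plus an encapsulated form; only the encapsulated form is transmitted.
    secret_at_sender, encapsulated = encaps(public_key)

    # Receiver: recover the same secret from the encapsulated form
    # (tampering is detected at this step).
    secret_at_receiver = decaps(private_key, encapsulated)
    assert secret_at_sender == secret_at_receiver

    # The secret is a fixed size, so feed it through a key-derivation function
    # rather than using it directly as, say, an AES key.
    return hashlib.sha3_256(b"session-key" + secret_at_receiver).digest()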
Module-Lattice Digital Signature Algorithm (ML-DSA) has a similarly informative name.
It is intended for providing and validating digital signatures.
It takes a private key, an arbitrary-sized document and context, and produces a signature.
A holder of the associated public key can then validate that the document matches the version signed with the private key and context.
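Again purely as an illustration of the data flow, with sign and verify standing in for hypothetical functions from some ML-DSA binding rather than any real API, and the document and context values invented for the example:

def sign_and_check(sign, verify, private_key, public_key):
    document = b"firmware-image-v1.2"   # arbitrary-sized message
    context = b"release-signing"        # context string, bound into the signature

    # The private-key holder produces a signature over the document and context.
    signature = sign(private_key, document, context)

    # Anyone holding the public key can check that exactly this document and
    # context were signed with the corresponding private key.
    return verify(public_key, document, context, signature)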
These are both quite low-level building blocks for higher-level protocols.
For example, TLS can use ML-KEM for key exchange and ML-DSA for certificate validation, but also incorporates traditional algorithms in case the PQC algorithms have unexpected weaknesses against classical computers.
Initial porting
As is usually the case for CHERIoT, porting the C implementations of ML-KEM and ML-DSA required no code changes.
I worked with upstream to slightly simplify the platform-integration layer, so we just provide a single header describing the port.
For example, the
port header for ML-DSA
configures the build to produce ML-DSA44 support, defines custom functions for zeroing memory and getting entropy, and adds the
__cheriot_libcall
attribute to all exported APIs (so we can build them as shared libraries, rather than embedding them in a single compartment).
The
file for ML-KEM
is almost identical.
With these defined, it is possible to build both libraries as CHERIoT shared libraries.
This motivated a bit of cleanup.
We have a device interface for entropy sources, but it wasn’t implemented on the Sail model (which doesn’t have an entropy source).
The interface has a way of exposing the fact that an entropy source is insecure, so that wasn’t a problem; it just needed doing. I refactored all of the insecure entropy-source drivers to use a common base.
Most encryption algorithms want an API that fills a buffer with entropy.
It’s nice if these don’t all need to touch the driver directly, so I created a compartment that provides this API and exposes it.
Now, both libraries are simply consumers of this API.
This also makes it easier to add stateful whitening for entropy drivers for hardware entropy sources that don’t do the whitening in hardware.
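As a rough sketch of what such whitening might look like (hash-based conditioning, written in Python for brevity rather than the C the real CHERIoT drivers use; read_raw stands in for the underlying driver):

import hashlib

class WhitenedEntropySource:
    def __init__(self, read_raw):
        self._read_raw = read_raw      # callable returning raw sample bytes
        self._state = b"\x00" * 32     # carried across calls so bias is spread out

    def fill(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            raw = self._read_raw(64)   # over-sample the raw source
            self._state = hashlib.sha3_256(self._state + raw).digest()
            out += self._state
        return out[:n]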
Most CHERIoT stacks are on the order of 1-2 KiBs.
The PQC algorithms use much more space.
More, in fact, than we permitted.
The previous limitation was based on the precision of bounds rounding.
A CHERI capability compresses the bounds representation by taking advantage of the fact that, for a pointer to an allocation, there is a lot of redundancy between the address of the pointer, the address of the end of the allocation (the top), and the address of the start of the allocation (the base).
The distances from the address to the base and to the top are stored as floating-point values with a shared exponent.
In practical terms, this means that the larger an allocation is, the more strongly aligned its start and end addresses must be.
The same restrictions apply for any capability that grants access to less than an entire object.
When you call a function in another compartment, the switcher will truncate the stack capability so that the callee sees only the bit of the stack that you weren’t using.
The top and base of the stack must be 16-byte aligned (as an ABI requirement), but a very large stack may have hardware requirements for greater alignment and so may require a gap between the bottom of the caller’s stack and the top of the callee’s.
Fortunately, we’d added an instruction precisely for this kind of use case:
CSetBoundsRoundDown
.
This takes a capability and a length and truncates it to
at most
that length.
It was a fairly small tweak to the switcher to make it do this, and a much larger amount of time with SMT solvers to convince ourselves that this was a safe thing to do.
This also exposed a bug in our linker’s handling of the
CAPALIGN
directive, which rounds a section’s base and size up to the required alignment to be representable.
This was not working for sections that followed an explicit alignment directive.
Our stacks must be both at least 16-byte aligned
and
representable as capabilities.
This is now fixed.
So now we support stacks up to almost 64 KiB, a limitation imposed by the current loader metadata format rather than anything intrinsic to how the system operates after booting.
We could easily increase this limit but 64 KiB ought to be enough for anyone.
Performance on CHERIoT Ibex
The repository contains
a simple benchmark example
that tries each of the operations and reports both the cycle time and stack usage.
The output on the CHERIoT Ibex Verilator simulation can be summarised as follows.
The ML-KEM encrypt (generate shared secret and encrypted version) and decrypt (recover shared secret from encrypted version) each use around 18 KiB of stack and run in around two million cycles.
CHERIoT Ibex should scale up to 200-300 MHz (though it may be clocked lower for power reasons in some deployments), but even at 100 MHz that’s 50 encryption or decryption operations per second.
Remember that this is an operation that typically happens when you establish a connection; after that, you use a symmetric cypher such as AES with the exchanged key.
The ML-DSA operations are slower and use a
lot
more stack space (almost 60 KiB for signing!).
But, even there, the performance is reasonable, under 4 M cycles.
This means that you can do 20 signature-verification operations per second at 100 MHz.
Even using ML-KEM for key exchange and ML-DSA for certificate validation in a TLS flow is unlikely to add more than a few tens of milliseconds to the handshake time, which is perfectly acceptable for the common use case for embedded devices.
In terms of code size, both are small.
The ML-KEM implementation is around 12 KiB, the ML-DSA implementation 18 KiB.
These both include a SHA3 (FIPS 202) implementation, so there’s scope for code-size reduction on systems that need both, but 30 KiB of code isn’t too bad.
Future plans
The stack usage is very high.
Upstream has some plans to allow pluggable allocators, which will allow us to move a lot of this to the heap.
This is precisely the kind of use case that CHERIoT’s memory-safe heap is great for: something needs 60 KiB of RAM for 4,000,000 cycles, but then doesn’t need that RAM again for a long time.
That memory can then be used for something else, even in a mutually distrusting compartment.
Currently, the library builds are very thin wrappers around the upstream projects.
This is great as a building block, but we should make more use of CHERIoT features in the longer term.
Both ML-KEM and ML-DSA depend on SHA3 (FIPS 202).
Ideally, we’d factor that out as some common code, rather than carrying a copy in each library.
Similarly, the libraries provide an option to plug in your own SHA3 implementation.
This is likely to be a common hardware operation even for chips that don’t have full PQC implementations, so we should expose this option in the build system.
Is it secure?
Security always depends on the threat model.
For signature validation, you don’t have any secret data, just a public key, a document, and a signature.
The only concerns are whether there are weaknesses in the algorithm, or bugs, that would allow an attacker to substitute a different document for the same signature.
CHERIoT prevents memory-safety bugs, so this is concerned solely with logic errors.
The code upstream is checked against a set of test vectors that aim to trigger corner cases in the logic of the underlying implementation, so it is hopefully secure in this respect.
For signing or key exchange, you need to worry about the key leaking.
On a CHERI system, it’s unlikely to leak explicitly, but may leak via side channels.
The
security section of the upstream projects
discusses a number of techniques that they use to mitigate this kind of attack.
That’s typically sufficient.
It’s been recommended practice for embedded devices to have per-device secrets for a long time.
This means that leaking a key from one device doesn’t compromise the device class, only that specific device.
For some very high-assurance use cases, that secret may matter and need to be robust against an adversary with physical access to the device.
Hardware encryption engines typically care about confidentiality breaches via power side channels and integrity breaches via glitch injection.
Power side channels are difficult to mitigate in software: the power requirements of multiplying two numbers together may depend on the number of carry bits set, for example.
They’re much easier to mitigate in hardware, by simply doing the same calculation twice in parallel, once with the original inputs and once with the inputs permuted to have the opposite power characteristics.
Glitch injection takes the chip out of its specified power or frequency (or thermal) envelope and attempts to introduce bit flips, which can corrupt state in such a way as to tamper with signing or leak a key.
These are also effectively impossible to mitigate in software because the software that’s attempting the mitigation is vulnerable to the same glitches.
There are some compiler techniques that can make these harder, but they come with a high performance cost.
If power analysis and glitch injection are part of your threat model, the software implementations are not sufficient.
In this case you may also need to worry about someone removing the top of the chip and using a scanning-tunnelling electron microscope to read bits from non-volatile memory.
This used to require tens of thousands of dollars but is now much cheaper.
Devices that need to worry about this often have tiny explosive charges in the package to destroy the chip in cases of tampering.
If that’s your threat model, hardware PQC implementations may not be sufficient, at least alone.
But if you care about attackers on the network being unable to compromise the security of the class of devices, even if they have a magical and imaginary quantum computer, then these should be sufficient.
With the
eInvoicing Directive (2014/55/EU)
, the European
Union introduced “standardized” electronic invoices in XML format. Increasingly,
institutions and businesses in EU member states will be required to support these
electronic invoices.
While machine-readable invoices are, in general, a good idea, there are various issues
with the EU’s approach, including needless complexity, a lack of true standardization
(multiple syntaxes and various sub-formats), and a tendency to use technologies with
inherent security problems.
Due to a combination of unfortunate design decisions, implementing software for
electronic invoices is likely to be affected by security flaws if no countermeasures are
implemented.
XML Insecurity and XXE
The XML format is known to have inherent security flaws, the most dangerous ones being
XXE vulnerabilities (XML eXternal Entity injection).
XXE vulnerabilities often allow the exfiltration of files. While some XML implementations ship secure defaults or were never vulnerable to begin with (e.g.,
Python
,
libxml2
,
.NET
,
Expat
), others remain insecure
by default.
Two notable examples of implementations with insecure defaults are the Java standard
library and the Saxon library. Both are commonly used within the electronic invoicing
ecosystem.
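To illustrate the vulnerability class (not any specific invoicing product), here is the canonical XXE payload together with a Python parse using the third-party defusedxml package, which rejects entity declarations outright; a parser with insecure defaults would instead inline the contents of /etc/passwd into the parsed document.

import defusedxml.ElementTree as ET
from defusedxml import EntitiesForbidden

PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE invoice [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<invoice><note>&xxe;</note></invoice>"""

try:
    ET.fromstring(PAYLOAD)
except EntitiesForbidden:
    # defusedxml refuses to process the entity declaration in the DTD.
    print("entity declaration rejected, XXE attempt blocked")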
The problem with XSLT 2.0
XSLT is a document transformation language. Only XSLT version 1.0 is widely supported.
For XSLT 2.0 and above, only one freely available implementation exists: Saxon.
Thus, anyone using the official validation artifacts (which rely on XSLT 2.0) will likely use Saxon to implement invoice
parsing. Saxon, as mentioned, is vulnerable to XXE by default.
Despite its poor implementation status and the fact that its primary implementation
has insecure defaults, XSLT 2.0 (and its successor 3.0) is a W3C recommendation. I
raised these concerns with the W3C.
The EU requirements for electronic invoices are standardized by the European Committee
for Standardization (CEN) in a set of standards named EN16931. The first two parts are
available free of charge. Subsequent parts cost money.
Accessing these standards is surprisingly difficult. A link on the EU web page to CEN is
currently broken. CEN does not provide direct downloads of these documents and refers to
national standardization organizations. Those often require account registrations even
to access the free-of-charge parts of the standard.
The Estonian standardization organization (EVS) provides downloads of parts one and two without registration.
For the parts of EN16931 that are not available free of charge, prices at EVS are
cheaper than those at most other national standardization organizations.
XXE vulnerabilities
List of security vulnerabilities discovered in electronic invoicing software during this
research:
Reported 2025-11-17, no reply; re-tested on 2025-11-25, validation functionality was removed (relied on ZUV)
REDACTED1, XXE: Reported 2025-10-29, no reply; re-tested on 2025-11-18, fix incomplete (see next entry)
REDACTED1, Blind XXE: Reported 2025-11-18, no reply, unfixed
REDACTED2, Blind XXE: Reported 2025-11-17, no reply, unfixed
* ZUV is no longer developed, and it is recommended to use Mustang instead. Mustang was
also vulnerable to XXE in versions before
2.16.3
(
CVE-2025-66372
).
At present, there are
multiple
cases
in which authors are
suing
AI companies for scraping their works without payment or permission. While these legal battles have been going on, Amazon has quietly added a new AI feature to its Kindle iOS app—a feature that “lets you ask questions about the book you’re reading and receive spoiler-free answers,” according to an
Amazon announcement
.
The company says the feature, which is called Ask this Book, serves as “your expert reading assistant, instantly answering questions about plot details, character relationships, and thematic elements without disrupting your reading flow.”
Publishing industry resource Publishers Lunch noticed Ask this Book earlier this week, and asked Amazon about it. Amazon spokesperson Ale Iraheta
told PubLunch,
“The feature uses technology, including AI, to provide instant, spoiler-free answers to customers’ questions about what they’re reading. Ask this Book provides short answers based on factual information about the book which are accessible only to readers who have purchased or borrowed the book and are non-shareable and non-copyable.”
As PubLunch summed up: “In other words, speaking plainly, it’s an in-book chatbot.”
Amazon did not answer PubLunch’s questions about what rights the company was relying upon to execute the new feature, nor did it elaborate on the technical details of the service and any protections involved (whether to prevent against hallucinations, or to protect the text from AI training).
Perhaps most alarmingly, the Amazon spokesperson said, “To ensure a consistent reading experience, the feature is always on, and there is no option for authors or publishers to opt titles out.”
It also sounds as though authors and publishers were, for the most part, not notified of this feature’s existence.
Amazon is already in the news this week for its flawed AI recaps of television shows. After a
Fallout
recap was “garbage filled with mistakes,” as
io9 called it,
the company
paused the feature
. A similar thing happened earlier this year with Amazon’s
AI dubs for anime series
.
As PubLunch says of Ask this Book, “Many rightsholders and creators are likely not to want an in-book chatbot without their specific review and approval (or at all), and we expect that message will be getting delivered to publishers and Amazon loud and clear in the ensuing days. And many people would deem the outputs of generative AI analyzing a particular copyrighted work as the very embodiment of a derivative work (or simply a direct infringement).”
Ask this Book is currently only available in the Kindle iOS app in the US, but Amazon says it “will come to Kindle devices and Android OS next year.”
LLM 0.28
Simon Willison
simonwillison.net
2025-12-12 20:20:14
LLM 0.28. I released a new version of my
LLM
Python library and CLI tool for interacting with Large Language Models. Highlights from the release notes:
New OpenAI models:
gpt-5.1
,
gpt-5.1-chat-latest
,
gpt-5.2
and
gpt-5.2-chat-latest
.
#1300
,
#1317
When fetching URLs as fragments using
llm -f URL
, the request now includes a custom user-agent header:
llm/VERSION (https://llm.datasette.io/)
.
#1309
Fixed a bug where fragments were not correctly registered with their source when using
llm chat
. Thanks,
Giuseppe Rota
.
#1316
That last bullet point about
uv
relates to the dependency groups pattern I
wrote about in a recent TIL
. I'm currently working through applying it to my other projects - the net result is that running the test suite is as simple as doing:
git clone https://github.com/simonw/llm
cd llm
uv run pytest
The new
dev
dependency group
defined in pyproject.toml
is automatically installed by
uv run
in a new virtual environment which means everything needed to run
pytest
is available without needing to add any extra commands.
Lawmakers Pave the Way to Billions in Handouts for Weapons Makers That the Pentagon Itself Opposed
Intercept
theintercept.com
2025-12-12 20:19:44
The pilot program, added to the military budget behind closed doors, upends an 80-year precedent against covering contractors’ interest payments.
For the better part
of a century, there was one thing even the U.S. government would not do to pad the profits of defense contractors.
Now, more than 80 years of precedent may be coming to an end.
On Thursday, lawmakers in the House approved a “pilot program” in the pending Pentagon budget bill that could eventually open the door to sending billions to big contractors, while providing what critics say would be little benefit to the military.
The provision, which appeared in the budget bill after a closed-door session overseen by top lawmakers, would allow contractors to claim reimbursement for the interest they pay on debt they take on to build weapons and other gadgets for the armed services.
“The fact that we are even exploring this question is a little crazy in terms of financial risk.”
The technical-sounding change has such serious implications for the budget that the Pentagon itself warned against it two years ago.
One big defense contractor alone, Lockheed Martin, reported having more than $17.8 billion in outstanding interest payments last year, said Julia Gledhill, an analyst at the nonprofit Stimson Center.
“The fact that we are even exploring this question is a little crazy in terms of financial risk for the government,” Gledhill said.
Gledhill said even some Capitol Hill staffers were “scandalized” to see the provision in the final bill, which will likely be approved by the Senate next week.
Pilot to Where?
For most companies, paying interest on a loan they take out from the bank is a cost of doing business. The pilot program buried in the budget bill, however, is one of many ways in which the federal government would give defense contractors special treatment.
Contractors can already receive reimbursements from the Defense Department for the
cost of research and development
. Under the terms of the legislation, they would also be allowed to receive reimbursements for “financing costs incurred for a covered activity.”
The legislation leaves it up to the Pentagon to design the program. While it’s billed as a pilot, there is no hard spending cap in the pending legislation. The total amount dedicated to the program would be determined by the House and Senate appropriations committees.
The bill tasks the Defense Department with releasing a report in February 2028 on how well the pilot program worked. As approved by Congress, however, the bill does not explain what metrics, if any, the Pentagon is supposed to use to evaluate the program.
“I don’t see any clear parameters for what success looks like,” Gledhill said. “Are there new entrants? Are we building weapons production capacity? Or are new entrants on the way?”
The chairs and ranking members of the House and Senate armed services committees who oversaw the closed-door conference process that produced the final draft of the National Defense Authorization Act did not respond to requests for comment.
In a document posted online, the committee leaders said that similar provisions were included in House and Senate drafts of the bill.
Big Spending at Stake
The switch to covering financing costs seems to be in line with a larger push this year to shake up the defense industry in light of lessons learned from Russia’s brutal war on Ukraine and fears of competition with China.
“The generous view of this provision is: Look, we have industrial capacity constraints and perhaps if we make borrowing essentially free, then maybe — big maybe — contractors will invest in capacity,” Gledhill said.
She is skeptical that will happen, and the Pentagon itself was dubious in a 2023 study conducted by the Office of the Under Secretary of Defense for Acquisition and Sustainment. The Pentagon found that policy change might even supercharge the phenomenon of big defense contractors
using taxpayer dollars for stock buybacks
instead of research and development.
“Higher interest rates or increased borrowing only increase Revenue and Profits further,” the report found. “This creates the real risk of a ‘moral hazard’ as it pertains to interest.”
The sums at stake are enormous. The “five primes” — the big defense contractors who claim the
lion’s share of Pentagon contracts
— each reported spending massive amounts of money on interest payments last year. The companies all disclose their debt loads in slightly different ways in their annual reports, but the scale is nonetheless massive in each case.
Lockheed Martin
said
it had $17.8 billion in outstanding interest payments.
RTX, formerly known as Raytheon,
said
it had $23.3 billion in future interest on long-term debt.
“I don’t think a single dollar should go toward interest payments for contractors.”
Northrop Grumman paid $475 million on interest payments in 2024, and General Dynamics, for its part, paid $385 million.
Meanwhile, Boeing said that it had $38.3 billion in long-term interest on debt. The company did not break down specifically how much of that debt related to its defense business, which accounted for 36.5 percent of its revenue in 2024.
Along with the “five primes,” Silicon Valley firms such as
Anduril
and Palantir are increasingly moving into defense contracting.
It’s unlikely that the contractors’ interest payments would ever be fully reimbursed by the Defense Department, Gledhill said, but even getting a fraction covered would amount to a huge giveaway.
She said, “I don’t think a single dollar should go toward interest payments for contractors.”
We built a complete VR setup from scratch to let rats play DOOM. The system includes a motion-tracked treadmill ball, a panoramic headset, an input trigger, and a reward circuit. All hardware and software components are open sourced, including 3D-printable designs, circuit diagrams, firmware, and control software.
The first version (v1) was built in New York by
Viktor
, who trained rats to walk through a corridor in DOOM using a simpler rig. That version was featured on Vice and PC Gamer. After moving back home, the project was paused. Public interest reignited development, leading to v2, a more advanced and modular version built in collaboration with electrical engineer
Sándor Makra
.
Akos Blaschek
later assisted significantly in documenting the project for open-sourcing, aiming to enable others to replicate and build upon this work. Key metallic components were designed and sourced in collaboration with
SZURWIN KFT
.
V1
Basic ball setup
Rats trained to run forward
Minimal sensors and mechanics
No panoramic screen
Rat VR Setup Version 1
V2
New ball driver mechanism for smoother movement
Foldable AMOLED screen with 180° horizontal and 80° vertical FOV, Full HD resolution
Upgraded sensors for movement tracking
Reinforced feeder system with mixing motor
Modular 3D-printable components
Improved electronics reliability and safety
Rat VR Setup Version 2
Full setup from side showing rat on ball, screen around, trigger, and water tube.
Limitations
We reached the point of rat habituation but didn’t start training. Our rats (Todd, Kojima, Gabe) aged out before full testing. The setup works, but behavioral validation is pending.
Hardware
The hardware is a comprehensive VR rig designed for rodents. It consists of a motion-tracked sphere that captures the rat's movements, a custom-built trigger for in-game actions, a curved panoramic screen for visual immersion, and an automated reward system that dispenses sugar water to reinforce behavior. All these components are mounted on a modular aluminum frame, creating a complete, self-contained environment for the rat to interact with the game.
The headset wraps around the rat’s head with a foldable AMOLED screen. It maximizes immersion without obstructing whisker space. The screen supports Full HD resolution.
The headset frame also integrates several sensory components: two small air nozzles are positioned near the left and right whiskers, capable of delivering targeted air puffs on command (e.g., signaling wall collisions in-game). The frame provides a secure mounting point for the reward system's dispenser tube, placing it near the rat's mouth. Additionally, the design includes placeholders for miniature speakers near each ear, intended for future implementation of stereo audio cues.
Headset close-up.
3D Model: Headset
Locomotion
Movement is captured via a free-spinning ball under the rat. Rotary sensors track displacement and convert it into game motion. The ball can also be driven by motors.
These motors are used during training to roll the ball and simulate movement paths before a reward. This guides the rat on where to go, helping form movement-action associations. Like the trigger, this allows for programmatic training sequences with minimal initial input from the animal.
Ball mount showing driven/undriven modes and sensor placement.
3D Model: Stand/Ball
Trigger Input
The shooting input is a custom-built hand-operated lever. Rats pull it with their paws to fire. The lever is held in place by small springs, encased in a 3D-printed housing. It includes a rotary encoder to detect motion and a stepper motor to actuate it.
The motor allows programmatic control—pulling the lever to demonstrate shooting. This enables training by pairing visual cues with mechanical motion, reinforcing the association before the rat initiates the action on its own.
Close-up of trigger lever with encoder and motor.
3D Model: Trigger
Reward System
Positive in-game actions trigger a liquid reward: sugar water delivered through a precise dispensing mechanism. The system consists of:
Mixer: Continuously stirs the sugar solution to maintain even concentration
Pump + Pressure Sensor: Keeps the line under constant pressure
Solenoid Valve: Magnetic valve that opens to release exact 10 µL doses
Dispenser: Positioned near the mouth for easy access
This setup ensures accurate, repeatable reward delivery with minimal delay. The reward is synchronized with game events to reinforce desired behaviors.
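As a hypothetical sketch of how a fixed dose can be metered this way: with the line held at constant pressure, opening the valve for a calibrated duration releases a known volume. The pin number and calibration constant below are illustrative only, not values from the actual rig.

import time
import RPi.GPIO as GPIO

VALVE_PIN = 17                  # assumed BCM pin driving the solenoid's MOSFET
SECONDS_PER_MICROLITRE = 0.004  # assumed calibration at the maintained line pressure

def dispense(volume_ul: float = 10.0) -> None:
    GPIO.output(VALVE_PIN, GPIO.HIGH)               # open the valve
    time.sleep(volume_ul * SECONDS_PER_MICROLITRE)  # hold it open for the dose
    GPIO.output(VALVE_PIN, GPIO.LOW)                # close the valve

GPIO.setmode(GPIO.BCM)
GPIO.setup(VALVE_PIN, GPIO.OUT, initial=GPIO.LOW)
dispense(10.0)   # one 10 uL reward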
The messy but functional reward circuit from behind.
Limitations
The current system assumes basic rat mobility and grooming behavior. Fine-tuning might be needed for rats of different sizes or temperaments. Trigger placement and reward tube flow may need calibration per subject.
Software
The setup is controlled through a modular Python system. The main entry point is
arena_scenario.py
, which runs the full control loop.
The system includes:
Motion capture: Reads movement from optical flow sensors mounted around the treadmill ball.
Locomotion control: Drives the ball motors to guide the rat during training.
Trigger input: Reads lever pulls, detects voluntary shooting actions.
Reward delivery: Dispenses precise 10 μL sugar water rewards via a controlled solenoid valve and maintains constant line pressure.
DOOM integration: Interfaces with a modified ViZDoom environment for real-time closed-loop behavior.
Training logic: Enforces demonstrations and delivers rewards based on game state and rat behavior.
The software runs on a PC and communicates with a Raspberry Pi via TCP sockets. The Pi handles real-time sensor reading, ball actuation, and reward control; the PC processes the sensor data, runs the game, and sends high-level commands to the Pi.
All major components—movement tracking, ball driving, trigger detection, and reward control—can be operated manually or in closed-loop mode. All control parameters (e.g., motor speeds, reward volumes) are set in Python code.
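For a sense of how that split can work, here is a minimal PC-side example that sends a high-level command to the Pi over a TCP socket. The JSON message shape, command name, and port are assumptions for illustration, not the project's actual wire protocol.

import json
import socket

def send_command(host: str, command: dict, port: int = 5000) -> dict:
    # Open a connection to the Pi, send one newline-delimited JSON command,
    # and read back a single JSON reply (e.g. an acknowledgement or sensor data).
    with socket.create_connection((host, port), timeout=1.0) as sock:
        sock.sendall(json.dumps(command).encode() + b"\n")
        reply = sock.makefile().readline()
    return json.loads(reply)

# Example: ask the Pi to drive the ball forward briefly during a training demonstration.
# send_command("raspberrypi.local", {"cmd": "drive_ball", "direction": "forward", "ms": 500})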
Limitations
There’s no in-built calibration suite. Users must validate sensor alignment and reward timing manually. Some microcontroller firmwares might require tuning based on hardware tolerances.
Results
The rats successfully learned to navigate the virtual environment and trigger the shooting mechanism. Habituation took approximately two weeks per rat. While advanced training wasn't completed due to time constraints, initial data showed promising engagement with the system.
Rat interacting with the VR setup.
Limitations
Full behavioral validation requires longer training periods. Cross-subject variability wasn't extensively studied. The impact of prolonged VR exposure on rat well-being needs further research.
What Now?
Interested in building your own animal VR setup? Feel free to reach out for guidance. We're also compiling a comprehensive
Rat VR Build Guide
.
At
YoloRun.Capital
, we invest in ambitious, boundary-pushing projects like this, even the beautifully impractical ones. Have a wild idea? Let's talk.
Team
Viktor Tóth
Gamer Rat Coach
Sándor Makra
Electrical Engineer
Ákos Blaschek
Documentation Lead
Three new stable kernels
Linux Weekly News
lwn.net
2025-12-12 19:45:30
Greg Kroah-Hartman has released the
6.18.1
,
6.17.12
, and
6.12.62
stable
kernels. Each contains important fixes; users of those kernels
are advised to upgrade.
No, founders are not adopting Bryan Johnson’s regimen to reverse aging. Quite the opposite: the average founder raising capital ages six months every year.[1]
I suspect founder age has been increasing steadily for three reasons. First, venture capital has shifted toward AI, which grew from roughly 10% to 60% of investment in just three years.[2]
AI founders skew older. Many AI labs are started by PhDs who spent extended periods in school & often come from industry, commercializing initiatives from major labs or hyperscalers.
Second, the shift toward B2B rewards experience. B2B founders benefit from established relationships with potential team members, design partners & expertise selling to enterprises. These networks take years to build.
Third, press coverage distorts perception. Media tends to spotlight younger founders pursuing product-led growth or consumer strategies. The
Cursor team
, fresh from MIT, captures the zeitgeist. But there are many founders who grow up within an industry & then go out to upend it.
Perhaps venture capitalists should start funding reverse aging programs. If this trend holds, the typical founder will be a decade older in 20 years.
Daring Fireball Weekly Sponsorships, End of Year and Q1 2026
Daring Fireball
daringfireball.net
2025-12-12 19:25:23
Weekly sponsorships have been the top source of revenue for Daring Fireball ever since I started selling them back in 2007. They’ve succeeded, I think, because they make everyone happy. They generate good money. There’s only one sponsor per week and the sponsors are always relevant to at least some ...
To schedule a sponsorship or for additional information,
email John Gruber
.
Week-long sponsorships are available for Daring Fireball. This is the only way to promote your product or service specifically to Daring Fireball’s audience of Mac nerds, designers, nitpickers, perfectionists, and connoisseurs of fine sarcasm.
What sponsors get:
A display ad in the sidebar on every page of the site, all week long.
A post from the sponsor will appear in the RSS feed at the start of the week. You, the sponsor, get to address Daring Fireball’s most dedicated readers directly.
At the end of the week, I’ll also post an item thanking and linking to the feed sponsor.
Sponsorship is exclusive. Only
one
sponsor per week.
Every so often, a wonderful thing happens: someone young enough to have missed out on using computers in the early 1990s is introduced to the Windows 3.1 "Hot Dog Stand" color scheme. Back in the day Windows was pretty plain looking out of the box, with grey windows and blue highlights as the default. A number of optional color palettes gave it a bit more pep, like the wine-tinged Bordeaux or the more sophisticated teal of Designer.
And then there was Hot Dog Stand, which more or less turned Windows into a carnival.
"The truly funny thing about this color scheme is that all the other Windows 3.1 color schemes are surprisingly rational, totally reasonable color schemes," tech blogger Jeff Atwood wrote
back in 2005
. "And then you get to 'Hot Dog Stand.' Which is
utterly insane
. … I have to think it was included as a joke."
(Image credit: Microsoft)
Did Windows 3.1 really ship with a garish color scheme that was
dared
into being? That was a story I needed to hear, so I went digging for the credits of the Microsoft employees who worked on the user interface back then and found my way to
Virginia Howlett
, who joined Microsoft in 1985 as the company's first interface designer, and worked there up through the launch of Windows 95.
Howlett also co-created the font
Verdana
, which is partially named after her daughter Ana and is up there with Helvetica as one of the most-used fonts of the last 30 years. But enough about her world-changing contributions to modern technology:
we're here to talk Hot Dog Stand.
"I confess that I'm surprised anyone cares about Windows 3.1 in late 2025! It was such a long time ago and the world has changed so much," Howlett told me when I reached out over email. She confirmed that she and a "small team of designers" created Windows 3.1's themes, which were a "radically new" feature at the time—prior to its release, you couldn't customize different parts of the OS, like the backgrounds and title bars of windows, with different colors.
Publicity photo from her early years at Microsoft
(Image credit: Virginia Howlett)
I asked if the designers at Microsoft really had included Hot Dog Stand as a joke, or if it was inspired by a particular stand they frequented near the corporate campus (hey, it was a longshot, but you never know). I'll let Virginia tell the rest of the story:
As I recall there were 16 colors: white, black, gray, RGB, CMY, and the dark versions of those colors—so dark red, dark green, dark blue, dark cyan, dark magenta, dark yellow, dark gray. (Normal people might call some of these colors teal, navy, burgundy, etc.) Much of the user interface was black lines on a white background and used 2 shades of gray to create 3-D buttons: 'affordances.'
We designed a long list of themes using those 16 colors. No one today seems interested in 'Bordeaux' or 'Tweed' or 'Arizona.' We were covering all the bases, hoping to come up with color schemes that would appeal to a broad range of people. 'Hot Dog Stand' used bright yellow and red.
I have been mystified about why that particular theme causes so much comment in the media. Maybe it's partly the catchy name. (Never underestimate the power of a good brand name!)
I do remember some discussion about whether we should include it, and some snarky laughter. But it was not intended as a joke. It was not inspired by any hot dog stands, and it was not included as an example of a bad interface—although it was one. It was just a garish choice, in case somebody out there liked ugly bright red and yellow.
The 'Fluorescent' theme was also pretty ugly, but it didn't have a catchy name, so I've never heard anything about it.
I'm really glad that 'Hot Dog Stand' has entertained so many people for so many years.
With regards to design historians everywhere,
Virginia Howlett
As delightfully garish as Hot Dog Stand is, Howlett is right that it's far from the only eye searing theme in the Windows 3.1 collection. Check out Fluorescent and Plasma Power Saver:
(Image credit: Microsoft)
You can play around with Windows 3.1 in your browser thanks to the emulator
PCjs Machines
; if you get really into it, you can even customize every color yourself instead of relying on one of the preset themes.
So that's that: Hot Dog Stand may have inadvertently served as a warning to aspiring theme customizers that madness was just a few overzealous color choices away, but that wasn't its original intent. It wasn't included on the floppy disks as a dare, or a joke—it just happened to end up one of the funniest and most memorable relics of Windows history.
White House Refuses to Rule Out Summary Executions of People on Its Secret Domestic Terrorist List
Intercept
theintercept.com
2025-12-12 19:02:03
The Trump administration ignored questions about whether it would order the killings of those on its NSPM-7 list — even while answering our other queries.
President Donald Trump
has shattered the limits of executive authority by ordering the summary executions of individuals he deems members of designated terrorist organizations. He has also tested the bounds of his presidential powers by creating a
secret list
of domestic terrorist organizations, established under National Security Presidential Memorandum 7, or
NSPM-7
.
Are Americans that the federal government deems to be members of domestic terrorist organizations subject to extrajudicial killings like those it claims are members of designated terrorist organizations? The White House, Justice Department, and Department of War have, for more than a month, failed to answer this question.
Lawmakers and other government officials tell The Intercept that the pregnant silence by the Trump administration has become especially worrisome as the death toll mounts from attacks on alleged members of “designated terrorist organizations” in the Caribbean Sea and Pacific Ocean, and as Trump himself makes ever more unhinged threats to
imprison
or
execute
his political adversaries.
“The Trump Administration is trying to justify blowing small boats out of the water by arbitrarily calling them ‘designated terrorist organizations’ — a label not grounded in U.S. statute nor international law, but in solely what Trump says,” Sen. Tammy Duckworth, D-Ill., told The Intercept. “If Trump is using this justification to use military force on any individuals he chooses — without verified evidence or legal authorization — what’s stopping him from designating anyone within our own borders in a similar fashion and conducting lethal, militarized attacks against them? This illegal and dangerous misuse of lethal force should worry all Americans, and it can’t be accepted as normal.”
For almost a quarter century, the United States has been killing people — including
American citizens
,
on occasion
— around the world with drone strikes. Beginning as post-9/11 counterterrorism operations, these targeted killings in
Afghanistan
, Iraq, Somalia, Yemen, and other nations relied on a flimsy legal rationale that consistently
eroded respect for international law
. Details of these operations were
kept secret
from the American people, and
civilian casualties
were
ignored, denied, and covered up
. The recent attacks on alleged drug boats lack even the rickety legal rationale of the drone wars, sparking fear that there is little to stop the U.S. government from taking the unprecedented step of military action against those it deems terrorists within the nation’s borders.
The military has carried out 22 known attacks in the Caribbean Sea and eastern Pacific Ocean since September, killing at least
87 civilians
. Last week, footage of the September 2
double-tap strike
shown to
select
members of Congress
ignited a firestorm
. Trump announced, on camera, that he had “
no problem
” with releasing the video of the attack. This week, he
denied
ever saying it, in another example of his increasingly unbalanced behavior.
“The public deserves to know how our government is justifying the cold-blooded murder of civilians as lawful and why it believes it can hand out get-out-of-jail-free cards to people committing these crimes,” said Jeffrey Stein, staff attorney with the American Civil Liberties Union’s National Security Project, on Tuesday, as the ACLU, the Center for Constitutional Rights, and the New York Civil Liberties Union filed a federal lawsuit for the immediate release of a classified
Justice Department’s opinion
and other documents related to the attacks on boats. “The Trump administration must stop these illegal and immoral strikes, and officials who have carried them out must be held accountable.”
Since October,
The Intercept
has been
asking if the White House
would rule out conducting summary executions of members of the list “of any such groups or entities” designated as “domestic terrorist organization[s]” under NSPM-7, without a response. Similar questions posed to the Justice and War departments have also been repeatedly ignored, despite both departments offering replies to myriad other queries. The Justice Department responded with a statement that did not answer the question. “Political violence has no place in this country, and this Department of Justice will investigate, identify, and root out any individual or violent extremist group attempting to commit or promote this heinous activity,” a spokesperson told The Intercept.
“The Trump administration should answer all questions about the terrorist lists,” Rep. Ro Khanna, D-Calif., told The Intercept. “The American people have a right to answers about who is on them and what that means for all of us.”
Rebecca Ingber, a former State Department lawyer, notes that while the designated terrorist organization label as a targeting authority is “entirely manufactured,” the administration is relying on it to summarily execute people in the boat strikes, making their application of the terrorist label on the domestic front especially concerning. “Many of us have warned that there seems to be no legal limiting principle to the Administration’s claims of authority to use force and to kill people,” Ingber, now a law professor at Cardozo Law School in New York, told The Intercept. “This is one of the many reasons it is so important that Congress push back on the President’s claim that he can simply label transporting drugs an armed attack on the United States and then claim the authority to summarily execute people on that basis.”
Last month, members of Congress spoke up against Trump’s increasingly authoritarian measures when a group of Democratic lawmakers posted a video on social media in which they reminded military personnel that they are required to disobey illegal orders. This led to a Trump tirade that made the White House’s failure to dismiss the possibility of summary executions of Americans even more worrisome.
“This is really bad,” the president
wrote
on Truth Social, “and Dangerous to our Country. Their words cannot be allowed to stand. SEDITIOUS BEHAVIOR FROM TRAITORS!!! LOCK THEM UP???” A follow-up
post
read: “SEDITIOUS BEHAVIOR, punishable by DEATH!” Trump also
reposted
a comment that said: “HANG THEM GEORGE WASHINGTON WOULD !!”
“What’s most telling is that the President considers it punishable by death for us to restate the law,” the six lawmakers — Sens. Elissa Slotkin, Mark Kelly, and Reps. Jason Crow, Chris Deluzio, Maggie Goodlander, and
Chrissy Houlahan
— all of them former members of the armed forces or the intelligence community —
replied in a joint statement
. “Every American must unite and condemn the President’s calls for our murder and political violence.” Trump
later claimed
he did not call for the lawmakers’ executions.
White House spokesperson Taylor Rogers failed to answer questions about Trump’s history of threatening to kill people and his recent unhinged behavior.
As Trump lobs
threats at political foes and his administration seeks to put convicted and supposed criminals to death at home and abroad, NSPM-7 directs hundreds of thousands of federal officials to target U.S. progressive groups and their donors as well as political activists who profess undefined anti-American, antifascist, or anti-Christian sentiments. The memorandum harkens back to past government enemies lists and efforts that led to massive overreach and illegal acts of repression to
stifle dissent
. That includes the
House Un-American Activities Committee
, which began in the 1940s, the FBI’s secret Counter Intelligence Program, or COINTELPRO, which began in the 1950s, and the Patriot Act, enacted in the wake of 9/11, which led to abuses of Black, brown, and
Muslim communities
, along with racial, social, environmental, animal rights, and other social justice activists and groups.
“NSPM-7 is a greater infringement on freedoms than the Patriot Act.”
“Trump’s NSPM-7 represses freedom of speech and association. Investigating any organization with anti-capitalism or anti-American views
is
anti-American. NSPM-7 is a greater infringement on freedoms than the Patriot Act,” said Khanna. “We’re seeing the greatest erosion of civil liberties and human rights in our modern history.”
NSPM-7 directs Attorney General Pam Bondi to compile a list “of any such groups or entities” to be designated as “domestic terrorist organization[s]” and Bondi has ordered the FBI to “compile a list of groups or entities engaging in acts that may constitute domestic terrorism,” according to a Justice Department
memo disclosed
by reporter Ken Klippenstein on Saturday. The department also shared the December 4 memo, “Implementing National Security Presidential Memorandum-7: Countering Domestic Terrorism and Organized Political Violence,” with The Intercept.
The Justice Department memo notes that under Section 3 of NSPM-7, “the FBI, in coordination with its partners on the [Joint Terrorism Task Forces], and consistent with applicable law, shall compile a list of groups or entities engaged in acts that may constitute domestic terrorism” and “provide that list to the Deputy Attorney General.” (The FBI’s Joint Terrorism Task Forces are located in each of the FBI’s 56 field offices and specifically “support President Trump’s executive orders,” according to a
top FBI official
.)
The Justice Department memorandum offers a fictitious apocalyptic vision of urban America which the Trump administration has previously employed to justify its
military occupations
, including “mass rioting and destruction in our cities, violent efforts to shut down immigration enforcement, [and] targeting of public officials or other political actors.” While Trump has even falsely claimed, for example, that members of the Venezuelan gang Tren de Aragua have engaged in
hand-to-hand combat
with
U.S. troops
on the streets of Washington, D.C., state attorneys general have repeatedly and successfully argued that troop deployments in Chicago, Los Angeles, and Portland, Oregon, were illegal because Trump administration claims of rampant civil unrest
were found
to be
overblown or fictional
.
The December 4 Justice Department memo also claims that “certain Antifa-aligned extremists” profess “extreme viewpoints on immigration, radical gender ideology, and anti-American sentiment” and “a willingness to use violence against law-abiding citizenry to serve those beliefs.” Over the last decade, Republicans have frequently
blamed
antifa for violence and used it as an omnibus term for
left-wing activists
, as if it were an organization with members and a command structure.
In September, Trump signed an executive order
designating antifa
as a “domestic terror organization,” despite the fact that it is essentially a
decentralized
, leftist ideology — a collection of related ideas and political concepts much like
feminism
or environmentalism.
Last month, the State Department
designated
four European groups — Antifa Ost, based in Germany; Informal Anarchist Federation/International Revolutionary Front, a mostly Italian group; and Armed Proletarian Justice and Revolutionary Class Self-Defense, both Greek organizations — as “
foreign terrorist organizations
” because of their alleged threats and attacks against political and economic institutions in Europe. The State Department
announced
that the FTO designation specifically supports NSPM-7. The Treasury Department’s Office of Foreign Assets Control also
designated
the groups as “specially designated nationals.”
Michael Glasheen, a longtime FBI agent serving as operations director of the bureau’s national security branch, was
flummoxed by questions about antifa
while testifying on Thursday before the House Committee on Homeland Security. He said antifa was the “
most immediate violent threat
” facing the United States, but could not answer basic details about the movement, including its size or where it is headquartered. The FBI, Glasheen said, has conducted more than 1,700 domestic terrorism investigations this year, including “
approximately 70 antifa investigations
,” and logged a 171 percent increase in arrests. He also
drew attention
to a “concerning uptick in the radicalization of our nation’s young people,” specifically “those who may be motivated to commit violence and other criminal acts to further social or political objectives stemming from domestic influences.”
Last month, a federal grand jury in Fort Worth, Texas, indicted nine alleged “
North Texas Antifa Cell operatives
” — one of them a former
Marine Corps reservist
— on multiple charges, including attempted murder, stemming from a
shooting
during a July 4 protest at the ICE Prairieland Detention Center in Alvarado in which a local police officer was injured. The Justice Department claims that the North Texas Antifa Cell is “part of a larger militant enterprise made up of networks of individuals and small groups primarily ascribing to an ideology that explicitly calls for the overthrow of the United States Government, law enforcement authorities, and the system of law.”
The December 4 Justice Department memo states that within 60 days, the FBI “shall disseminate an intelligence bulletin on Antifa and Antifa-aligned anarchist violent extremist groups,” including their “organizations’ structures, funding sources, and tactics so that law enforcement partners can effectively investigate and policy makers can effectively understand the nature and gravity of the threat posed by these extremist groups.”
The memo calls for bounties and a network of informants.
The memo also calls for bounties and a network of
informants
. The “FBI shall establish a cash reward system for information that leads to the successful identification and arrest of individuals in the leadership of domestic terrorist organizations,” reads the document, noting that the bureau also aims to “establish cooperators to provide information and eventually testify against other members and leadership of domestic terrorist organizations.”
Neither NSPM-7 nor the December 4 memo mentions summary executions, and both speak explicitly in terms of “prosecution” and “arrest” of members of domestic terrorist organizations. Attacks on members of designated terrorist organizations are justified by another document — a
classified opinion
from the Justice Department’s Office of Legal Counsel — that claims that narcotics on supposed drug boats are lawful military targets because their cargo generates revenue for cartels whom the Trump administration claims are in armed conflict with the United States. Attached to that secret memo is a similarly
secret list
of designated terrorist organizations.
The December 4 memorandum directs Justice Department prosecutors to focus on specific federal crimes highlighted in NSPM-7 and flags more than 25 federal charges including crimes that may be capital offenses under specific, aggravating circumstances, such as killing or attempting to kill a federal officer and murder for hire.
“The administration is creating new categories of organizations outside of the law, creating immense uncertainty about who and what they intend to target and how,” Faiza Patel, the senior director of the Brennan Center for Justice’s Liberty and National Security Program, told The Intercept, drawing attention to the administration’s invented term: designated terrorist organizations. “But drug trafficking is not war, and these actions are patently illegal in the absence of Congressional authorization,” she added. “At the same time, National Security Presidential Memorandum 7 is aimed at ‘domestic terrorist organizations’ — another term that has no basis in U.S. law. It is designed to ramp up law enforcement scrutiny of groups espousing a broad swath of First Amendment-protected beliefs from anti-Christianity to anti-Americanism. NSPM-7 does not in any way, shape, or form authorize military strikes and using it for that would be plainly unlawful.”
Benn Jordan's flock camera jammer will send you to jail in Florida now [video]
Special Dyslexia Fonts Are Based on Voodoo Pseudoscience
Daring Fireball
www.edutopia.org
2025-12-12 18:37:47
Youki Terada, writing for Edutopia in 2022 (via Jens Kutílek):
In 1927, Samuel Orton, a neuropsychiatrist, observed that many of his young patients with reading difficulties reversed similar letters, confusing
d
for
b
, for example. Concluding that the condition
was caused by “directional confusion
,” he coined the term
strephosymbolia
, meaning “twisted symbol.” The characterization, but not the coinage, stuck—and fueled early speculation that what came to be known as dyslexia was a visual disorder that caused printed letters to appear as a confusing, jumbled mess.
Since then, a cottage industry of dyslexia-focused products has emerged, hawking everything from prisms to tinted glasses and transparent color overlays. One website catering to dyslexic readers—whose tagline promises to solve “complicated problems with a simple solution”—sells prism glasses, offering up a slew of
testimonials
touting the product’s benefits. “My reading has improved from 4th grade to college level,” exclaims one satisfied wearer.
In the last decade, another contender—typographic fonts designed to alleviate the reading difficulties associated with dyslexia—has entered the popular discourse. The simple, classroom-friendly intervention claims to improve the speed and accuracy of dyslexic readers by adjusting the size and shape of fonts, adding thicker lines to help students distinguish between similar letters. The designers of the fonts claim that the “heaviness” of the letters, for example, prevents them from flipping upside-down or left-to-right, while the arms—the top of a
b
or
d
, for example—have varying thicknesses to reduce possible confusion.
According to the
Yale Center for Dyslexia and Creativity
, dyslexia is the most common learning disability, affecting one in five children. Students with dyslexia often struggle to read, prompting teachers to search far and wide for helpful remedies. The market for solutions is large and alluring.
But the new fonts—and the odd assortment of paraphernalia that came before them—assume that dyslexia is a visual problem rooted in imprecise letter recognition. That’s a myth, explains Joanne Pierson, a speech-language pathologist at the University of Michigan. “Contrary to popular belief, the core problem in dyslexia is not reversing letters (although it can be an indicator),” she
writes
. The difficulty lies in identifying the discrete units of sound that make up words and “matching those individual sounds to the letters and combinations of letters in order to read and spell.”
In other words, dyslexia is a language-based processing difference, not a vision problem, despite the popular and enduring misconceptions. “Even when carefully explained, soundly discredited, or decisively dispatched, these and similar dyslexia myths and their vision-based suppositions seem to rise from the dead—like the villain-who-just-won’t-die trope in a B movie,” the International Dyslexia Association
forcefully asserts
.
Dyslexia Fonts, Under the Microscope
Under close scrutiny, the evidence for dyslexia-friendly fonts falls apart. In a
2017 study
, for example, researchers tested whether OpenDyslexic, a popular font with thicker lines near the bottom of the letters, could improve the reading rate and accuracy for young children with dyslexia. According to the developers of the font, which is open-source and free of charge, the “heaviness” of the letters prevented them from turning upside down for readers with dyslexia, which they claimed would improve reading accuracy and speed.
Shelley Adams
OpenDyslexic features heavier lines that are meant to increase readability for readers with dyslexia—but rigorous research suggests that other mainstream fonts may be more effective.
Researchers put the font to the test, comparing it with two other popular fonts designed for legibility—Arial and Times New Roman—and discovered that the purportedly dyslexia-friendly font actually reduced reading speed and accuracy. In addition, none of the students preferred to read material in OpenDyslexic, a surprising rebuke for a font specifically designed for the task.
In a separate
2018 study
, researchers compared another popular dyslexia font—Dyslexie, which charges a fee for usage—with Arial and Times New Roman and found no benefit to reading accuracy and speed. As with the previous dyslexia font, children expressed a preference for the mainstream fonts. “All in all, the font Dyslexie, developed to facilitate the reading of dyslexic people, does not have the desired effect,” the researchers concluded. “Children with dyslexia do not read better when text is printed in the font Dyslexie than when text is printed in Arial or Times New Roman.”
“I don’t necessarily think teachers need to go and get a special font,” says Julie Rawe, a member of W3C’s Cognitive and Learning Disabilities Task Force and a reading and disability expert at
Understood
. “So far, the research doesn’t really have a lot of evidence showing that these special fonts help kids or adults with dyslexia to read faster or make fewer mistakes.”
Giving False Hope
Dyslexia fonts may also give students false hope—and result in disappointment, the researchers of the 2017 study warn. “The most harm may come when students who have already experienced significant struggle and academic failures related to learning to read have yet another experience with failure when they are not able to read significantly better in a font designed to do so,” they caution.
That’s because children with dyslexia often have to deal with the stigma of being behind their peers, and they may conclude that they’re not smart enough to master the materials, according to a
2010 study
. If a child is told that a dyslexia font can help them read, but it doesn’t actually improve their grades or their reading experience, they may assume that the problem lies with their own inability—not with the font.
Legible Fonts and Evidence-Based Instruction
Fonts do matter, experts at the
British Dyslexia Association
explain, but only because they matter for all readers: “Adopting best practice for dyslexic readers has the advantage of making all written communication easier on the eye for everyone.” They recommend fonts designed for general legibility, like Arial, Verdana, and Tahoma. For better reading outcomes, font size should be between 12 and 14 points, and section headings should be used to create a consistent structure within your documents, easing navigation and supporting better sense-making.
Of course, typography is just one small part of the puzzle. Most children with dyslexia can learn to read—but it takes considerably more time and effort than for their peers, according to the
Yale Center for Dyslexia and Creativity
. Reading instruction should be “evidence-based, systematic, and delivered in a small group setting,” they say, and should include explicit instruction in phonemic awareness and phonics, with many opportunities to practice reading skills in a supportive environment. The
International Dyslexia Association
recommends a “multisensory, structured language approach” that systematically integrates several senses (hearing, seeing, touching) while the child is learning to read.
Classroom accommodations
such as audiobooks, note-taking apps, video recordings of assignment instructions, and text-to-speech software can help students with dyslexia feel supported and accepted, explains former literacy teacher Jessica Hamman. Tasks that appear simple to most students may take extra time for those with dyslexia, so it’s important to provide tools “that take into account their unique processing challenges and allow them to demonstrate their content understanding and access the curriculum with more ease,” she says.
The Takeaway
On scores of reading speed and accuracy, dyslexia fonts perform no better than common fonts like Arial and Times New Roman, and sometimes they perform worse, according to recent studies. Even using dyslexia fonts with neutral effects can raise false hopes in struggling young readers, contributing to feelings of helplessness and discouragement.
I’m at the point where I’m migrating all my projects to
uv
, and new Python projects don’t use any other package manager.
I finally got around to migrating
this site
to use it, using the very handy
migrate-to-uv
tool. So it’s time to update my recommended Python setup.
My
old Python setup
was very much built around the complications of managing environments and dependencies, and the conflicting set of tools to deal with those two problems. There are still a few places where I’ll use
pipx
, but otherwise everything is on
uv
.
This guide is still aimed at a recent Apple or Linux computer, or
WSL
if you’re on Windows. I’m writing this on a MacBook Pro with an M2 chip, if that matters to you.
You probably don’t need both
uv
and
pipx
. I have a bunch of existing tools I installed with
pipx
, and those work fine, so I haven’t migrated them to
uvx
.
There is one set of tools that stays on
pipx
, though:
Datasette
and its
SQLite toolchain
.
Simon Willison
built those to install their own plugins, using
datasette install <plugin>
or
llm install <plugin>
. Those use
pip
internally and sometimes
uv
can cause problems upgrading, so I’ve kept them on
pipx
.
Installing the right Python
Use
uv
and nothing else for this. Run
uv python list
to see what’s already installed or otherwise available. If you’re not using
pipx
, it’s fine to just let
uv
install the right version of Python for each project.
If you want a specific version of Python installed globally, use
uv python install <version>
. The
docs
are good.
For
pipx
, stick to my instructions from a couple years ago:
pip install --user pipx   # install it
pipx ensurepath           # make sure your system can find it
That’s assuming your system already comes with a version of Python and
pip
installed. If not, try
Homebrew
. Maybe it’s better now, especially with
uv
managing everything else.
Virtual environments and local dependencies
Everything is now part of
uv
. Run
uv init
to create a project,
uv add
for each dependency and
uv sync
to install everything from an existing project.
Use
uv run
to run scripts inside the virtual environment that
uv
creates.
This is easier now
I never managed to write the post about why Python’s setup is so hard. It ultimately comes down to dependencies, both libraries and Python itself. For the most part,
uv
has made this a non-issue. It’s also significantly faster than the tools it replaced, which means I can iterate faster and don’t lose focus waiting for dependencies to download and install.
Now, to migrate more projects …
Coupang data breach traced to ex-employee who retained system access
Bleeping Computer
www.bleepingcomputer.com
2025-12-12 18:28:30
A data breach at Coupang that exposed the information of 33.7 million customers has been tied to a former employee who retained access to internal systems after leaving the company.
This was shared by the Seoul Metropolitan Police Agency with local news outlets, following an investigation that included a raid on the firm's offices earlier this week.
Coupang is South Korea's largest online retailer, employing 95,000 people and generating annual revenue of over $30 billion.
On December 1, 2025, the company announced that it had suffered a data breach that exposed the personal data of
33.7 million customers
, including names, email addresses, physical addresses, and order information.
The breach occurred on June 24, 2025, but Coupang only discovered it on November 18, when it also launched an internal investigation.
On December 6, Coupang
published an update
on the incident, assuring its customers that the stolen information had not been leaked anywhere online.
Despite these assurances and the company's claimed full collaboration with the authorities, the police
raided the company's offices
on Tuesday to collect evidence for an independent investigation.
On Wednesday, the company's CEO, Park Dae-Jun,
announced his resignation
and apologized to the public for failing to stop what is the country's worst cybersecurity breach in history.
As the police continued their investigations in Coupang's offices for a second day, they uncovered that the primary suspect was a 43-year-old Chinese national who was
a former employee
of the retail giant.
According to
JoongAng
, the man, who joined Coupang in November 2022, was assigned to an authentication management system and left the firm in 2024. He is believed to have already left the country.
The Korean news outlet reports that the police were still at Coupang's offices yesterday, gathering records such as internal documents, logs, system records, IP addresses, user credentials, and access histories that could help explain how the rogue former employee gained access to the corporate systems.
Police transporting seized documents out of Coupang's office
Source: Korea JoongAng Daily
The police have stated that, while Coupang is treated as the victim, if negligence or other legal violations are found, the company and employees responsible for protecting customer data may be deemed liable.
In the meantime, the incident has sparked high-volume phishing activity in the country, affecting roughly two-thirds of its population, and the police have received hundreds of reports of Coupang impersonation since the start of the month.
Home Depot GitHub token exposed for a year, granted access to internal systems
A security researcher said Home Depot exposed access to its internal systems for a year after one of its employees published a private access token online, likely by mistake. The researcher found the exposed token and tried to privately alert Home Depot to its security lapse but was ignored for several weeks.
The exposure is now fixed after TechCrunch contacted company representatives last week.
Security researcher Ben Zimmermann told TechCrunch that, in early November, he found a published GitHub access token belonging to a Home Depot employee, which was exposed sometime in early 2024.
When he tested the token, Zimmermann said that it granted access to hundreds of private Home Depot source code repositories hosted on GitHub, and gave him the ability to modify their contents.
The researcher said the keys allowed access to Home Depot’s cloud infrastructure, including its order fulfillment and inventory management systems, and code development pipelines, among other systems. Home Depot has hosted much of its developer and engineering infrastructure on GitHub since 2015, according to a
customer profile on GitHub’s website
.
Zimmermann said he sent several emails to Home Depot but didn’t hear back.
Nor did he get a response from Home Depot’s chief information security officer, Chris Lanzilotta, after sending a message over LinkedIn.
Zimmermann told TechCrunch that he has disclosed several similar exposures in recent months to companies, which have thanked him for his findings.
“Home Depot is the only company that ignored me,” he said.
Given that Home Depot does not have a way to report security flaws, such as a vulnerability disclosure or bug bounty program, Zimmermann contacted TechCrunch in an effort to get the exposure fixed.
When reached by TechCrunch on December 5, Home Depot spokesperson George Lane acknowledged receipt of our email but did not respond to follow-up emails asking for comment. The exposed token is no longer online, and the researcher said the token’s access was revoked soon after our outreach.
We also asked Lane if Home Depot has the technical means, such as logs, to determine if anyone else used the token during the months it was left online to access any of Home Depot’s internal systems. We did not hear back.
Zack Whittaker is the security editor at TechCrunch. He also authors the weekly cybersecurity newsletter,
this week in security
.
He can be reached via encrypted message at zackwhittaker.1337 on Signal. You can also contact him by email, or to verify outreach, at
zack.whittaker@techcrunch.com
.
Crackpots ranging from billionaire Peter Thiel to random YouTube influencers claim that science has been stagnating for the past 50 years. They admit that computing is an exception: they don’t pretend that my personal 32GB laptop is not an advance over the 16MB mainframe that served the whole Caltech community when I was there. Instead they claim that advances in computing were driven solely by industrial research, quite overlooking the role of academia
and government funding
in pushing the VLSI revolution, RISC processor design, networking, hypertext, virtual memory and indeed computers themselves. As for the industrial research,
most of it came from just two “blue sky” institutes –
Bell Labs
and
Xerox PARC
– that closed a long time ago.
LCF-style proof assistants are a world away from mainstream computing,
so let’s look at 50 years of progress there.
1975–1985: Edinburgh LCF
The first instance of LCF was Stanford LCF, developed by Robin Milner in 1972, but it was
not
an LCF-style proof assistant! LCF meant “Logic for Computable Functions”, a quirky formalism based on Scott domains and intended for reasoning about small functional programs. But “LCF-style proof assistant” means one that, like Edinburgh LCF, was coded in some form of
the ML programming language and provided a proof kernel,
encapsulated in an abstract type definition, to ensure that a theorem could only be generated
by applying inference rules to axioms or other theorems:
… the ML type discipline is used… so that—whatever complex procedures are defined—all values of type
thm
must be theorems, as only inferences can compute such values…. This security releases us from the need to preserve whole proofs… — an important practical gain since large proofs tended to clog up the working space… [
Edinburgh LCF
, page IV]
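The idea translates to any language with some form of abstraction, though only ML’s abstract types enforce it statically. The following is a minimal sketch in Python, purely for illustration (it is not Edinburgh LCF’s ML, and the rule names are invented): client code can only obtain Theorem values by going through the inference-rule functions, so every Theorem is the result of a derivation.

# Minimal, illustrative sketch of the LCF kernel idea (not Edinburgh LCF).
# In LCF the ML type system guarantees that every value of type thm was
# produced by an inference rule; here the guarantee is only by convention.

class Theorem:
    """A proved formula. Obtain instances only via the inference rules below."""

    _token = object()  # shared only with the inference-rule functions below

    def __init__(self, token, conclusion):
        if token is not Theorem._token:
            raise ValueError("theorems can only be built by inference rules")
        self.conclusion = conclusion

    def __repr__(self):
        return f"|- {self.conclusion!r}"


def axiom_refl(term):
    # Axiom scheme: |- term == term
    return Theorem(Theorem._token, ("eq", term, term))


def modus_ponens(th_imp, th_ant):
    # From |- P -> Q and |- P, conclude |- Q
    kind, p, q = th_imp.conclusion
    assert kind == "imp" and p == th_ant.conclusion
    return Theorem(Theorem._token, q)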
Edinburgh LCF was first announced in 1975, which conveniently is exactly 50 years ago,
at the almost mythical conference on
Proving and Improving Programs
held at Arc-et-Senans.
The
user manual
, published in the Springer lecture notes series, came out in 1979.
Edinburgh LCF introduced some other principles that people still adhere to today:
inference rules in the
natural deduction
style, with a dynamic set of assumptions
a
goal-directed
proof style, where you start with the theorem statement and work backwards
a structured system of
theories
to organise groups of definitions
Edinburgh LCF had its own version of the ML language.
It supported a fragment of first-order logic containing
the logical symbols $\forall$, $\land$ and $\to$ along with
the relation symbols $\equiv$ and $\sqsubseteq$.
It introduced proof tactics and also
tacticals
:
operators for combining tactics.
Tactics supported goal-directed proof,
but Edinburgh LCF had no notion of the current goal or anything to help the user manage the tree of subgoals.
Its user interface was simply the ML top level and the various theorem-proving primitives were simply ML functions.
ML stood for
metalanguage
, since managing the process of proof was its exact job.
Avra Cohn and Robin Milner wrote a
report
on proving the correctness of a parsing algorithm
using Edinburgh LCF.
The proof consists of one single induction followed by
a little simplification and other reasoning.
The report includes a succinct description of Edinburgh LCF and
is a nice snapshot of the state of the art in 1982
when I came to Cambridge to join a project run by Robin Milner and Mike Gordon.
Full of youthful enthusiasm, I told Mike that it would be great
if one day we could formalise the Prime Number Theorem.
I hardly knew what the theorem was about or how to prove it,
but my college roommate had told me it was really deep.
Disappointed to discover that we only had $\forall$, $\land$ and $\to$,
I set out to fix that, to support full first-order logic.
I ended up changing so much
(backwards compatibility is overrated) that people eventually shamed me into writing my own
user manual
.
Cambridge LCF never caught on because, well,
nobody liked the LCF formalism.
But I used it for a development that seemed big at the time: to
verify the unification algorithm
.
This development was later
ported to Isabelle
.
It contains 36 inductions, so we were making progress.
And this takes us to 1985, exactly 40 years ago;
see also
this survey
of the state of play.
But there was almost no mathematics: no negative numbers and no decimal notation, so you could not even write 2+2=4.
As far as the broader computer science community was concerned, we were a joke.
1985–1995: Cambridge LCF and HOL
Cambridge LCF was in itself a dead end, but because it included a much faster ML compiler,
it ended up
being incorporated
into a lot of other proof assistants, notably Mike’s
HOL88
.
And just like that,
hardware verification
became a reality.
Although software verification seemed stuck in the doldrums,
a couple of production-ready chip designs were verified!
Mike’s explanation was that hardware verification was simply easier.
Also in 1985, we got a new
standard for the ML language
and, soon, two compilers for it.
So then I started working on experiments that would
lead to Isabelle
.
It would be like LCF but would support constructive type theory,
crucially allowing both unification and backtracking, like in Prolog.
But there was no working system yet, just a grant application.
And that was the state of play 40 years ago.
Funding secured, Isabelle development started in earnest in 1986.
It was coded in
Standard ML
from the start, while HOL88 was ported from the Cambridge LCF version of ML
to Standard ML, emerging as HOL90.
Mike acquired a bevy of energetic PhD students,
who engaged in verification projects or built extensions for HOL.
Versions of HOL were being used in institutes around the world.
Stepping aside from HOL for a moment, other proof assistants had made great progress
by the mid 1990s.
The addition of inductive definitions to the calculus of constructions
gave us the
calculus of inductive constructions
,
which in essence is the formalism used today by Rocq and Lean.
The very first release of Isabelle/HOL
happened in 1991
,
primarily the work of Tobias Nipkow, though I was soon to
join in
.
Isabelle/ZF, which was my pet project, formalised axiomatic set theory
to some
quite deep results
.
But I am still not certain whether negative numbers were supported (can somebody help me?).
Our weak support for arithmetic may seem odd
when our research community was aware that the real numbers
had been
formalised in AUTOMATH
,
but we didn’t seem to want them.
To many, we were still a joke. This was about to change.
1995–2005: Proof assistants come of age
In 1994 came the Pentium with its
FDIV bug
:
a probably insignificant but detectable error in floating-point division.
The subsequent product recall cost Intel nearly half a billion dollars.
John Harrison, a student of Mike’s, decided to devote his PhD research
to the verification of floating-point arithmetic.
By June 1996 he had submitted an extraordinary
thesis
,
Theorem Proving with the Real Numbers
,
which described a formidable series of achievements:
a formalisation of the real number system in HOL
formalised analysis including metric spaces, sequences and series, limits, continuity and differentiation, power series and transcendental functions, integration
proper numerals represented internally by symbolic binary, and calculations on them
computer algebra techniques including a decision procedure for real algebra
tools and techniques for floating-point verification by reference to the IEEE standard
This thesis, which I had the privilege to examine, won a Distinguished Dissertation Award
and was
published as a book
by Springer.
So by the middle of the 1990s, which was 30 years ago,
we had gone from almost no arithmetic to a decent chunk of formalised real analysis
that was good enough to verify actual floating-point algorithms.
This period also saw something of an arms race in automation.
My earlier, Prolog-inspired vision of backtracking search
had led to some
fairly general automation
that was effective not just in standard predicate logic
but with any theorems expressed in a form suitable for forward or backward chaining.
I had also done experiments with classical automatic techniques such as model elimination, which, although pathetic compared with automatic provers of that era,
was good enough to troll users on the
hol-info
mailing list.
Soon I had provoked John Harrison to build a superior version of ME for HOL Light.
Later, Joe Hurd built his
metis
superposition prover, which found its way into HOL4.
Not to be outdone, Tobias made Isabelle’s simplifier the best in its class incorporating a number of sophisticated refinements, including some great ideas from Nqthm.
Twenty years from the start of this chronology we now had
several reasonably mature and powerful systems, including Isabelle/ZF, Isabelle/HOL,
multiple versions of the HOL system, and Coq (now Rocq).
1
Many of them used
Proof General
,
a common user interface for tactic-based proof assistants
based on the Emacs editor.
And we had 100MHz machines, some with 64MB of memory!
We were ready to do big things.
During this period, I did a lot of work on the
verification of cryptographic protocols
,
also
here
.
These secure Internet connections and other network communications;
they are valuable when you need to know who is on the other end
and need to keep messaging secure from eavesdropping and tampering.
Among the protocols investigated were the ubiquitous TLS
and the late, unlamented SET protocol.
These proofs were not at the level of code or bits;
buggy implementations could and did emerge.
In 2005, the big thing that caught everyone’s eye
was
Georges Gonthier’s formalisation
(in Coq)
of the Four Colour Theorem.
Most educated people had heard of the theorem already,
and its history is fascinating:
numerous proofs had been attempted and rejected since the mid 19th century.
The 1977 proof by Appel and Haken was questioned
because it relied on a lot of ad-hoc computer code.
Suddenly, despite the still unwelcome involvement of computers,
no one could doubt the theorem anymore.
At the opposite extreme was
my own formalisation
of Gödel’s proof of the relative consistency of the axiom of choice in Isabelle/ZF.
This was the apex of my ZF work, technically difficult but incomprehensible to most people.
My early dream of having a formalisation of the Prime Number Theorem came true in 2005
when Jeremy Avigad
formalised
the theorem in Isabelle.
Somewhat later, John Harrison
formalised a different proof
in HOL Light.
And there was much more. Without any doubt, our systems were capable of serious mathematics.
Perhaps the most consequential achievement of this period was Mike Gordon’s collaboration
with Graham Birtwistle and Anthony Fox to
verify the ARM6 processor
.
Graham, at Leeds, formally specified the instruction set architecture of the processor
(i.e. the assembly language level), while Mike and Anthony at Cambridge verified the implementation of that architecture in terms of lower level hardware components.
Eventually a
number of other processors
were similarly specified,
and some verified.
Without any doubt, our systems were capable of serious verification.
Despite the focus on applications in this section,
system development continued in the run-up to 2005.
I am only familiar with Isabelle development, but the advances there were tremendous:
the
Isar language
for structured, legible proofs (a break with the LCF idea that the top level must be a programming language, i.e. ML)
axiomatic type classes
, providing principled overloading
counterexample finders
:
Quickcheck
and Refute (now Nitpick)
code generation
from the executable fragment of higher-order logic, and reflection
sledgehammer
was under active development, but only ready a couple of years later.
With so much going on, it’s not surprising that our community started doing big things,
and other people were starting to notice.
2005–2015: The first landmarks
I am not used to phone calls from journalists:
for most of my career, formal verification has been seen as (at best) niche.
But the journalist on the end of the line was asking for information about
seL4
,
the first operating system kernel ever to be formally verified.
Tools for extended static checking were by then able to detect a lot of program faults, but the seL4 verification claimed to cover
full functional correctness
:
the code did exactly what it was supposed to do.
There is now an
entire ecosystem
around seL4,
backed by a million lines of Isabelle/HOL proofs.
People have wanted to verify compilers
since forever
.
The task of fully specifying a programming language, target machine
and compiler already seemed impossible, let alone providing the actual proof.
With
CompCert
, that task was finally fulfilled, for a large subset of the C language:
What sets CompCert apart from any other production compiler, is that it is formally verified, using machine-assisted mathematical proofs, to be exempt from miscompilation issues. In other words, the executable code it produces is proved to behave exactly as specified by the semantics of the source C program.
A seemingly intractable problem with compiler verification
was how to translate your verified compiler into machine code.
For example, CompCert is mostly written in Rocq,
which is then extracted to OCaml code.
The OCaml compiler had never been verified,
so how do we know that its compiled code is correct?
CakeML
squares this circle through
bootstrapping
.
CakeML translates from its source language (a dialect of ML)
to assembly language, accompanied by a proof that the two pieces of code are equivalent.
This work was an outgrowth of the ARM6 project mentioned earlier.
Magnus Myreen
had
developed techniques
for
automatically and verifiably translating between assembly language
and recursive functions in higher-order logic, in both directions.
At the start of the bootstrapping process,
a tiny compiler was written in pure logic and proved correct.
It was now safe to run this compiler
and use its tiny language to implement a bigger language.
This process ultimately produced a verified compiler in both source form
and assembly language form, with a proof of their equivalence,
as well as
verified extraction
from higher-order logic to ML.
The end of the decade also saw impressive results in the formalisation of mathematics:
Without going into details here, each of these was an ambitious proof, combining in various ways deep mathematics, intricate technicalities and sheer bulk.
Our community was proud of our achievements.
We were no longer a joke, but what exactly were we good for?
2015–2025: Breaking through
This period brought something astonishing:
acceptance of proof assistants by many mainstream mathematicians.
I mostly recall mathematicians regarding computers
with something close to contempt.
Even some logicians regarded formalised mathematics as impossible,
somehow fixating on Gödel’s incompleteness or that notorious proof of 1+1=2 on page 360.
Regarding my work formalising big chunks of ZF theory,
someone commented “only for finite sets obviously”.
My EU-funded
ALEXANDRIA
project started in 2017.
My team formalised more advanced and deep mathematics
than I ever imagined to be possible, using Isabelle/HOL.
(I have told this story in an
earlier blogpost
.)
But ALEXANDRIA alone would not have had much of an impact on mathematical practice.
What made a difference was
Kevin Buzzard
and his enthusiastic, tireless promotion of the idea of formalising mathematics
in
Lean
.
He recruited a veritable army.
I got the idea of blogging from him, but my blog has not had the same impact. Where are you guys?
In 2022, for the first time ever, machine assistance
was
used to confirm
brand-new mathematics that a Fields Medallist had concerns about.
Mathematicians will for the most part continue to work the way they always have done,
but proof assistants are getting better and better,
and they will encroach more and more on the everyday practice of mathematics.
Meanwhile, Isabelle continued to be useful for verification.
I was amazed to hear that the systems group here in the Computer Lab
had completed a
major verification
using Isabelle/HOL.
The tradition is for systems people to despise verification tools
for sweeping aside ugly things like overflow and floating-point errors, even though modern tools no longer do.
Besides, so the thinking went, a research tool like Isabelle is only used by its own developer and his students.
Times were changing.
Isabelle is also one of the several proof assistants involved
with
CHERI
, a large-scale project
reviving the old idea of
capabilities
to ensure security at the hardware level.
CHERI has produced numerous publications, some of which
(for example
this one
and
that one
) describe very large proofs.
These concern the design and implementation of novel computer architectures
with fine-grained memory protection,
and a design process with formal verification at its heart.
Isabelle has also contributed to the design of
WebAssembly
,
a relatively new platform for web applications.
By subjecting the WebAssembly specification to
formal scrutiny
,
Conrad Watt was able to identify a number of issues in time for them to be fixed.
Finally, I’d like to mention this announcement (4 December 2025) by Dominic Mulligan of Amazon Web Services (AWS):
Over three years, lots of hard work, and 260,000 lines of Isabelle/HOL code later, the Nitro Isolation Engine (NIE)
is finally announced
alongside Graviton5.
Working with our colleagues in EC2, Annapurna, and AWS AppSec, we have been working to rearchitect the Nitro system for Graviton5+ instances around a small, trusted separation kernel. Written from scratch in Rust, we have additionally specified the behaviour of a core subset of the Nitro Isolation Engine kernel, verified that the implementation meets this specification, and additionally proved deep security properties—confidentiality and integrity—of the implementation.
I am biased, since I’ve been working with AWS on this exact project, but this is a big deal.
AWS has been using formal verification tools for a considerable time.
A notable earlier accomplishment was verifying tricky but efficient algorithms using HOL Light,
speeding up
RSA encryption by a massive factor.
2025–2035 Becoming ordinary
A couple of months ago, Apple announced new models in their iPhone range,
but no crowds formed around Apple Stores.
They once did: the iPhone was once regarded as revolutionary.
Now, smartphones are a commodity, which is the final stage of a new technology.
Formal verification is not ordinary yet.
But it’s coming: more and more software will be seen as too important to develop any other way,
as is already the case for hardware.
Postscript
I am well aware that there is much outstanding work adjacent to that
described here, e.g. using other interactive tools, such as Nqthm and ACL2,
PVS and Agda, and much else using Rocq. There have been amazing advances
in the broader theorem proving world, also in model checking,
SAT/SMT solving and their applications to extended static checking of software.
I have related what I personally know.
And remember, the point of this post is not (simply) to boast
but to demonstrate the progress of our research community,
so the more achievements the better. Feel free to add some in the comments!
This post does not prove anything about other fields of science,
such as solid-state physics, molecular biology or mathematics.
But it’s fair to assume that such fields have not been idle either.
People have proved Fermat’s Last Theorem and the Poincaré conjecture,
and settled more obscure questions such as the projective plane of order 10.
People have located the remains of King Richard III, who died in 1485,
excavating and positively identifying the body by its DNA.
People have linked a piece of bloody cloth to Adolf Hitler and diagnosed that he had a specific genetic condition.
The immensely complex James Webb Space Telescope
was successfully deployed;
it is now revealing secrets about the early Universe.
Sometimes I wonder about the motives of those who claim that science is moribund.
Do they have political aims, or just unrealistic expectations?
Were they expecting time travel or some sort of warp drive?
People need to remember that movies are fiction.
YOCaml a framework used to describe static site generator
YOCaml is, as its name suggests, written in the wonderful language
OCaml
, a programming language that is
statically typed
(with type inference),
functional
,
imperative
, and
object-oriented
, and that features a rich
module system. While the simplest reason we wrote YOCaml in OCaml is
probably that
we like OCaml
, the language’s grammatical and
conceptual flexibility made it easier to design an API that we find
expressive
. In addition, OCaml is a high-performance language with
a rich ecosystem — if you want to convince yourself to use OCaml, we
invite you to read
Why I chose OCaml as my primary
language
.
Adhering to the ecosystem
YOCaml was designed in a
very modular
way, allowing us to take
advantage of the OCaml ecosystem. As a result, even though YOCaml is
packaged with a set of
standard plugins
, the core API makes it
fairly easy to integrate other libraries. For example,
users
have
requested
support for
Gemtext
, in
order to serve their site over
Gemini
. No changes
were required in YOCaml’s core,
demonstrating its flexibility
.
Easy deployment
One of the
great strengths
of statically generated sites is that
they are very easy to deploy. In fact, a simple static server is
enough! However, YOCaml goes further: thanks to the
Mirage
project, it is possible to directly
generate documents using a
Git repository
as a file system
(compatible with
GitHub Pages
) and serve
them statically. For example, by using
Unipi
, you can build an
operating system (unikernel) designed to statically serve your
site
with great ease!
Doom
and
Quake
studio id Software are now home to a "wall-to-wall" union according to the Communications Workers of America (CWA). The organisation have announced that a group of 165 id workers have just voted to unionise, adding to the ranks of the 300 ZeniMax quality assurance staff who
unionised back in 2023
.
According to the CWA's press release, Microsoft have already recognised this latest union - which is made up of "developers, artists, programmers, and more" - in accordance with the
labour neutrality agreement
the two parties agreed in 2022.
"The wall-to-wall organizing effort at id Software was much needed; it’s incredibly important that developers across the industry unite to push back on all the unilateral workplace changes that are being handed down from industry executives," said id Software producer and CWA organising committee member Andrew Willis.
Meanwhile, id lead services programmer and CWA committee member Chris Hays specifically cited remote staff not being dragged into the office as a reason behind the push for representation. "Remote work isn’t a perk," he said. "It’s a necessity for our health, our families, and our access needs. RTO policies should not be handed down from executives with no consideration for accessibility or our well-being."
The CWA release also cited "
mass industry layoffs
, sudden periods of crunch time, and unfair pay" as part of the impetus behind a wider push towards unionisation among devs across the industry this year, adding that the total of unionised workers across Microsoft's fiefdom is now "nearly 4,000" strong.
CWA president Ron Swaggerty added that the union "look forward to sitting across the table from Microsoft to negotiate a contract that reflects the skill, creativity, and dedication these workers bring to every project."
If you want to learn more about the CWA's unionisation efforts as the games industry's suits and moneyfolk continue to
lob
developers
out of
windows
with depressing regularity, give
this interview Nic did
a read.
Meanwhile, members of the "industry-wide union"
the CWA announced
earlier this year
held a protest
outside of The Game Awards yesterday, with their aim being to "to acknowledge the video games and studios that have been closed and to also condemn the creativity that’s been crushed by corporate greed and studio executives".
Solidarity to these id Software workers.
Google Releases Its New Google Sans Flex Font as Open Source
Google has made its ‘next generation brand typeface’,
Google Sans Flex
,
available for download
— under an open source
license
, which is welcome news.
A modern sans serif font purpose-designed for use on screens and OSes, Google Sans Flex is a ground-up, multi-axis rebuild of the proprietary Google Sans font, by typographer David Berlow (of
Font Bureau
fame).
The “flex” in GS Flex is because it’s a variable font that is
“extremely flexible [with] variable axes for weight, width, optical size, slant, as well as an axis for rounded terminals” (as in terminals in letters, not command-line apps).
Android and web developers will find the varied variable axes on offer a creative boon for “expressive” design work.
Changing system font is a simple way to give Ubuntu (or any other Linux) desktop a subtle new vibe without having to futz around with themes, icon packs or other eye-candy extras which substantially alter the stock experience:
Google Sans Flex as UI font on Ubuntu 25.10
However, Linux desktop environments don’t yet support doing anything
fancy
with variable fonts, beyond the basics.
Ergo, unlike on modern Android, you can’t toggle Dark Mode in GNOME or KDE with this font enabled to make it automatically adjust its GRAD axis to compensate for the optical thinning that typically occurs when white text is rendered against darker backgrounds.
It’s not a major drawback, and GS Flex works great as a competent, classy system UI font on Linux, especially on HiDPI displays with fractional scaling. For my tastes, Google Sans Flex has (like GNOME’s default Adwaita Sans font) more presence than the Ubuntu font.
Want to try it out? Google has released the font under the
SIL Open Font License (OFL)
, meaning you can modify, redistribute and use it in your own projects.
To get it:
Go to Google Fonts
Search for ‘Google Sans Flex’
Hit “Get Font” > “Download All”
Extract the ZIP
Find the .ttf file inside and either:
Move it to
~/.local/share/fonts
; or
Install via your desktop’s font manager GUI
Once installed it’ll be available to use/select in other apps, settings and so on.
To
change UI font on Ubuntu
you can install the
GNOME Tweaks
tool and then open it, go to Appearance and set the UI font to
Google Sans Flex
. Although you may see variable options listed to pick from, GNOME will always render the ‘regular’ version.
Nuclear energy key to decarbonising Europe, says EESC
The EESC has adopted an opinion pointing out that nuclear energy is an essential component of the clean energy mix which is needed to phase out fossil fuels. The Committee calls on the European Commission to include key regulatory and financial enablers in order to make the planned investment possible, and to enhance transparent dialogue with civil society.
Nuclear energy plays and will continue to play a crucial role in decarbonising the European Union, says the European Economic and Social Committee (EESC) in an
opinion
adopted at the December plenary session. This is particularly true given the fact that the EU needs to consolidate its strategic autonomy in the fields of energy and technology.
The EESC opinion, drawn up by rapporteur
Dumitru Fornea
and co-rapporteur
Alena Mastantuono
, assesses the European Commission’s 8th
Nuclear Illustrative Programme
(PINC), published in June 2025.
According to the Committee, nuclear energy is a key element in diversifying the EU’s energy supply because it delivers safe, reliable, low-carbon electricity. This ensures that the grid remains stable most of the time, regardless of the weather or time of day, with less pressure on systemic costs.
Nuclear energy can therefore play an important role in supporting the EU’s overall industrial transition as it bolsters resilience against supply disruptions while complementing renewables and reducing dependence on imported fuels. Against this backdrop, existing EU industries (such as steel, cement and chemicals) as well as new industries (data centres) can enjoy a constant stream of decarbonised electricity.
‘The European nuclear industry sustains more than 1.1 million jobs in the EU and is a significant economic sector with a major footprint in terms of jobs, supply chain capacity and advanced R&D. It is a net-zero value chain based almost entirely in the EU,’ said
Mr Fornea
. ‘If we want to effectively move away from coal, we need accessible clean energy and funding for nuclear.’
Moving ahead with planned investment
In the opinion, the EESC regrets that the PINC does not propose any specific enablers, nor a real action plan, for the planned investment and urges the European Commission to include regulatory and financial measures. The goal is to enable investment in the sector, promote the development of innovative fuel cycle facilities and propose specific figures on the investment required by the nuclear fuel cycle.
‘We call on the Commission to put forward concrete measures to make the investment planned under the PINC possible,’ said
Ms Mastantuono
. ‘This is more necessary than ever given the geopolitical turmoil which is forcing the Union to develop EU-based capacities. For this reason, the nuclear value chain should be supported in terms of skills, research and the fuel supply chain.’
More specifically, the Committee recommends speeding up investment through specific measures such as a streamlined State aid process, access to EU cohesion funds, sustainable financing, licensing processes and faster decisions at EU and national level.
In addition, the EESC advises applying the same facilities to investment in nuclear energy as for renewables. These two energy sources are complementary and Member States are free to choose their own energy mix.
Keeping transparent dialogue open with civil society
Dialogue with civil society remains pivotal in building trust, ownership and societal acceptance, and could be more prominently addressed in the PINC. Moreover, there is no dedicated funding available for meaningful civil society participation.
On this matter, the EESC’s view is that decisions on new projects in the nuclear sector, including the development of new technologies, should be taken following the outcome of a broad and transparent dialogue with civil society on the technical, economic, social and environmental aspects.
Public engagement is essential to ensure that energy strategies reflect societal priorities (such as sustainability, reliability, land-use and responsibility for long-term waste management) and the early involvement of civil society through dialogue strengthens trust and legitimacy for both nuclear energy and other low-carbon technologies.
According to article 40 of the Euratom Treaty, the European Commission is required to periodically publish a
Nuclear Illustrative Programme
(PINC)
and consult the EESC. The Commission Communication on the PINC issued in June 2025 has therefore been presented under this article for the opinion of the European Economic and Social Committee.
The PINC provides a comprehensive overview of investment needs in nuclear energy, both fission and fusion, and encompasses all stages of the nuclear lifecycle. It also feeds into the debate on the role of nuclear energy in achieving carbon neutrality in the EU by 2050. In line with the highest level of nuclear safety, the PINC supports EU competitiveness, energy security and affordable energy prices.
In the 8th PINC, the Commission points out that nuclear energy requires significant investment, of around EUR 241 billion until 2050, both for lifetime extensions of existing reactors and the construction of new large-scale reactors. The Commission also says that additional investment is needed for Small Modular Reactors (SMRs), Advanced Modular Reactors (AMRs) and microreactors and in fusion for the longer-term future.
Small follow-up point re: my post this week on iMessage’s delivery architecture being built atop the Apple Push Notification service:
Users start a new iMessage conversation by entering an address or name. If they enter a phone number or email address, the device contacts the
Apple Identity Service (IDS)
to retrieve the public keys and APNs addresses for all of the devices associated with the addressee. If the user enters a name, the device first uses the user’s Contacts app to gather the phone numbers and email addresses associated with that name and then gets the public keys and APNs addresses from IDS.
The user’s outgoing message is individually encrypted for each of the receiver’s devices. The public encryption keys and signing keys of the receiving devices are retrieved from IDS. For each receiving device, the sending device generates a random 88-bit value and uses it as an
HMAC
-SHA256 key to construct a 40-bit value derived from the sender and receiver public keys and the plaintext. The concatenation of the 88-bit and 40-bit values makes a 128-bit key, which is used to encrypt the message with AES in Counter (CTR) mode. The 40-bit value is used by the receiver to verify the integrity of the decrypted plaintext. This per-message AES key is encrypted using RSA-OAEP to the public key of the receiving device. The combination of the encrypted message text and the encrypted message key is then hashed with SHA-1, and the hash is signed with the
Elliptic Curve Digital Signature Algorithm (ECDSA)
using the sending device’s private signing key. In
iOS 13
or later and
iPadOS 13.1
or later, devices may use an Elliptic Curve Integrated Encryption Scheme (ECIES) encryption instead of RSA encryption.
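As a rough illustration of the per-device key construction described above, here is a sketch in Python using the third-party cryptography package. It is not Apple’s implementation; anything not stated in the text, such as the CTR counter block, is an assumption.

# Sketch of the per-message key construction described above, for one
# receiving device. All unstated details are assumptions.
import os
from cryptography.hazmat.primitives import hashes, hmac
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def encrypt_for_device(plaintext: bytes, sender_pub: bytes, receiver_pub: bytes):
    random_88 = os.urandom(11)                      # 88-bit random value
    mac = hmac.HMAC(random_88, hashes.SHA256())     # used as an HMAC-SHA256 key
    mac.update(sender_pub + receiver_pub + plaintext)
    derived_40 = mac.finalize()[:5]                 # 40 bits derived from keys and plaintext
    aes_key = random_88 + derived_40                # 88 + 40 = 128-bit AES key

    counter = b"\x00" * 16                          # assumed initial counter block
    encryptor = Cipher(algorithms.AES(aes_key), modes.CTR(counter)).encryptor()
    ciphertext = encryptor.update(plaintext) + encryptor.finalize()

    # aes_key would then be wrapped with RSA-OAEP (or ECIES) to the device's
    # public key, and the payload hashed and signed, as described above.
    return aes_key, ciphertext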
The resulting messages, one for each receiving device, consist of the encrypted message text, the encrypted message key, and the sender’s digital signature. They are then dispatched to the APNs for delivery. Metadata, such as the timestamp and APNs routing information, isn’t encrypted. Communication with APNs is encrypted using a forward-secret TLS channel.
APNs can only relay messages up to 4 or 16 KB in size, depending on the iOS or iPadOS version. If the message text is too long or if an attachment such as a photo is included, the attachment is encrypted using AES in CTR mode with a randomly generated 256-bit key and uploaded to iCloud. The AES key for the attachment, its
Uniform Resource Identifier (URI)
, and an SHA-1 hash of its encrypted form are then sent to the recipient as the contents of an iMessage, with their confidentiality and integrity protected through normal iMessage encryption, as shown in the following diagram.
For group conversations, this process is repeated for each recipient and their devices.
On the receiving side, each device receives its copy of the message from APNs and, if necessary, retrieves the attachment from iCloud. The incoming phone number or email address of the sender is matched to the receiver’s contacts so that a name can be displayed when possible.
As with all push notifications, the message is deleted from APNs when it’s delivered. Unlike other APNs notifications, however, iMessage messages are queued for delivery to offline devices. Messages are stored on Apple servers for up to 30 days.
Labor Leaders Cheer House Vote To Undo ‘Single-Largest Act of Union Busting in American History’
Portside
portside.org
2025-12-12 17:24:03
Labor Leaders Cheer House Vote To Undo ‘Single-Largest Act of Union Busting in American History’
Published
December 12, 2025
Members of the American Federation of Government Employees protest against firings during a rally in Washington, DC on February 11, 2025 | Nathan Posner/Anadolu
US labor leaders on Thursday celebrated the House of Representatives’ bipartisan vote in favor of a bill that would reverse President Donald Trump’s attack on the collective bargaining rights of 1 million
federal workers
.
Trump’s sweeping assault on federal
workers
has included March and August
executive orders
targeting their rights under the guise of protecting national security. In response, Congressmen Jared Golden (D-Maine) and Brian Fitzpatrick (R-Pa.) spearheaded the fight for the
Protect America’s Workforce Act
. They recently
collected
enough signatures to force the
231-195
vote, in which 20
Republicans
joined all Democrats present to send the bill to the Senate.
“The right to be heard in one’s workplace may appear basic, but it carries great weight—it ensures that the people who serve our nation have a seat at the table when decisions shape their work and their mission,” Fitzpatrick said, adding: “I will always fight for our workers, and I call on the Senate to help ensure these protections are fully reinstated.”
American Federation of Labor and Congress of Industrial Organizations (AFL-CIO) president Liz Shuler joined union leaders in applauding the lower chamber on Thursday and calling on the Senate to follow suit. She
said
in a statement that “President Trump betrayed workers when he tried to rip away our collective bargaining rights. In these increasingly polarized times, working people delivered a rare bipartisan majority to stop the administration’s unprecedented attacks on our freedoms.”
“We commend the Republicans and Democrats who stood with workers and voted to reverse the single-largest act of union busting in American history,” she continued. “Americans
trust
unions
more than either political party. As we turn to the Senate—where the bill already has bipartisan support—working people are calling on the politicians we elected to stand with us, even if it means standing up to the union-busting boss in the
White House
.”
Everett Kelley, national president of the American Federation of Government Employees, the largest federal workers union, similarly
praised
the members of Congress who “demonstrated their support for the nonpartisan civil service, for the dedicated employees who serve our country with honor and distinction, and for the critical role that collective bargaining has in fostering a safe, protective, and collaborative workplace.”
“This vote marks an historic achievement for the House’s bipartisan pro-labor majority, courageously led by Reps. Jared Golden of
Maine
and Brian Fitzpatrick of
Pennsylvania
,” he said. “We need to build on this seismic victory in the House and get immediate action in the Senate—and also ensure that any future
budget
bills similarly protect collective bargaining rights for the largely unseen civil servants who keep our government running.”
American Federation of State, County, and Municipal Employees president Lee Saunders also applauded the House’s passage of “a bill that strengthens federal workers’ freedoms on the job so they can continue to keep our nation safe, healthy, and strong.”
“This bill not only provides workers’ critical protections from an administration that has spent the past year relentlessly attacking them,” he noted, “but it also ensures that our communities are served by the most qualified public service workers—not just those with the best political connections.”
Randy Erwin, the head of the National Federation of Federal Employees,
declared
that “this is an incredible testament to the strength of federal employees and the longstanding support for their fundamental right to organize and join a union.”
“The president cannot unilaterally strip working people of their constitutional freedom of association. In bipartisan fashion, Congress has asserted their authority to hold the president accountable for the biggest attack on workers that this country has ever seen,” he added, thanking the House supporters and pledging to work with “senators from both parties to ensure this bill is signed into law.”
Django 6.0 was
released today
, starting another release cycle for the loved and long-lived Python web framework (now 20 years old!). It comes with a
mosaic
of new features, contributed to by many, some of which I am happy to have helped with. Below is my pick of highlights from
the release notes
.
Upgrade with help from django-upgrade
If you’re upgrading a project from Django 5.2 or earlier, please try my tool
django-upgrade
. It will automatically update old Django code to use new features, fixing some deprecation warnings for you, including five fixers for Django 6.0. (One day, I’ll propose django-upgrade to become an official Django project, when energy and time permit…)
Template partials
There are four headline features in Django 6.0, which we’ll cover before other notable changes, starting with this one:
The Django Template Language now supports
template partials
, making it easier to encapsulate and reuse small named fragments within a template file.
Partials are sections of a template marked by the new
{% partialdef %}
and
{% endpartialdef %}
tags. They can be reused within the same template or rendered in isolation. Let’s look at examples for each use case in turn.
Reuse partials within the same template
The below template reuses a partial called
filter_controls
within the same template. It’s defined once at the top of the template, then used twice later on. Using a partial allows the template to avoid repetition without pushing the content into a separate include file.
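A minimal sketch of the shape this takes (the filter form contents here are invented; the partial is defined with {% partialdef %} and reused with the {% partial %} tag):

{% partialdef filter_controls %}
  <form method="get">
    <input type="search" name="q" value="{{ query }}">
    <button>Filter</button>
  </form>
{% endpartialdef %}

<h1>Videos</h1>
{% partial filter_controls %}

{# …long results table… #}

{% partial filter_controls %}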
Reach for this pattern any time you find yourself repeating template code within the same template. Because partials can use variables, you can also use them to de-duplicate when rendering similar controls with different data.
Render partials in isolation
The below template defines a
view_count
partial that’s intended to be re-rendered in isolation. It uses the
inline
option, so when the whole template is rendered, the partial is included.
The page uses
htmx
, via my
django-htmx package
, to periodically refresh the view count, through the
hx-*
attributes. The request from htmx goes to a dedicated view that re-renders the
view_count
partial.
{% load django_htmx %}
<!doctype html>
<html>
  <body>
    <h1>{{ video.title }}</h1>

    <video width=1280 height=720 controls>
      <source src="{{ video.file.url }}" type="video/mp4">
      Your browser does not support the video tag.
    </video>

    {% partialdef view_count inline %}
      <section class=view-count
          hx-trigger="every 1s"
          hx-swap=outerHTML
          hx-get="{% url 'video-view-count' video.id %}">
        {{ video.view_count }} views
      </section>
    {% endpartialdef %}

    {% htmx_script %}
  </body>
</html>
The relevant code for the two views could look like this:
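A minimal sketch consistent with that description (the Video model, URL parameter, and template path are assumptions):

from django.shortcuts import get_object_or_404, render

from example.models import Video


def video(request, video_id):
    video = get_object_or_404(Video, id=video_id)
    return render(request, "video.html", {"video": video})


def video_view_count(request, video_id):
    video = get_object_or_404(Video, id=video_id)
    # "#view_count" selects just that partial from video.html.
    return render(request, "video.html#view_count", {"video": video})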
The initial
video
view renders the full template
video.html
. The
video_view_count
view renders just the
view_count
partial, by appending
#view_count
to the template name. This syntax is similar to how you’d reference an HTML fragment by its ID in a URL.
History
htmx was the main motivation for this feature, as promoted by htmx creator Carson Gross in
a cross-framework review post
. Using partials definitely helps maintain “Locality of behaviour” within your templates, easing authoring, debugging, and maintenance by avoiding template file sprawl.
Django’s support for template partials was initially developed by Carlton Gibson in the
django-template-partials package
, which remains available for older Django versions. The integration into Django itself was done in a Google Summer of Code project this year, worked on by student Farhan Ali and mentored by Carlton, in
Ticket #36410
. You can read more about the development process in
Farhan’s retrospective blog post
. Many thanks to Farhan for authoring, Carlton for mentoring, and Natalia Bidart, Nick Pope, and Sarah Boyce for reviewing!
Tasks framework
The next headline feature we’re covering:
Django now includes a built-in Tasks framework for running code outside the HTTP request–response cycle. This enables offloading work, such as sending emails or processing data, to background workers.
Basically, there’s a new API for defining and enqueuing background tasks—very cool!
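As a rough sketch of the shape of the API (the task name and body here are invented; check the tasks documentation for the exact details):

from django.tasks import task


@task
def send_welcome_email(user_id):
    ...


# Elsewhere, e.g. in a view, hand the work off to a background worker:
result = send_welcome_email.enqueue(user_id=42)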
Background tasks are a way of running code outside of the request-response cycle. They’re a common requirement in web applications, used for sending emails, processing images, generating reports, and more.
Historically, Django has not provided any system for background tasks, and kind of ignored the problem space altogether. Developers have instead relied on third-party packages like
Celery
or
Django Q2
. While these systems are fine, they can be complex to set up and maintain, and often don’t “go with the grain” of Django.
The new Tasks framework fills this gap by providing an interface to define background tasks, which task runner packages can then integrate with. This common ground allows third-party Django packages to define tasks in a standard way, assuming you’ll be using a compatible task runner to execute them.
At this time, Django does not include a production-ready task backend, only two that are suitable for development and testing:
ImmediateBackend
- runs tasks synchronously, blocking until they complete.
DummyBackend
- does nothing when tasks are enqueued, but allows them to be inspected later. Useful for tests, where you can assert that tasks were enqueued without actually running them.
For production use, you’ll need to use a third-party package that implements one, for which
django-tasks
, the reference implementation, is the primary option. It provides
DatabaseBackend
for storing tasks in your SQL database, a fine solution for many projects, avoiding extra infrastructure and allowing atomic task enqueuing within database transactions. We may see this backend merged into Django in due course, or at least become an official package, to help make Django “batteries included” for background tasks.
To use django-tasks’
DatabaseBackend
today, first install the package:
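For example, with pip:

python -m pip install django-tasks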
Second,
add these two apps to your
INSTALLED_APPS
setting:
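A sketch of the remaining setup, based on my reading of the django-tasks documentation (treat the exact app labels and TASKS setting shape as assumptions to verify):

INSTALLED_APPS = [
    # ...
    "django_tasks",
    "django_tasks.backends.database",
]

TASKS = {
    "default": {
        "BACKEND": "django_tasks.backends.database.DatabaseBackend",
    },
}

Then run ./manage.py migrate to create the backend’s tables, define a task (the log below uses an echo task in example/tasks.py that prints a greeting), enqueue it with echo.enqueue(), and start a worker with ./manage.py db_worker.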
This process runs indefinitely, polling for tasks and executing them, logging events as it goes:
Task id=10b794ed-9b64-4eed-950c-fcc92cd6784b path=example.tasks.echo state=RUNNING
Hello from test task!
Task id=10b794ed-9b64-4eed-950c-fcc92cd6784b path=example.tasks.echo state=SUCCEEDED
You’ll want to run
db_worker
in production, and also in development if you want to test background task execution.
History
It’s been a long path to get the Tasks framework into Django, and I’m super excited to see it finally available in Django 6.0. Jake Howard started on the idea for Wagtail, a Django-powered CMS, back in 2021, as they have a need for common task definitions across their package ecosystem. He upgraded the idea to target Django itself in 2024, when he proposed
DEP 0014
. As a member of the Steering Council at the time, I had the pleasure of helping review and accept the DEP.
Since then, Jake has been leading the implementation effort, building pieces first in the separate
django-tasks package
before preparing them for inclusion in Django itself. This step was done under
Ticket #35859
, with
a pull request
that took nearly a year to review and land. Thanks to Jake for his perseverance here, and to all reviewers: Andreas Nüßlein, Dave Gaeddert, Eric Holscher, Jacob Walls, Jake Howard, Kamal Mustafa, @rtr1, @tcely, Oliver Haas, Ran Benita, Raphael Gaschignard, and Sarah Boyce.
Content Security Policy (CSP) support
The third headline feature:
Built-in support for the
Content Security Policy (CSP)
standard is now available, making it easier to protect web applications against content injection attacks such as cross-site scripting (XSS). CSP allows declaring trusted sources of content by giving browsers strict rules about which scripts, styles, images, or other resources can be loaded.
I’m really excited about this, because I’m a bit of a security nerd who’s been deploying CSP for client projects for years.
CSP
is a security standard that can protect your site from cross-site scripting (XSS) and other code injection attacks. You set a
content-security-policy
header to declare which content sources are trusted for your site, and then browsers will block content from other sources. For example, you might declare that only scripts from your domain are allowed, so an attacker who manages to inject a
<script>
tag pointing to evil.com would be thwarted, as the browser would refuse to load it.
Previously, Django had no built-in support for CSP, and developers had to rely on building their own, or using a third-party package like the very popular
django-csp
. But this was a little bit inconvenient, as it meant that other third-party packages couldn’t reliably integrate with CSP, as there was no common API to do so.
The new CSP support provides all the core features that django-csp did, with a slightly tidier and more Djangoey API. To get started,
first
add
ContentSecurityPolicyMiddleware
to your
MIDDLEWARE
setting:
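For example (a sketch, assuming the middleware lives in django.middleware.csp):

MIDDLEWARE = [
    # ...
    "django.middleware.security.SecurityMiddleware",
    "django.middleware.csp.ContentSecurityPolicyMiddleware",
    # ...
]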
Place it next to
SecurityMiddleware
, as it similarly adds security-related headers to all responses. (You
do
have
SecurityMiddleware
enabled, right?)
Second,
configure your CSP policy using the new settings:
SECURE_CSP
to configure the
content-security-policy
header, which is your actively enforced policy.
SECURE_CSP_REPORT_ONLY
to configure the
content-security-policy-report-only
header, which sets a non-enforced policy for which browsers report violations to a specified endpoint. This option is useful for testing and monitoring a policy before enforcing it.
For example, to adopt the nonce-based strict CSP
recommended by web.dev
, you could start with the following setting:
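A sketch along those lines, assuming the CSP enum provides NONCE, STRICT_DYNAMIC, and NONE constants (the reporting endpoint is a placeholder):

from django.utils.csp import CSP

SECURE_CSP_REPORT_ONLY = {
    "script-src": [CSP.NONCE, CSP.STRICT_DYNAMIC],
    "object-src": [CSP.NONE],
    "base-uri": [CSP.NONE],
    "report-uri": ["https://example.com/csp-reports/"],
}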
The
CSP
enum used above provides constants for common CSP source keywords, to help avoid typos.
This policy is quite restrictive and will break most existing sites if deployed as-is, because it requires nonces, as covered next. That’s why the example shows starting with the report-only mode header, to help track down places that need fixing before enforcing the policy. You’d later change to setting the
SECURE_CSP
setting to enforce the policy.
Anyway, those are the two basic steps to set up the new CSP support!
Nonce generation
A key part of the new feature is that
nonce generation
is now built-in to Django, when using the CSP middleware. Nonces are a security feature in CSP that allow you to mark specific
<script>
and
<style>
tags as trusted with a
nonce
attribute:
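For example, something like this, where the same random value also appears in the header’s script-src directive:

<script nonce="r4nd0mN0nc3Valu3" src="/static/app.js"></script>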
The nonce value is randomly generated per-request, and included in the CSP header. An attacker performing content injection couldn’t guess the nonce, so browsers can trust only those tags that include the correct nonce. Because nonce generation is now part of Django, third-party packages can depend on it for their
<script>
and
<style>
tags and they’ll continue to work if you adopt CSP with nonces.
Nonces are the recommended way to use CSP today, avoiding problems with previous allow-list based approaches. That’s why the above recommended policy enables them. To adopt a nonce-based policy, you’ll need to annotate your
<script>
and
<style>
tags with the nonce value through the following steps.
First,
add the new
csp
template context processor to your
TEMPLATES
setting:
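A sketch, assuming the context processor is django.template.context_processors.csp and that it exposes a csp_nonce variable:

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        # ...
        "OPTIONS": {
            "context_processors": [
                # ...
                "django.template.context_processors.csp",
            ],
        },
    },
]

Second, annotate your <script> and <style> tags with that nonce, along the lines of:

<script nonce="{{ csp_nonce }}" src="/static/app.js"></script>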
This can be tedious and error-prone, hence using the report-only mode first to monitor violations might be useful, especially on larger projects.
Anyway, deploying CSP right would be another post in itself, or even a book chapter, so we’ll stop here for now. For more info, check out
that web.dev article
and
the MDN CSP guide
.
History
CSP itself was proposed for browsers way back in 2004, and was first implemented in Mozilla Firefox version 4, released 2011. That same year,
Django Ticket #15727
was opened, proposing adding CSP support to Django. Mozilla created
django-csp
from 2010, before the first public availability of CSP, using it on their own Django-powered sites. The first comment on Ticket #15727 pointed to django-csp, and the community basically rolled with it as the de facto solution.
Over the years, CSP itself evolved, as did django-csp, with
Rob Hudson
ending up as its maintainer. Focusing on the package motivated him to finally get CSP into Django itself. He made a draft PR and posted on Ticket #15727 in 2024, which I enjoyed helping review. He iterated on the PR over the next 13 months until it was finally merged for Django 6.0. Thanks to Rob for his heroic dedication here, and to all reviewers: Benjamin Balder Bach, Carlton Gibson, Collin Anderson, David Sanders, David Smith, Florian Apolloner, Harro van der Klauw, Jake Howard, Natalia Bidart, Paolo Melchiorre, Sarah Boyce, and Sébastien Corbin.
Email API updates
The fourth and final headline feature:
Email handling in Django now uses Python’s modern email API, introduced in Python 3.6. This API, centered around the
email.message.EmailMessage
class, offers a cleaner and Unicode-friendly interface for composing and sending emails.
This is a major change, but it’s unlikely to affect projects using basic email features. You can still use Django’s
send_mail()
function
and
EmailMessage
class
as before, like:
from django.core.mail import EmailMessage

email = EmailMessage(
    subject="🐼 Need more bamboo",
    body="We are desperately low, please restock before the pandas find out!",
    from_email="zookeeper@example.com",
    to=["supplies@example.com"],
)
email.attach_file("/media/bamboo_cupboard.jpg")
email.send()
The key change is that, under-the-hood, when you call
send()
on a Django
EmailMessage
object, it now translates itself into Python’s newer
email.message.EmailMessage
type before sending.
Modernizing provides these benefits:
Fewer bugs
- many edge case bugs in Python’s old email API have been fixed in the new one.
Django is less hacky
- a bunch of workarounds and security fixes in Django‘s email code have been removed.
More convenient API
- the new API supports some niceties, like the below inline attachment example.
Easier inline attachments with
MIMEPart
Django’s
EmailMessage.attach()
method allows you to attach a file as an attachment. Emails support images as
inline attachments
, which can be displayed within the HTML email body.
While you could previously use
EmailMessage.attach()
to add inline attachments, it was a bit fiddly, using a legacy class. Now, you can call the method with a Python
email.message.MIMEPart
object to add an inline attachment in a few steps:
import email.utils
from email.message import MIMEPart

from django.core.mail import EmailMultiAlternatives

message = EmailMultiAlternatives(
    subject="Cute Panda Alert",
    body="Here's a cute panda picture for you!",
    from_email="cute@example.com",
    to=["fans@example.com"],
)

with open("panda.jpg", "rb") as f:
    panda_jpeg = f.read()

cid = email.utils.make_msgid()
inline_image = MIMEPart()
inline_image.set_content(
    panda_jpeg,
    maintype="image",
    subtype="jpeg",
    disposition="inline",
    cid=cid,
)
message.attach(inline_image)
message.attach_alternative(
    f'<h1>Cute panda baby alert!</h1><img src="cid:{cid[1:-1]}">',
    "text/html",
)
It’s not the simplest API, but it does expose all the power of the underlying email system, and it’s better than the past situation.
History
The new email API was added to Python as provisional
in version 3.4 (2014)
, and made stable
in version 3.6 (2016)
. The legacy API, however, was never planned for deprecation, so there was never any deadline to upgrade Django’s email handling.
In 2024, Mike Edmunds
posted on the (old) django-developers mailing list
, proposing the upgrade with strong reasoning and planning. This conversation led to
Ticket #35581
, which he worked on for eight months until it was merged. Many thanks to Mike for leading this effort, and to Sarah Boyce for reviewing! Email is not a glamorous feature, but it’s a critical communication channel for nearly every Django project, so props for this.
Positional arguments in
django.core.mail
APIs
We’re now out of the headline features and onto the “minor” changes, starting with this deprecation related to the above email changes:
django.core.mail
APIs now require keyword arguments for less commonly used parameters. Using positional arguments for these now emits a deprecation warning and will raise a
TypeError
when the deprecation period ends:
All optional parameters (
fail_silently
and later) must be passed as keyword arguments to
get_connection()
,
mail_admins()
,
mail_managers()
,
send_mail()
, and
send_mass_mail()
.
All parameters must be passed as keyword arguments when creating an
EmailMessage
or
EmailMultiAlternatives
instance, except for the first four (
subject
,
body
,
from_email
, and
to
), which may still be passed either as positional or keyword arguments.
Previously, Django would let you pass all parameters positionally, which gets a bit silly and hard to read with long parameter lists, like:
from django.core.mail import send_mail

send_mail(
    "🐼 Panda of the week",
    "This week’s panda is Po Ping, sha-sha booey!",
    "updates@example.com",
    ["adam@example.com"],
    True,
)
The final
True
doesn’t provide any clue what it means without looking up the function signature. Now, using positional arguments for those less-commonly-used parameters raises a deprecation warning, nudging you to write:
from django.core.mail import send_mail

send_mail(
    subject="🐼 Panda of the week",
    message="This week’s panda is Po Ping, sha-sha booey!",
    from_email="updates@example.com",
    recipient_list=["adam@example.com"],
    fail_silently=True,
)
This change is appreciated for API clarity, and Django is generally moving towards using keyword-only arguments more often. django-upgrade can automatically fix this one for you, via its
mail_api_kwargs
fixer
.
Thanks to Mike Edmunds, again, for making this improvement in
Ticket #36163
.
Extended automatic
shell
imports
Next up:
Common utilities, such as django.conf.settings, are now automatically imported to the shell by default.
One of the headline features back in Django 5.2 was
automatic model imports in the shell
, making
./manage.py shell
import all of your models automatically. Building on that DX boost, Django 6.0 now also imports other common utilities, for which we can find the full list by running
./manage.py shell
with
-v
2
:
$ ./manage.py shell -v 2
6 objects imported automatically:

  from django.conf import settings
  from django.db import connection, models, reset_queries
  from django.db.models import functions
  from django.utils import timezone
...
(This is from a project without any models, so only the utilities are listed.)
So that’s:
settings
, useful for checking your runtime configuration:
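For example, in a hypothetical session:

>>> settings.DEBUG
True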
Salvo Polizzi contributed the original automatic shell imports feature in Django 5.2. He’s then returned to offer these extra imports for Django 6.0, in
Ticket #35680
. Thanks to everyone that contributed to the forum discussion agreeing on which imports to add, and to Natalia Bidart and Sarah Boyce for reviewing!
Dynamic field refresh on
save()
Now let’s discuss a series of ORM improvements, starting with this big one:
GeneratedFields and fields assigned expressions are now refreshed from the database after
save()
on backends that support the
RETURNING
clause (SQLite, PostgreSQL, and Oracle). On backends that don’t support it (MySQL and MariaDB), the fields are marked as deferred to trigger a refresh on subsequent accesses.
Django models support having the database generate field values for you in three cases:
The
db_default
field option, which lets the database generate the default value when creating an instance:
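The other two cases, per the release note above, are GeneratedField columns and fields you assign a database expression to before saving. A sketch showing all three on a Video model (field names invented):

from django.db import models
from django.db.models.functions import Now, Upper


class Video(models.Model):
    title = models.CharField(max_length=100)

    # 1. db_default: the database fills this in on INSERT.
    created = models.DateTimeField(db_default=Now())

    # 2. GeneratedField: always computed by the database.
    title_upper = models.GeneratedField(
        expression=Upper("title"),
        output_field=models.CharField(max_length=100),
        db_persist=True,
    )

    # 3. A regular field you might assign an expression to,
    #    e.g. video.last_updated = Now(), before calling save().
    last_updated = models.DateTimeField(null=True)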
Previously, only the first method, using
db_default
, would refresh the field value from the database after saving. The other two methods would leave you with only the old value or the expression object, meaning you’d need to call
Model.refresh_from_db()
to get any updated value if necessary. This was hard to remember and it costs an extra database query.
Now Django takes advantage of the
RETURNING
SQL clause to save the model instance and fetch updated dynamic field values in a single query, on backends that support it (SQLite, PostgreSQL, and Oracle). A
save()
call may now issue a query like:
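On PostgreSQL, for example, roughly something like this (table and column names follow the Video sketch above):

UPDATE "example_video"
SET "last_updated" = CURRENT_TIMESTAMP
WHERE "example_video"."id" = 1
RETURNING "example_video"."last_updated"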
Django puts the return value into the model field, so you can read it immediately after saving:
video = Video.objects.get(id=1)
...
video.last_updated = Now()
video.save()
print(video.last_updated)  # Updated value from the database
On backends that don’t support
RETURNING
(MySQL and MariaDB), Django now marks the dynamic fields as deferred after saving. That way, the later access, as in the above example, will automatically call
Model.refresh_from_db()
. This ensures that you always read the updated value, even if it costs an extra query.
History
This feature was proposed in
Ticket #27222
way back in 2016, by Anssi Kääriäinen. It sat dormant for most of the nine years since, but ORM boss Simon Charette picked it up earlier this year, found an implementation, and pushed it through to completion. Thanks to Simon for continuing to push the ORM forward, and to all reviewers: David Sanders, Jacob Walls, Mariusz Felisiak, nessita, Paolo Melchiorre, Simon Charette, and Tim Graham.
Universal
StringAgg
aggregate
The next ORM change:
The new
StringAgg
aggregate returns the input values concatenated into a string, separated by the
delimiter
string. This aggregate was previously supported only for PostgreSQL.
This aggregate is often used for making comma-separated lists of related items, among other things. Previously, it was only supported on PostgreSQL, as part of
django.contrib.postgres
:
from django.contrib.postgres.aggregates import StringAgg

from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=","),
)
for video in videos:
    print(f"Video {video.id} has chapters: {video.chapter_ids}")
…which might give you output like:
Video 104 has chapters: 71,72,74
Video 107 has chapters: 88,89,138,90,91,93
Now this aggregate is available on all database backends supported by Django, imported from
django.db.models
:
from django.db.models import StringAgg, Value

from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=Value(",")),
)
for video in videos:
    print(f"Video {video.id} has chapters: {video.chapter_ids}")
Note the
delimiter
argument now requires a
Value()
expression wrapper for literal strings, as above. This change allows you to use database functions or fields as the delimiter if desired.
While most Django projects stick to PostgreSQL, having this aggregate available on all backends is a nice improvement for cross-database compatibility, and it means third-party packages can use it without affecting their database support.
History
The PostgreSQL-specific
StringAgg
was added way back in Django 1.9 (2015) by Andriy Sokolovskiy, in
Ticket #24301
. In
Ticket #35444
, Chris Muthig proposed adding the
Aggregate.order_by
option, something used by
StringAgg
to specify the ordering of concatenated elements, and as a side effect this made it possible to generalize
StringAgg
to all backends.
Thanks to Chris for proposing and implementing this change, and to all reviewers: Paolo Melchiorre, Sarah Boyce, and Simon Charette.
BigAutoField
as the default primary key type
Next up:
DEFAULT_AUTO_FIELD
setting now defaults to
BigAutoField
This important change helps lock in scalable larger primary keys.
Django 3.2 (2021) introduced
the
DEFAULT_AUTO_FIELD
setting
for changing the default primary key type used in models. Django uses this setting to add a primary key field called
id
to models that don’t explicitly define a primary key field. For example, if you define a model like this:
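A sketch (the model is invented, named to match the app config shown later):

from django.db import models


class Channel(models.Model):
    name = models.CharField(max_length=100)

…then Django adds the primary key for you, as if you had written id = models.BigAutoField(primary_key=True) under the new default.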
A key motivation for adding the setting was to allow projects to switch from
AutoField
(a 32-bit integer) to
BigAutoField
(a 64-bit integer) for primary keys, without needing changes to every model.
AutoField
can store values up to about 2.1 billion, which sounds large but it becomes easy to hit at scale.
BigAutoField
can store values up to about 9.2 quintillion, which is “more than enough” for every practical purpose.
If a model using
AutoField
hits its maximum value, it can no longer accept new rows, a problem known as
primary key exhaustion
. The table is effectively blocked, requiring an urgent fix to switch the model from
AutoField
to
BigAutoField
via a locking database migration on a large table. For a great watch on how Kraken is fixing this problem, see
Tim Bell’s DjangoCon Europe 2025 talk
, detailing some clever techniques to proactively migrate large tables with minimal downtime.
To stop this problem arising for new projects, Django 3.2 made new projects created with
startproject
set
DEFAULT_AUTO_FIELD
to
BigAutoField
, and new apps created with
startapp
set their
AppConfig.default_auto_field
to
BigAutoField
. It also added a system check to ensure that projects set
DEFAULT_AUTO_FIELD
explicitly, so that users were aware of the feature and could make an informed choice.
Now Django 6.0 changes the actual default values of the setting and app config attribute to
BigAutoField
. Projects using
BigAutoField
can remove the setting:
 from django.apps import AppConfig


 class ChannelConfig(AppConfig):
     name = "channel"
-    default_auto_field = "django.db.models.BigAutoField"
The default
startproject
and
startapp
templates also no longer set these values. This change reduces the amount of boilerplate in new projects, and the problem of primary key exhaustion can fade into history, becoming something that most Django users no longer need to think about.
History
The addition of
DEFAULT_AUTO_FIELD
in Django 3.2 was proposed by Caio Ariede and implemented by Tom Forbes, in
Ticket #31007
. This new change in Django 6.0 was proposed and implemented by ex-Fellow Tim Graham, in
Ticket #36564
. Thanks to Tim for spotting that this cleanup was now possible, and to Jacob Walls and Clifford Gama for reviewing!
Template variable
forloop.length
Moving on to templates, let’s start with this nice little addition:
The new variable forloop.length is now available within a for loop.
This small extension makes it possible to write a template loop like this:
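For example (using the flock from the next paragraph):

{% for goose in geese %}
  Goose {{ forloop.counter }} of {{ forloop.length }}
{% endfor %}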
Previously, you’d need to refer to the length in another way, like
{{ geese|length }}
, which is a bit less flexible.
Thanks to Jonathan Ströbele for contributing this idea and implementation in
Ticket #36186
, and to David Smith, Paolo Melchiorre, and Sarah Boyce for reviewing.
querystring
template tag enhancements
There are two extensions to
the
querystring
template tag
, which was added in Django 5.1 to help with building links that modify the current request’s query parameters.
Release note:
The
querystring
template tag now consistently prefixes the returned query string with a
?
, ensuring reliable link generation behavior.
This small change improves how the tag behaves when an empty mapping of query parameters is provided. Say you had a template like this:
<ahref="{%querystringparams%}">Reset search</a>
…where
params
is a dictionary that may sometimes be empty. Previously, if
params
was empty, the output would be:
<ahref="">Reset search</a>
Browsers treat this as a link to the same URL
including the query parameters
, so it would not clear the query parameters as intended. Now, with this change, the output will be:
<ahref="?">Reset search</a>
Browsers treat
?
as a link to the same URL
without any query parameters
, clearing them as the user would expect.
Thanks to Django Fellow Sarah Boyce for spotting this improvement and implementing the fix in
Ticket #36268
, and to Django Fellow Natalia Bidart for reviewing!
Release note:
The
querystring
template tag now accepts multiple positional arguments, which must be mappings, such as
QueryDict
or
dict
.
This enhancement allows the tag to merge multiple sources of query parameters when building the output. For example, you might have a template like this:
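For example, something like this, where request.GET supplies the current parameters:

<a href="{% querystring request.GET super_search_params %}">Super search</a>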
…where
super_search_params
is a dictionary of extra parameters to add to make the current search “super”. The tag merges the two mappings, with later mappings taking precedence for duplicate keys.
Thanks again to Sarah Boyce for proposing this improvement in
Ticket #35529
, to Giannis Terzopoulos for implementing it, and to Natalia Bidart, Sarah Boyce, and Tom Carrick for reviewing!
Fin
That’s a wrap! Thank you for reading my highlights. There are plenty more changes to read about in
the release notes
.
Also, there are always many more behind-the-scenes improvements and bug fixes that don’t make it into the release notes. Optimizations and micro-improvements get merged all the time, so don’t delay, upgrade today!
Thank you to
all 174 people
who contributed to Django 6.0, as counted in
this list
by Mariusz Felisiak.
May your upgrade be swift, smooth, safe, and secure,
Fake ‘One Battle After Another’ torrent hides malware in subtitles
Bleeping Computer
www.bleepingcomputer.com
2025-12-12 17:12:47
A fake torrent for Leonardo DiCaprio’s 'One Battle After Another' hides malicious PowerShell malware loaders inside subtitle files that ultimately infect devices with the Agent Tesla RAT malware.
The malicious torrent file was discovered by Bitdefender researchers while investigating a spike in detections related to the movie.
One Battle After Another is a highly rated Paul Thomas Anderson movie released on September 26, 2025, starring Leonardo DiCaprio, Sean Penn, and Benicio del Toro.
Cybercriminals taking advantage of interest around new movies by uploading malicious torrents isn't anything new, but Bitdefender notes this case stands out for its unusually complex and stealthy infection chain.
"It's impossible to estimate how many people downloaded the files, but we saw that the supposed movie had thousands of seeders and leechers,"
explained Bitdefender
.
Launching malware from subtitles
The downloaded One Battle After Another movie torrent used in the attacks contains various files, including a movie file (One Battle After Another.m2ts), two image files (Photo.jpg, Cover.jpg), a subtitles file (Part2.subtitles.srt), and a shortcut file (CD.lnk) that appears as a movie launcher.
When the CD shortcut is executed, it launches Windows commands that extract and run a malicious PowerShell script embedded in the subtitle file between lines 100 and 103.
Malicious PowerShell script hidden in subtitles
This PowerShell script will then extract numerous AES-encrypted data blocks from the subtitles file again to reconstruct five PowerShell scripts that are dropped to 'C:\Users\<USER>\AppData\Local\Microsoft\Diagnostics.'
Other encrypted PowerShell commands in the subtitles
Source: BleepingComputer
The extracted PowerShell scripts act as a malware dropper, performing the following actions on the host:
Stage 1
– Extracts the One Battle After Another.m2ts file as an archive using any available extractor.
Stage 2
– Creates a hidden scheduled task (RealtekDiagnostics) that runs RealtekCodec.bat
Stage 3
– Decodes embedded binary data from Photo.jpg and writes restored files to the Windows Sound Diagnostics Cache directory.
Stage 5
– Extracts Cover.jpg contents into the Cache directory, including batch files and PowerShell scripts.
The files extracted in the final stage are used to check whether Windows Defender is active, install Go, extract the final payload (AgentTesla), and load it directly into memory.
AgentTesla is a long-running (since 2014) Windows RAT and information stealer, commonly used to steal browser, email, FTP, and VPN credentials, as well as to capture screenshots.
While Agent Tesla is not new, it remains widely used due to its reliability and ease of deployment.
Bitdefender has noted that in other movie titles, for example, 'Mission: Impossible – The Final Reckoning,' it has observed other families used, such as Lumma Stealer.
Torrent files from anonymous publishers often contain malware, so it is recommended that users avoid pirating new movies entirely for safety.
Show HN: tomcp.org – Turn any URL into an MCP server
curl -X POST https://tomcp.org/chat \
-H "Content-Type: application/json" \
-d '{"url": "docs.stripe.com", "message": "How do I create a payment intent?"}'
AI Models
Free Models (No API Key Required)
These models are available for everyone with no setup:
Llama 3.1 8B
(Meta) - Default model, fast and capable
This is Behind the Blog, where we share our behind-the-scenes thoughts about how a few of our top stories of the week came together. This week, we discuss conversational AI, a behind the scenes of the zine, and more.
EMANUEL:
I made the terrible mistake of looking at some Hacker News comments this week for my story about a developer whose Google accounts were banned after he uploaded training data to Google Drive.
Unbeknownst to him, the training data contained CSAM
.
As we’ve explained in previous stories, CSAM is a subject we dread covering not only because it’s one of the most awful things one could think about, but because it’s extremely difficult and legally risky. For understandable reasons, the laws around viewing, let alone possessing CSAM, are strict and punishing, which makes verification for reporting reasons challenging. For similar reasons, it’s something we need to write about very carefully, making sure we don’t wrongfully associate or whitewash someone when it comes to such horrible behavior.
Oracle (
ORCL
) stock has tumbled over 40% from its September peak, erasing more than $360 billion from its market capitalization. Nearly $67 billion of that
decline
occurred on Thursday alone, as Oracle’s second quarter results failed to assuage a key concern for investors — that the company is too heavily reliant on OpenAI (
OPAI.PVT
).
Oracle’s AI-fueled growth targets outlined in its first quarter sent the stock to a record on Sept. 10, briefly making its founder, Larry Ellison,
the world's richest man
. In September, the company told investors its remaining performance obligations (RPO) — or the value of its future revenue from customer contracts signed — had soared nearly 360% to $455 billion.
It was later revealed that ChatGPT developer OpenAI accounted for
at least $300 billion
of its customer commitments as part of the Stargate project. Since then, its stock has struggled.
Rising concerns about OpenAI’s mounting costs — set to hit
$1.4 trillion
due to its
deal spree
with firms including Nvidia (
NVDA
), CoreWeave (
CRWV
), AMD (
AMD
), and Broadcom (
AVGO
), in addition to Oracle — and increasing competition from Google's (
GOOG
) Gemini models have made investors even more wary.
"Clearly there's been a reversal in terms of the market's perception of OpenAI in the last couple of months," BNB Paribas analyst Stefan Slowinski told Yahoo Finance. “The OpenAI ecosystem obviously has been suffering as a result.”
Slowinski and other Wall Street analysts agree that OpenAI’s potential inability to pay for its wide-ranging AI infrastructure commitments is Oracle’s biggest risk.
OpenAI CEO Sam Altman declared a
“code red”
last week as the upstart faces greater rivalry from Google, threatening its ability to monetize its AI products and meet its ambitious
revenue targets
.
"[Oracle is] in this tough situation where they have to build out [data center] capacity for this customer and borrow a lot of money to do that when there's a very high uncertainty this customer will be able to pay for that capacity," DA Davidson analyst Gil Luria said.
Oracle’s second quarter results this week only deepened investor concerns.
The company’s $12 billion in capital expenditures was higher than expected, just as its free cash flow loss of $10 billion was much heavier than the $6 billion outflow anticipated. Oracle also substantially hiked its full-year capital expenditures forecast to $50 billion from $35 billion.
Oracle office building in Irvine, Calif. (Reuters/Mike Blake)
Executives’ attempts to quell worries over the company's high debt load, rising costs, and dependence on OpenAI didn’t help.
Japan law opening phone app stores to go into effect Dec. 18th
A new Japanese law is going into effect that could loosen the dominance of tech giants over smartphone services. It aims to bring users greater choice for app stores and more.
Starting December 18, firms like Apple and Google will be prohibited from blocking third party app stores on iPhone and Android devices.
The law also aims to loosen their grip on web browsers and search. The firms will now be required to give first-time users multiple choices for default services. This also applies when people update their operating system.
The Fair Trade Commission says the changes will improve convenience by encouraging new market entrants.
But some public comments released by the commission expressed concern that the legislation could undermine user security.
curl experimented with using pthread_cancel to timeout async DNS requests and
it blew up
. What else can we do?
Out of curiosity, I decided to review some alternatives and see how they work. My personal priorities are control over events; no background threads or signals or secret mechanisms.
getaddrinfo
The tried and true classic technique is to call
getaddrinfo
in a thread. Probably with more than one thread so you don’t get stuck behind a single slow request, but probably not boundless either. You can also use a separate process if you don’t use threads.
This is probably good enough for many uses.
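For illustration, the barest possible sketch of the idea (no thread pool, no cancellation, names invented): the worker blocks in getaddrinfo while the main thread is free to do other work until it wants the answer.

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>

struct lookup {
	const char *name;
	struct addrinfo *result;
	int error;
};

static void *
lookup_thread(void *arg)
{
	/* runs in the worker thread; blocks for as long as the resolver takes */
	struct lookup *lk = arg;
	struct addrinfo hints;
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	lk->error = getaddrinfo(lk->name, "80", &hints, &lk->result);
	return NULL;
}

int
main(int argc, char **argv)
{
	if (argc < 2)
		return 1;
	struct lookup lk = { .name = argv[1], .result = NULL, .error = 0 };
	pthread_t tid;
	pthread_create(&tid, NULL, lookup_thread, &lk);
	/* main thread does other work here; a real program would have the
	   thread signal completion via a pipe or similar instead of joining */
	pthread_join(tid, NULL);
	if (lk.error == 0) {
		char ip[INET6_ADDRSTRLEN] = "";
		for (struct addrinfo *res = lk.result; res; res = res->ai_next) {
			if (res->ai_family == AF_INET) {
				struct sockaddr_in *in = (void *)res->ai_addr;
				inet_ntop(AF_INET, &in->sin_addr, ip, sizeof(ip));
			}
		}
		printf("%s -> %s\n", lk.name, ip);
		freeaddrinfo(lk.result);
	}
	return 0;
}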
getaddrinfo_a
glibc provides
getaddrinfo_a
which basically does the thread dance for you. Some of it. It comes with some caveats, and it’s distinctly non portable, and probably doesn’t mesh with your idea of an event loop. Passing.
c-ares
c-ares
is a standalone DNS library. It supports async queries via a threaded backend or an event driven system. I think the thread backend has the same issues, in that it uses a callback and then you need to push the results back into your application.
Alas, the event system uses lots of callbacks as well. This also includes some dire warnings in the documentation. “When the associated callback is called, it is called with a channel lock so care must be taken to ensure any processing is minimal to prevent DNS channel stalls.” Everyone knows the ideal callback just sets a flag, etc., but also everyone is inevitably tempted to do just one more thing, and hey look, it works fine, wait, why did it break. And thus I have a strong preference for library interfaces where you call into it, get some results, but any time you’re in your own code, you’re free to do what you want.
But worth a try. Based on the
sample code
I wrote the quickest dirtiest demo I could.
c-ares code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <poll.h>
#include <arpa/inet.h>
#include <ares.h>
struct server {
char name[32];
char ip[16];
int status;
};
struct everything {
struct server servers[1];
int nservers;
struct pollfd pfds[4];
int npfds;
};
static void
addrinfo_cb(void *arg, int status, int timeouts, struct ares_addrinfo *result)
{
struct server *server = arg;
server->status = 3;
if (!result)
return;
for (struct ares_addrinfo_node *node = result->nodes; node != NULL; node = node->ai_next) {
if (node->ai_family == AF_INET) {
struct sockaddr_in *in_addr = (void *)node->ai_addr;
inet_ntop(node->ai_family, &in_addr->sin_addr, server->ip, sizeof(server->ip)); }
}
}
static void
socket_cb(void *arg, ares_socket_t fd, int readable, int writable)
{
struct everything *state = arg;
printf("socket: %d r/w: %d %d\n", fd, readable, writable);
int idx = -1;
for (int i = 0; i < 4; i++) {
if (state->pfds[i].fd == fd) {
idx = i;
break;
}
}
if (idx == -1) {
for (int i = 0; i < 4; i++) {
if (state->pfds[i].fd == -1) {
idx = i;
state->pfds[idx].fd = fd;
state->npfds++;
break;
}
}
}
if (idx == -1)
abort();
if (!readable && !writable) {
state->pfds[idx].fd = -1;
state->npfds--;
return;
}
state->pfds[idx].fd = fd;
state->pfds[idx].events = 0;
if (readable)
state->pfds[idx].events |= POLLIN;
if (writable)
state->pfds[idx].events |= POLLOUT;
}
int
main(int argc, char **argv)
{
struct everything state;
memset(&state, 0, sizeof(state));
strlcpy(state.servers[0].name, argv[1], sizeof(state.servers[0].name));
state.servers[0].status = 1;
state.nservers = 1;
for (int i = 0; i < 4; i++)
state.pfds[i].fd = -1;
ares_library_init(ARES_LIB_INIT_ALL);
struct ares_options options;
memset(&options, 0, sizeof(options));
int optmask = 0;
options.flags = ARES_FLAG_EDNS | ARES_FLAG_DNS0x20;
optmask |= ARES_OPT_FLAGS;
options.sock_state_cb = socket_cb;
options.sock_state_cb_data = &state;
optmask |= ARES_OPT_SOCK_STATE_CB;
ares_channel_t *channel;
ares_init_options(&channel, &options, optmask);
ares_fd_events_t ares_fds[1];
while (1) {
printf("top of loop\n");
for (int i = 0; i < state.nservers; i++) {
printf("processing server %d\n", i);
struct server *server = &state.servers[i];
switch (server->status) {
case 1:
{
struct ares_addrinfo_hints hints;
memset(&hints, 0, sizeof(hints));
hints.ai_family = AF_UNSPEC;
hints.ai_flags = ARES_AI_CANONNAME;
ares_getaddrinfo(channel, argv[1], NULL, &hints, addrinfo_cb, server);
server->status = 2;
}
break;
case 2:
printf("woke up while working\n");
break;
case 3:
printf("got it, done: %s -> %s\n", server->name, server->ip);
return 0;
}
}
if (state.npfds == 0) {
printf("confused. nothing to poll\n");
return 1;
}
int res = poll(state.pfds, 4 /* state.npfds */, 2000);
printf("poll results: %d\n", res);
if (res > 0) {
ares_fd_events_t events[4];
int nevents = 0;
for (int i = 0; i < 4 /* state.npfds */; i++) {
if (!state.pfds[i].revents)
continue;
events[nevents].fd = state.pfds[i].fd;
events[nevents].events = 0;
if (state.pfds[i].revents & (POLLERR|POLLHUP|POLLIN))
events[nevents].events |= ARES_FD_EVENT_READ;
if (state.pfds[i].revents & (POLLOUT))
events[nevents].events |= ARES_FD_EVENT_WRITE;
nevents++;
}
ares_process_fds(channel, events, nevents, 0);
}
}
}
It’s okay, but the callbacks are annoying. Notifying me which descriptors need watching means I’m required to pack up my poll structure so I can access it in the callbacks, etc. Everything gets bound just a little bit tighter.
wadns
Among the
alternatives
the c-ares project helpfully lists, is
dns.c
. This sounds enticing.
On the downside, it’s not clear where the demo code stops and the functional code begins. As in, there’s a getaddrinfo sample, but it incorporates a lot of other code that doesn’t seem to be public. The public header doesn’t actually expose a means to interface with an event loop. The code is meant to be integrated into a project, which is understandable and even advantageous, but it means no demo today.
asr
The asr code was written for smtpd in OpenBSD. It doesn’t use threads and requires the caller to push events. Unfortunately, a portable version currently only exists in the
OpenSMTPD
repo. On the plus side, it’s used as the basis for the libc resolver in OpenBSD, which means the “sample” code to replace
getaddrinfo
literally is getaddrinfo.c.
I rewrote the c-ares demo to use asr. It comes out quite a bit shorter, and I think clearer as well.
asr code
#include <sys/types.h>
#include <sys/socket.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <poll.h>
#include <netdb.h>
#include <asr.h>
#include <arpa/inet.h>

struct server {
	char name[32];
	char ip[16];
	int status;
	struct asr_query *aq;
	int ar_fd;
};

int
main(int argc, char **argv)
{
	struct server servers[1] = {};
	strlcpy(servers[0].name, argv[1], sizeof(servers[0].name));
	servers[0].status = 1;
	int nservers = 1;

	while (1) {
		struct pollfd pfds[4];
		int npfds = 0;

		printf("top of loop\n");
		for (int i = 0; i < nservers; i++) {
			printf("processing server %d\n", i);
			struct server *server = &servers[i];
			switch (server->status) {
			case 1:
			{
				struct addrinfo hints;
				memset(&hints, 0, sizeof(hints));
				hints.ai_family = AF_UNSPEC;
				hints.ai_socktype = SOCK_STREAM;
				server->aq = getaddrinfo_async(server->name, "80", &hints, NULL);
				server->status = 2;
			}
			// fallthrough
			case 2:
			{
				printf("ready to run\n");
				struct asr_result ar;
				int rv = asr_run(server->aq, &ar);
				switch (rv) {
				case 0:
					pfds[npfds].fd = ar.ar_fd;
					pfds[npfds].events = 0;
					if (ar.ar_cond == ASR_WANT_READ)
						pfds[npfds].events = POLLIN;
					else
						pfds[npfds].events = POLLOUT;
					npfds++;
					server->ar_fd = ar.ar_fd;
					server->status = 3;
					break;
				case 1:
				{
					struct addrinfo *res;
					for (res = ar.ar_addrinfo; res; res = res->ai_next) {
						if (res->ai_family == AF_INET) {
							struct sockaddr_in *in_addr = (void *)res->ai_addr;
							inet_ntop(res->ai_family, &in_addr->sin_addr, server->ip, sizeof(server->ip));
						}
					}
					server->status = 4;
				}
				break;
				}
			}
			break;
			case 3:
				printf("woke up while working\n");
				break;
			case 4:
				printf("got it, done: %s -> %s\n", server->name, server->ip);
				return 0;
			}
		}
		if (npfds == 0)
			continue;

		int res = poll(pfds, npfds, 2000);
		printf("poll results: %d\n", res);
		if (res > 0) {
			for (int i = 0; i < npfds; i++) {
				if (!pfds[i].revents)
					continue;
				for (int j = 0; j < nservers; j++) {
					if (pfds[i].fd == servers[j].ar_fd)
						servers[j].status = 2;
				}
			}
		}
	}
}
I like this API. It’s very much like
read
or
write
in that it either gives you an answer, or tells you to come back later, and then it’s up to you to decide when that is.
Posted 25 Sep 2025 18:33 by tedu Updated: 25 Sep 2025 18:33
Tagged:
c
programming
EFF and 12 Organizations Urge UK Politicians to Drop Digital ID Scheme Ahead of Parliamentary Petition Debate
Electronic Frontier Foundation
www.eff.org
2025-12-12 16:48:47
The UK Parliament convened earlier this week to debate a
petition
signed by almost 2.9 million people calling for an end to the
government’s plans
to roll out a national digital ID. Ahead of that debate, EFF and 12 other civil society organizations
wrote to politicians in the country
urging MPs to reject the Labour government’s newly announced digital ID proposal.
The UK’s Prime Minister Keir Starmer
pitched the scheme
as a way to “cut the
faff
” in proving people’s identities by creating a virtual ID on personal devices with information like names, date of birth, nationality, photo, and residency status to verify their right to live and work in the country.
But the case for digital identification has not been made.
As
we detail
in our joint briefing, the proposal follows a
troubling global trend
: governments introducing expansive digital identity systems that are structurally incompatible with a rights-respecting democracy. The UK’s plan raises six interconnected concerns:
Mission creep
Infringements on privacy rights
Serious security risks
Reliance on inaccurate and unproven technologies
Discrimination and exclusion
The deepening of entrenched power imbalances between the state and the public.
Digital ID schemes don’t simply verify who you are—they redefine who can access services and what those services look like. They become a gatekeeper to essential societal infrastructure, enabling governments and state agencies to close doors as easily as they open them. And they
disproportionately harm
those already at society’s margins, including people seeking asylum and undocumented communities, who already face heightened surveillance and risk.
Even the strongest recommended safeguards cannot resolve the
core problem
: a mandatory digital ID scheme that shifts power dramatically away from individuals and toward the state. No one should be coerced—technically or socially—into a digital system in order to participate fully in public life. And at a time when almost 3 million people in the UK have called on politicians to reject this proposal, the government must listen to people and say no to digital ID.
Senator endorses discredited book that claims chemical treats autism, cancer
For years, Sen. Ron Johnson has been spreading conspiracy theories and misinformation about COVID-19 and the safety of vaccines.
He’s promoted
disproven treatments for COVID-19
and claimed, without evidence, that athletes are “dropping dead on the field” after getting the COVID-19 vaccination. Now the Wisconsin politician is endorsing a book by a discredited doctor promoting an unproven and dangerous treatment for autism and a host of ailments: chlorine dioxide, a chemical used for disinfecting and bleaching.
The doctor, Pierre Kory, has said there’s a globally coordinated campaign by public health agencies, the drug industry and the media to suppress evidence of the medicinal wonders of chlorine dioxide. His book, according to its website, contends that the “remarkable molecule” works “to treat everything from cancer and malaria to autism and COVID.”
The book jacket features a prominent blurb from Johnson calling the doctor’s treatise: “A gripping tale of corruption and courage that will open eyes and prompt serious questions.”
Chlorine dioxide is a chemical compound that has a range of applications, including as a disinfectant and deodorizer. Food processing plants apply it to sanitize surfaces and equipment. Hospitals use it to sterilize medical devices, and some municipalities use low levels to treat public water supplies. Paper mills rely on it to whiten wood pulp. Safety experts advise those who handle it to work in well-ventilated spaces and to wear protective gloves.
Concentrations in drinking water systems higher than 0.8 milligrams per liter can be harmful, especially to infants, young children and fetuses, according to the Environmental Protection Agency.
Still, for many years people in online discussion groups have been promoting the use of chlorine dioxide in a mixture that they call a “miracle mineral solution,” ingested to rid people of a host of maladies. The Food and Drug Administration has warned that drinking these chlorine dioxide mixtures can
cause injury and even death
.
It is not medicinal, despite Kory’s contention. “It is all lunacy. Absolutely, it’s 100% nonsense,” said Joe Schwarcz, director of McGill University’s Office for Science and Society in Montreal and an expert on the
threat of pseudoscience.
Schwarcz has
written articles
about the so-called miracle mineral solution, calling it “a poison” when it’s in high concentrations.
The cover of the paperback version of “The War on Chlorine Dioxide” features a quote from Sen. Ron Johnson.
Bella Luna Press
Kory’s book, set to be released to the public in January, argues that word of chlorine dioxide’s effectiveness has been suppressed by government and medical forces that need people to remain perpetually ill to generate large profits. The use of the word “war” in the title is fitting, Kory
said in a recent online video
on his co-author’s Substack. “In the book I detail many, many assassination attempts of doctors who try to bring out knowledge around chlorine dioxide,” he said.
Johnson confirmed to ProPublica in an email that he authorized the statement on the cover. “After reading the entire book, yes I provided and approved that blurb,” he said. “Have you read the book?”
ProPublica asked Kory and his co-author, Jenna McCarthy, to provide an advance copy, an interview and responses to written questions. Kory did not respond. McCarthy wrote in an email to ProPublica that she was addressing some of the questions on her Substack. (She did not send a book or agree to an interview.)
The book “is a comprehensive examination of the existing evidence and a plea for open-minded inquiry and rigorous research,” she wrote on Substack. She dismissed warnings about chlorine dioxide’s toxicity in high concentrations, writing: “Everything has a toxic dose — including nutmeg, spinach, and tap water.”
She said that chlorine dioxide is being studied in controlled settings by researchers in the United States and Latin America and that “the real debate is how it should be used, at what dose, and in which clinical contexts.”
Johnson did not agree to an interview and did not answer questions emailed to his office by ProPublica, including whether he views chlorine dioxide as a world-changing medical treatment and whether he believes the FDA warnings are false.
“It’s Called Snake Oil”
Johnson has been an advocate of Kory’s for years, calling the doctor as an expert witness in two 2020
Senate hearings.
In one, Kory championed taking the drug ivermectin, an antiparasite medicine, to treat COVID-19.
In 2021,
an analysis of data
from clinical trials concluded that ivermectin could reduce deaths from COVID-19 and may produce other positive effects. McCarthy cited that analysis in her Substack response.
In 2022, however, the American Journal of Therapeutics, which had published the study,
warned that suspicious data
“appears to invalidate the findings” regarding ivermectin’s potential to decrease deaths.
In 2024 the American Board of Internal Medicine, which credentials physicians in certain specialties, revoked Kory’s certifications in internal medicine, pulmonary disease and critical care for making false and misleading public statements about the ability of ivermectin to treat COVID-19. Hospitals and many insurance networks typically require doctors to be board certified.
Kory vigorously fought the disciplinary action, arguing to the ABIM that he provided substantial medical and scientific evidence to support his
recommendations
for addressing COVID-19, though not the “consensus-driven” approach. He also sued the board in federal court, citing his free speech rights in a case that is still progressing in the 5th U.S. Circuit Court of Appeals. On Substack, McCarthy excoriated the ABIM, saying it “bullies physicians” and “enforces ideological conformity.”
In 2022, Johnson and Kory penned
a Fox News op-ed
opposing
a California bill
that would strip doctors’ licenses for espousing misinformation about COVID-19. The bill became law but
was repealed
after a court fight. A federal judge found the statute’s definition of misinformation
to be too vague
, which could infringe on doctors’ right to free speech.
Johnson, who has been in Congress since 2011, has a history of advocating for experimental treatments and viewing the government as an impediment. Dr. Peter Lurie, president and executive director of the Center for Science in the Public Interest, a public health advocacy group, said that among members of Congress, Johnson was “an early adopter of anti-science ideas.”
Lurie said that Johnson is no longer an outlier in Washington, which now has many more elected lawmakers whom he considers anti-science. “What may have started off as the cutting edge of an anti-science movement has now turned into a much more broader-based movement that is supported by millions of people,” he said.
Earlier this year, Johnson held a hearing highlighting a flawed study claiming that
vaccinated children
had an increased rate of serious chronic diseases when compared to children who were not vaccinated. The conclusion questions the scientific consensus that vaccines are safe. The
study’s researchers
chose not to publish it because of problems they found in their data and methodology.
HHS did not respond to requests from ProPublica about Kennedy’s views on chlorine dioxide. At his confirmation hearing, Kennedy praised President Donald Trump for his wide search for a COVID-19 remedy in his first term, which Kennedy said included vaccines, various drugs, “even chlorine dioxide.”
Kory’s publisher is listed as Bella Luna Press, which has issued at least two other titles by McCarthy. “Thanks to the Censorship Industrial Complex, you won’t find
The War on Chlorine Dioxide
on Amazon or at Barnes & Noble. We had to design and build this website, figure out formatting and printing and shipping, and manage every aspect of order processing ourselves,” the book’s website states. (A representative for Bella Luna could not be reached for comment.)
As this new book is released, the autism community is also grappling with another controversy: the unsubstantiated assertion by Kennedy that Tylenol use by pregnant women poses an increased risk of autism. In addition, under Kennedy, the Centers for Disease Control and Prevention revised its website in November to cast doubt on the long-held scientific conclusion that childhood vaccines do not cause autism.
Some parents of children with autism, desperate for a remedy, have long reached for
dubious and at times dangerous panaceas,
including hyperbaric oxygen chambers and chelation therapy, used for the treatment of heavy metal poisoning. Neither method has been proven effective.
Helen Tager-Flusberg, director of the Center for Autism Research Excellence at Boston University, said Johnson has “acted extremely irresponsibly” in lending his name to a book making claims about chlorine dioxide treating autism.
“Wisconsin is filled with experts — clinical experts, medical experts, scientists — who understand and have studied autism and treatments for autism for many many years,” she said. “He’s chosen to completely ignore the clinical and the scientific community.”
People with autism may take medication to reduce anxiety, address attention problems, or reduce severe irritability. Many benefit from behavioral interventions and special education services to help with learning and functional abilities. But there is no cure, said Tager-Flusberg.
Referring to chlorine dioxide, she said: “We have had examples of this probably throughout the history of medicine. There’s a word for this, it’s called snake oil.”
In her response on Substack to ProPublica, McCarthy wrote that “chlorine dioxide is being used to treat (nobody said ‘cure’) autism with life-changing results.”
The Search for Miracle Cures
The mother of an autistic son, Melissa Eaton of
North Carolina
, heard Kory reference his book in early November on The HighWire, an internet talk show hosted by Del Bigtree, a prominent vaccine skeptic and former
communications director
for Kennedy’s 2024 presidential campaign. She then looked up the book online and noticed Johnson’s endorsement.
Eaton for many years has worked to expose people who peddle chlorine dioxide and to report apparent injuries to authorities. She monitors social media forums where parents discuss giving it to their children orally or via enemas. Sometimes the families reveal that their children are sick. “They’re throwing up and vomiting and having diarrhea and rashes,” Eaton said.
Some adherents advise parents that the disturbing effects indicate that the treatment is working, ridding the body of impurities, or that the parents should alter the dosage.
“Most of these kids are nonverbal,” Eaton said. “They’re not able to say what’s hurting them or what’s happening to them. The parents feel they’re doing the right thing. That’s how they view this: They’re helping to cure autism.”
The idea that chlorine dioxide can be a miracle cure began to spread about 20 years ago when a gold prospector, Jim Humble, wrote a book claiming his team in Guyana fell ill with malaria and recovered after drinking safe amounts of chlorine dioxide.
Humble later co-founded a “health and healing” church in Florida with a man named Mark Grenon, who called himself an archbishop and sold a chlorine dioxide solution as a cure for COVID-19. They described it as a “miracle mineral solution,” or MMS.
Grenon
went to prison in 2023
for conspiring to defraud the United States by distributing an unapproved and misbranded drug. The scheme took in more than $1 million, according to prosecutors.
An affidavit in the case filed by a special agent with the FDA Office of Criminal Investigations noted: “FDA has received numerous reports of adverse reactions to MMS. These adverse reactions include hospitalizations, life-threatening conditions, and death.”
Grenon, who is now out of prison, told ProPublica that he too is writing a book about chlorine dioxide. “My book will tell the truth.” He declined further comment.
Chlorine dioxide is currently used in many ways that are not harmful. It is found in some consumer products like mouthwashes, but it is not meant to be swallowed in those instances. (One popular mouthwash warns to “keep out of reach of children.”) It’s also available to consumers in do-it-yourself packages where they combine drops from two bottles of different compounds — commonly sodium chlorite and hydrochloric acid — and add it to water. Hikers often carry the drops, or tablets, using small amounts to make quarts of fresh water potable.
But numerous online shoppers post product reviews that go further, referring to it as a tonic. Various online guides, some aimed at parents of autistic children, recommend a shot-glass-size dose, sometimes given multiple times a day and even hourly. That can far exceed the threshold the EPA considers safe.
McCarthy, addressing ProPublica on Substack, wrote: “You point to various online guides that offer what could be considered dangerous dosing instructions. We agree, the internet is a terrifying wasteland of misinformation and disinformation.”
In the Substack video, Kory said he felt compelled to spread the word about chlorine dioxide much as he did about ivermectin, even though it cost him professionally.
He no longer has a valid medical license in Wisconsin or California, where he did not renew them, according to the Substack post. His medical licenses in New York and Michigan are active.
“I like to say I was excommunicated from the church of the medical establishment,” he said in the Substack video. As a result, he said, he turned to telehealth and started a practice.
In the Nov. 6 HighWire episode hosted by Bigtree, the discussion included talk not just of chlorine dioxide’s medicinal potential but also of how cheap and easy it is to obtain.
“On Amazon, it’s literally, you get two bottles, well, it comes in two,” Kory started to explain, before stopping that train of thought.
“I wouldn’t know how to make it,” he said.
Secondary school maths showing that AI systems don't think
At a time when many young people are using AI for personal and learning purposes, schools are trying to figure out what to teach about AI and how (find out more in this
summer 2025 data about young people’s usage of AI in the UK
). One aspect of this is how technical we should get in explaining how AI works, particularly if we want to debunk naive views of the capabilities of the technology, such as that AI tools ‘think’. In this month’s research seminar, we found out how AI contexts can be added to current classroom maths to make maths more interesting and relevant while teaching the core concepts of AI.
Research Associate Stephan Kindler (Karlsruhe Institute of Technology (KIT), Germany) and Sarah Schönbrodt talked about how maths already taught in secondary schools can be used to demystify AI. At first glance, this seems difficult to do, as it is often assumed that school-aged learners will not be able to understand how these systems work. This is especially the case for artificial neural networks, which are usually seen as a black box technology — they may be relatively easy to use, but it’s not as easy to understand how they work. Despite this, the Austrian and German team have developed a clear way to explain some of the fundamental elements of AI using school-based maths.
Sarah Schönbrodt started by challenging us to consider that learning maths is an essential part in developing AI skills, as:
AI systems using machine learning are data-driven and are based on mathematics, especially statistics and data
Authentic machine learning techniques can be used to bring to life existing classroom maths concepts
Real and relevant problems and associated data are available for teachers to use
A set of workshops for secondary maths classrooms
Sarah explained how the CAMMP team have developed a range of teaching and learning materials on AI (and beyond) with an overall goal to “allow students to solve authentic, real and relevant problems using mathematical modeling and computers”.
She reflected that much of school maths is set in contexts that are abstract, and may not be very interesting or relevant to students. Therefore, introducing AI-based contexts, which are having a huge impact on society and students’ lives, is both an opportunity to make maths more engaging and also a way to demystify AI.
Old-fashioned contexts are often used to teach classroom maths concepts. Those same concepts could be taught using real-world AI contexts. (Slide from the researchers’ presentation.)
Workshops designed and researched by the team include contexts such as privacy in social networks to learn about decision trees, personalised Netflix recommendations to learn about k-nearest neighbour, word predictions to learn about N-Grams, and predicting life expectancy to learn about regression and neural networks.
Learning about classification models: traffic lights and the support vector machine
For the seminar, Sarah walked through the steps to learn about support vector machines. This is an upper secondary workshop for students aged 17 to 18 years old. The context of the lesson is an image problem — specifically, classifying the data representing the colours of a simplified traffic light system (two lights to start with) to work out if a traffic light is red or green.
She walked through each of the steps of the maths workshop:
Plotting data points of two classes, the representation of green and red traffic lights
Finding a line that best separates the data points of both classes
Figuring out what ‘best’ means
Classifying the data points in relation to the chosen (separating) line
Validating the model statistically to see if it is useful in classifying new data points, including using test data and creating a contingency table (also called a confusion matrix); a small code sketch of the classification and validation steps follows this list
Discussing limitations, including social and ethical issues
Explaining how three traffic lights can be expressed as three-dimensional data by using planes
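To make the classification and validation steps concrete, here is a small sketch in TypeScript. It is not taken from the CAMMP notebooks (which are Jupyter-based); the data points and the separating line are invented purely for illustration.

// Invented two-light readings (x, y) with their true colour.
type Point = { x: number; y: number; label: "green" | "red" };

const testData: Point[] = [
  { x: 0.9, y: 0.1, label: "red" },
  { x: 0.8, y: 0.2, label: "red" },
  { x: 0.45, y: 0.55, label: "red" }, // sits close to the boundary
  { x: 0.1, y: 0.9, label: "green" },
  { x: 0.2, y: 0.7, label: "green" },
];

// A hand-picked separating line w1*x + w2*y + b = 0; one side is classified
// as "red", the other as "green".
const w1 = 1, w2 = -1, b = 0;

function classify(p: Point): "green" | "red" {
  return w1 * p.x + w2 * p.y + b > 0 ? "red" : "green";
}

// Contingency table (confusion matrix): rows are the true labels,
// columns are the predicted labels.
const table = { red: { red: 0, green: 0 }, green: { red: 0, green: 0 } };
for (const p of testData) {
  table[p.label][classify(p)] += 1;
}
console.log(table);
// The borderline red point is wrongly predicted as green, which is exactly
// the costly kind of error students are asked to weigh when deciding what
// the "best" separating line is.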
By classifying green and red traffic light data, students are learning about lines, classifying data, and considering limitations. (Slide from the researchers’ presentation.)
Throughout the presentation, Sarah pointed out where the maths taught was linked to the Austrian and German mathematics curriculum.
Learning about planes, separating planes, and starting to see how data can be represented in vectors. (Slide from the researchers’ presentation.)
Learning about social and ethical issues
Learning about the social and ethical issues in data-driven systems. (Slide from the researchers’ presentation.)
As well as learning about lines, planes, distances, dot product and statistical measures, learners are also engaged in discussing the social and ethical issues of the approach taken. They are encouraged to think about bias, data diversity, privacy, and the impact of errors on people. For example, if the model wrongly predicts a light as green when it is red, then an autonomous car would run through a red traffic light. This would likely be a bigger consequence than stopping at a green traffic light that was mis-predicted as red. So should the best line reduce this kind of error?
To teach the workshops, Sarah explained they have developed interactive Jupyter notebooks, where no programming skills are needed. Students fill in the gaps of example code, explore simulations, and write their ideas for discussion for the whole class. No software needs to be installed, feedback is direct, and there are in-depth tasks and staggered hints.
Learning about regression models: Weather forecasting and the toy artificial neural network
Stephan went on to introduce artificial neural networks (ANNs), which are the basis of generative AI applications like chatbots and image generation systems. He focused on regression models, such as those used in weather forecasting.
ANNs are very complex. Therefore, to start to understand the fundamentals of this technology, he introduced a ‘toy ANN’ with one input, three nodes, and one output. A function is performed on the input data at each node. With the toy network, the team wants to tackle a major and common misconception among students: that ANN systems learn, recognise, see, and understand, when really it’s all just maths.
Tackling misconceptions about ANNs by exploring how they work in a toy version. (Slide from the researchers’ presentation.)
The learning activity starts by looking at one node with one input and one output, which can be described as a mathematical function: a concatenation of two functions (in this case a linear function and an activation function). Stephan shared an
online simulator
that visualises how the toy neural network can be explored as students change two parameters (in this case, weight and bias of the functions). Students then look at the overall network, and the way that the output from the three nodes is combined. Again, they can explore this in the simulator. Students compare simple data about weather prediction to the model, and discover they need more functions — more nodes to better fit the data. The activity helps students learn that ANN systems are just highly adjustable mathematical functions that, by adding nodes, can approximate relationships in a given data set. But the approximation only works in the bounds (intervals) in which data points are given, showing that ANNs do not ‘understand’ or ’know’ — it’s just maths.
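A network this small can be written out in a handful of lines. The sketch below uses invented weights and biases and assumes a ReLU activation, so it is only an illustration of the idea, not the team’s simulator:

// One input, three hidden nodes, one output. Each hidden node is the
// concatenation of a linear function and an activation function.
const activation = (z: number): number => Math.max(0, z); // assumed: ReLU

type Neuron = { weight: number; bias: number };

const hidden: Neuron[] = [
  { weight: 1.5, bias: -0.5 },
  { weight: -0.8, bias: 0.2 },
  { weight: 0.3, bias: 0.1 },
];
const outputWeights = [0.7, -1.2, 0.5];
const outputBias = 0.05;

// The whole network is just one highly adjustable mathematical function of x.
function toyANN(x: number): number {
  const hiddenOutputs = hidden.map((n) => activation(n.weight * x + n.bias));
  return hiddenOutputs.reduce(
    (sum, h, i) => sum + outputWeights[i] * h,
    outputBias,
  );
}

// Purely illustrative: "predict" tomorrow's temperature from today's.
console.log(toyANN(12.0));
// Changing the weights and biases only changes which function of x is
// computed; nothing in this pipeline learns, sees, or understands.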
Stephan finished by explaining the mutual benefits of AI education and maths education. He suggested maths will enable a deeper understanding of AI, give students a way to realistically assess the opportunities and risks of AI tools, and show them the role that humans have in designing AI systems. He also explained that classroom maths education can benefit from incorporating AI contexts. This approach highlights how maths underpins the design and understanding of everyday systems, supports more effective teaching, and promotes an interdisciplinary way of learning across subjects.
Some personal reflections — which may not be quite right!
I have been researching the teaching of AI and machine learning for around five years now, since before ChatGPT and other similar tools burst on the scene. Since then, I have seen an increasing number of resources to teach about the social and ethical issues of the topic, and there are a bewildering number of learning activities and tools for students to train simple models. There are frameworks for the data lifecycle, and an emerging set of activities to follow to prepare data, compare model types, and deploy simple applications. However, I felt the need to understand and to teach about, at a very simple level, the basic building blocks of data-driven technologies. When I heard the CAMMP team present their work at the
AIDEA conference in February 2025
, I was entirely amazed and I asked them to present here at our research seminar series. This was a piece of the puzzle that I had been searching for — a way to explain the
‘bottom of the technical stack of fundamental concepts’
. The team is taking very complex ideas and reducing them to such an extent that we can use secondary classroom maths to show that AI is not magic and AI systems do not think. It’s just maths. The maths is still hard, and teachers will still need the skills to carefully guide students step by step so they can build a useful mental model.
I think we can simplify these ideas further, and create unplugged activities, simulations, and ways for students to explore these basic building blocks of data representation, as well as classification and representing approximations of complex patterns and prediction. I can sense the beginnings of new ideas in computational thinking, though they’re still taking shape. We’re researching these further and will keep you updated.
Finding out more
If you would like to find out more about the CAMMP resources, you can
watch the seminar recording
,
look at the CAMMP website
or try out their online materials. For example, the team shared a link to the
Jupyter notebooks
they use to teach the workshops they demonstrated (and others). You can use these with a username of ‘cammp_YOURPSEUDONYM’, where you can set ‘YOURPSEUDONYM’ to any letters, and you can choose any password. They also shared their
toy ANN simulation
.
The CAMMP team are not the only researchers who are investigating how to teach about AI in maths lessons. You can find a set of
other research papers here
.
Join our next seminar
In our current seminar series, we’re exploring teaching about AI and data science. Join us at our last seminar of the series on Tuesday, 27 January 2026 from 17:00 to 18:30 GMT to hear Salomey Afua Addo talk about using
unplugged approaches to teach about neural networks
.
To sign up and take part, click the button below. We’ll then send you information about joining. We hope to see you there.
One of the key components in the kernel's development process is the
linux-next repository. Every day, a large number of branches, each
containing commits intended for the next kernel development cycle, is
pulled into linux-next and integrated. If there are conflicts between
branches, the linux-ne...
String Theory Inspires a Brilliant, Baffling New Math Proof
Years ago, an audacious Fields medalist outlined a sweeping program that, he claimed, could be used to resolve a major problem in algebraic geometry. Other mathematicians had their doubts. Now he says he has a proof.
Introduction
In August, a team of mathematicians posted a paper claiming to solve a major problem in algebraic geometry — using entirely alien techniques. It instantly captivated the field, stoking excitement in some mathematicians and skepticism in others.
The result deals with polynomial equations, which combine variables raised to powers (like y = x or x² − 3xy = z²). These equations are some of the simplest and most ubiquitous in mathematics, and today, they’re fundamental to lots of different areas of study. As a result, mathematicians want to study their solutions, which can be represented as geometric shapes like curves, surfaces and higher-dimensional objects called manifolds.
There are infinitely many types of polynomial equations that mathematicians want to tame. But they all fall into one of two basic categories — equations whose solutions can be computed by following a simple recipe, and equations whose solutions have a richer, more complicated structure. The second category is where the mathematical juice is: It’s where mathematicians want to focus their attention to make major advances.
But after sorting just a few types of polynomials into the “easy” and “hard” piles, mathematicians got stuck. For the past half-century, even relatively simple-looking polynomials have resisted classification.
Then this summer, the new proof appeared. It claimed to end the stalemate, offering up a
tantalizing vision for how to classify
lots of other types of polynomials that have until now seemed completely out of reach.
The problem is that no one in the world of algebraic geometry understands it. At least, not yet. The proof relies on ideas imported from the world of string theory. Its techniques are wholly unfamiliar to the mathematicians who have dedicated their careers to classifying polynomials.
Some researchers trust the reputation of one of the paper’s authors, a Fields medalist named
Maxim Kontsevich
. But Kontsevich also has a penchant for making audacious claims, giving others pause. Reading groups have sprung up in math departments across the world to decipher the groundbreaking result and relieve the tension.
This review may take years. But it’s also revived hope for an area of study that had stalled. And it marks an early victory for a broader mathematical program that Kontsevich has championed for decades — one that he hopes will build bridges between algebra, geometry and physics.
“The general perception,” said
Paolo Stellari
, a mathematician at the University of Milan who was not involved in the work, “is that we might be looking at a piece of the mathematics of the future.”
The Rational Approach
The effort to classify all polynomials deals with the oldest kind of math: solving equations. To solve the simple polynomial y = 2x, for instance, you just need to find values of x and y that satisfy the equation. There are infinitely many solutions to this equation, such as x = 1, y = 2. When you graph all the solutions in the coordinate plane, you get a line.
Other polynomials are harder to solve directly, and their solutions cut out more complicated, higher-dimensional shapes in space.
But for some of these equations, it turns out, there’s a really simple way to find every possible solution. Instead of separately plugging different numbers into each variable, you can get all the solutions at once by rewriting the variables in terms of a new variable, t.
Consider the polynomial x² + y² = 1, which defines a circle. Now set x equal to 2t/(1 + t²), and y equal to (1 − t²)/(1 + t²). When you plug these new formulas back into your original equation, you get 1 = 1, a statement that’s always true, no matter what t is. This means that by choosing any real-number value for t, you’ll instantly get a solution to the original polynomial. For instance, when you set t equal to 1, you get x = 2(1)/(1 + (1)²) = 1, and y = 0. And indeed, x = 1, y = 0 is a solution to the original equation: (1)² + (0)² = 1.
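For readers who want to check the algebra, substituting the parameterization into x² + y² = 1 and simplifying gives:

\[
\left(\frac{2t}{1+t^{2}}\right)^{2}+\left(\frac{1-t^{2}}{1+t^{2}}\right)^{2}
=\frac{4t^{2}+\left(1-t^{2}\right)^{2}}{\left(1+t^{2}\right)^{2}}
=\frac{t^{4}+2t^{2}+1}{\left(1+t^{2}\right)^{2}}
=\frac{\left(1+t^{2}\right)^{2}}{\left(1+t^{2}\right)^{2}}=1 .
\]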
This straightforward way of framing all your solutions is called a rational parameterization. It’s equivalent to mapping every point on the graph of your original polynomial — in this case, a circle — to a unique point on a straight line.
Any degree-1 polynomial equation — that is, any polynomial whose terms are raised to a power of at most 1 — can be parameterized like this. It doesn’t matter how many variables the equation has: It might have two variables, or 200. Once you go beyond two variables, the solutions to your polynomial equation will form complicated higher-dimensional shapes. But because the polynomial can still be parameterized, there’s a way to map every point in your high-dimensional shape to points on a particularly simple space in the same number of dimensions (like the line). This, in turn, gives you a straightforward way to compute all the polynomial’s solutions.
Similarly, any degree-2 polynomial (whose terms are raised to a power of at most 2) has a rational parameterization.
But if an equation’s degree is 3 or more, it can’t always be parameterized. It depends on how many variables the equation has.
Take a typical kind of degree-3 polynomial: elliptic curves, like y² = x³ + 1, which have only two variables. “Elliptic curves are glorious, they’re wonderful, but you can’t possibly parameterize them,” said Brendan Hassett of Brown University. There’s no simple formula for x and y that gives you all of an elliptic curve’s solutions, so there’s no way to map the curve to a straight line. “If you could, they would not be so much fun,” Hassett said.
Instead, the solutions to an elliptic curve have a far richer structure — one that’s played a vital role in number theory for centuries, and that cryptographers have taken advantage of to encode secret messages.
What about degree-3 equations with more variables, then? Are they parameterizable, or is the structure of their solutions more fun, the way it is for elliptic curves?
In 1866, the German mathematician Alfred Clebsch showed that degree-3 equations with three variables — whose solutions form two-dimensional surfaces — are usually parameterizable. More than a century later, Herbert Clemens and Phillip Griffiths published a monumental proof in which they showed that the opposite is true for most degree-3 equations with four variables. These equations, which form three-dimensional manifolds called three-folds,
are not parameterizable
: Their solutions can’t be mapped to a simple 3D space.
Many mathematicians suspected that the next polynomial to be classified — degree-3 equations with five variables (forming four-dimensional manifolds known as four-folds) — wouldn’t usually be parameterizable either. In fact, they figured that polynomials should never be parameterizable past a certain point. But Clemens and Griffiths’ techniques didn’t work for four-folds.
And so for decades, the classification effort lay dormant.
Converting a Prophet
Mathematicians were surprised when, at a conference in Moscow in the summer of 2019, Maxim Kontsevich got up to speak about classifying four-folds.
For one thing, Kontsevich is known for taking a high-level approach to mathematics, preferring to pose ambitious conjectures and sketch out broad programs, often leaving the subtler details and formal proof-writing to others. He’s described himself as something between a prophet and a daydreamer.
For the past three decades, he’s been focused on developing a program called homological mirror symmetry, which has its roots in string theory. In the 1980s, string theorists wanted to count the number of curves on high-dimensional manifolds to answer questions about how the building blocks of the universe might behave. To count the curves on a given manifold, they considered its “mirror image” — another manifold that, though very different from the original, had related properties. In particular, they found that an algebraic object associated to the mirror image, called a Hodge structure, could reveal the number of curves on the original manifold. The reverse was also true: If you count the curves on the mirror image, you’ll get information about the original manifold’s Hodge structure.
In 1994, Kontsevich sketched out a program to explain the underlying reason for this correspondence. His program also predicted that the correspondence extended to all kinds of manifolds beyond those relevant to string theory.
For now, no one knows how to prove Kontsevich’s mirror symmetry program. “It will be next-century mathematics,” he said. But over the years, he’s made partial progress toward a proof — while also exploring the program’s potential consequences.
In 2002, one of Kontsevich’s friends,
Ludmil Katzarkov
of the University of Miami, hypothesized one such consequence: that the program might be relevant to the classification of polynomial equations.
Katzarkov was familiar with Clemens and Griffiths’ 1972 proof that three-folds aren’t parameterizable. In that work, the pair looked at a given three-fold’s Hodge structure directly. They then used it to show that the three-fold couldn’t be mapped to a simple 3D space. But the Hodge structures associated with four-folds were too complicated to analyze using the same tools.
Katzarkov’s idea was to access the four-fold’s Hodge structure indirectly — by counting how many curves of a particular type lived on its mirror image. Typically, mathematicians studying the Hodge structures of four-folds don’t think about curve counts like these: They only come up in seemingly unrelated areas of math, like string theory. But if the mirror symmetry program is true, then the number of curves on the mirror image should illuminate features of the original four-fold’s Hodge structure.
In particular, Katzarkov wanted to break the mirror image’s curve count into pieces, then use the mirror symmetry program to show that there was a corresponding way to break up the four-fold’s Hodge structure. He could then work with these pieces of the Hodge structure, rather than the whole thing, to show that four-folds can’t be parameterized. If any one of the pieces couldn’t be mapped to a simple 4D space, he’d have his proof.
But this line of reasoning depended on the assumption that Kontsevich’s mirror symmetry program was true for four-folds. “It was clear that it should be true, but I didn’t have the technical ability to see how to do it,” Katzarkov said.
He knew someone who did have that ability, though: Kontsevich himself.
But his friend wasn’t interested.
Digging In
For years, Katzarkov tried to convince Kontsevich to apply his research on mirror symmetry to the classification of polynomials — to no avail. Kontsevich wanted to focus on the whole program, not this particular problem. Then in 2018, the pair, along with
Tony Pantev
of the University of Pennsylvania, worked on another problem that involved breaking Hodge structures and curve counts into pieces. It convinced Kontsevich to hear Katzarkov out.
Katzarkov walked him through his idea again. Immediately, Kontsevich discovered an alternative path that Katzarkov had long sought but never found: a way to draw inspiration from mirror symmetry without actually relying on it. “After you’ve spent years thinking about this, you see it happening in seconds,” Katzarkov said. “That’s a spectacular moment.”
Kontsevich argued that it should be possible to use the four-fold’s own curve counts — rather than those of its mirror image — to break up the Hodge structure. They just had to figure out how to relate the two in a way that gave them the pieces they needed. Then they’d be able to focus on each piece (or “atom,” as they called it) of the Hodge structure separately.
This was the plan Kontsevich laid out for his audience at the 2019 conference in Moscow. To some mathematicians, it sounded as though a rigorous proof was just around the corner. Mathematicians are a conservative bunch and often wait for absolute certainty to present new ideas. But Kontsevich has always been a little bolder. “He’s very open with his ideas, and very forward-thinking,” said
Daniel Pomerleano
, a mathematician at the University of Massachusetts, Boston, who studies mirror symmetry.
There was a major ingredient they still had no idea how to address, Kontsevich warned: a formula for how each atom would change as mathematicians tried to map the four-fold to new spaces. Only with such a formula in hand could they prove that some atom would never reach a state corresponding to a properly “simplified” four-fold. This would imply that four-folds weren’t parameterizable, and that their solutions were rich and complicated. “But people somehow got the impression that he said it was done,” Pomerleano said, and they expected a proof soon.
When that didn’t come to pass, some mathematicians began to doubt that he had a real solution. In the meantime,
Tony Yue Yu
, then at the French National Center for Scientific Research, joined the team. Yu’s fresh insights and meticulous style of proof, Kontsevich said, turned out to be crucial to the project.
When lockdowns began during the Covid pandemic, Yu visited Kontsevich at the nearby Institute for Advanced Scientific Studies in France. They relished the quiet of the deserted institute, spending hours in lecture halls, which had more blackboards, Yu recalled.
Meeting regularly with Pantev and Katzarkov over Zoom, they quickly completed the first part of their proof, figuring out precisely how to use the number of curves on a given four-fold to break its Hodge structure into atoms. But they struggled to find a formula to describe how the atoms could then be transformed.
What they didn’t know was that a mathematician who had attended Kontsevich’s lecture in Moscow —
Hiroshi Iritani
of Kyoto University — had also started pursuing such a formula. “He was enchanted by my conjecture,” Kontsevich said. “I didn’t know, but he started to work on it.”
In July 2023, Iritani
proved a formula
for how the atoms would change as four-folds were mapped to new spaces. It didn’t give quite as much information as Kontsevich and his colleagues needed, but over the next two years, they figured out how to hone it. They then used their new formula to show that four-folds would always have at least one atom that couldn’t be transformed to match simple 4D space. Four-folds weren’t parameterizable.
Still Processing
When the team posted their proof in August, many mathematicians were excited. It was the biggest advance in the classification project in decades, and hinted at a new way to tackle the classification of polynomial equations well beyond four-folds.
But other mathematicians weren’t so sure. Six years had passed since the lecture in Moscow. Had Kontsevich finally made good on his promise, or were there still details to fill in?
And how could they assuage their doubts, when the proof’s techniques were so completely foreign — the stuff of string theory, not polynomial classification? “They say, ‘This is black magic, what is this machinery?’” Kontsevich said.
“Suddenly they come with this completely new approach, using tools that were previously widely believed to have nothing to do with this subject,” said
Shaoyun Bai
of the Massachusetts Institute of Technology. “The people who know the problem don’t understand the tools.”
Bai is one of several mathematicians now trying to bridge this gap in understanding. Over the past few months, he has co-organized a “reading seminar” made up of graduate students, postdoctoral researchers and professors who hope to make sense of the new paper. Each week, a different mathematician digs into some aspect of the proof and presents it to the rest of the group.
But even now, after 11 of these 90-minute sessions, the participants still feel lost when it comes to major details of the proof. “The paper contains brilliant original ideas,” Bai said, which “require substantial time to absorb.”
Similar reading groups have been congregating in Paris, Beijing, South Korea and elsewhere. “People all over the globe are working on the same paper right now,” Stellari said. “That’s a special thing.”
Hassett likens it to Grigori Perelman’s 2003 proof of the Poincaré conjecture, which also used entirely new techniques to solve a famous problem. It was only after other mathematicians reproduced Perelman’s proof using more traditional tools that the community truly accepted it.
“There will be resistance,” Katzarkov said, “but we did the work, and I’m sure it’s correct.” He and Kontsevich also see it as a major win for the mirror symmetry program: While they’re not closer to proving it, the result provides further evidence that it’s true.
“I’m very old, and very tired,” Katzarkov said. “But I’m willing to develop this theory as long as I’m alive.”
KDE Gear 25.12 released
Linux Weekly News
lwn.net
2025-12-12 16:13:49
KDE has announced the release of KDE Gear 25.12. This release adds more "extractors" to the Itinerary travel-assistant application, improved Git support in the Kate text editor, better PDF export in Konqueror, and much more. See the changelog for all new features, improvements, and bug fixes.
Epic celebrates "the end of the Apple Tax" after court win in iOS payments case
Back in April, District Court Judge Yvonne Gonzalez Rogers
delivered a scathing judgment
finding that Apple was in “willful violation” of
her 2021 injunction
intended to open up iOS App Store payments. That contempt of court finding has now been almost entirely upheld by the Ninth Circuit Court of Appeals, a development that Epic Games’ Tim Sweeney tells Ars he hopes will “do a lot of good for developers and start to really change the App Store situation worldwide, I think.”
The ruling
, signed by a panel of three appellate court judges, affirmed that Apple’s initial attempts to charge a 27 percent fee to iOS developers using outside payment options “had a prohibitive effect, in violation of the injunction.” Similarly, Apple’s restrictions on how those outside links had to be designed were overly broad; the appeals court suggests that Apple can only ensure that internal and external payment options are presented in a similar fashion.
The appeals court also agreed that Apple acted in “bad faith” by refusing to comply with the injunction, rejecting viable, compliant alternatives in internal discussions. And the appeals court was also not convinced by Apple’s process-focused arguments, saying the district court properly evaluated materials Apple argued were protected by attorney-client privilege.
While the district court barred Apple from charging
any
fees for payments made outside of its App Store, the appeals court now suggests that Apple should still be able to charge a “reasonable fee” based on its “actual costs to ensure user security and privacy.” It will be up to Apple and the district court to determine what that kind of “reasonable fee” should look like going forward.
Speaking to reporters Thursday night, though, Epic founder and CEO Tim Sweeney said he believes those should be “super super minor fees,” on the order of “tens or hundreds of dollars” every time an iOS app update goes through Apple for review. That should be more than enough to compensate the employees reviewing the apps to make sure outside payment links are not scams and lead to a system of “normal fees for normal businesses that sell normal things to normal customers,” Sweeney said.
When Mayor-elect Zohran Mamdani takes office on January 1, he will immediately feel the crushing weight of the housing crisis, including 25 years of
astronomical rent inflation
and
350,000
New Yorkers who don't have homes. Like all mayors before him, he will not have control over
tariffs
, the cost of multifamily lending, or other global factors that shape our financialized housing system. Instead, Mayor Mamdani will confront a Rube Goldberg machine of contradictory municipal laws; messy inter- and intra-agency dynamics; complex and contradictory federal, state, and local jurisdictional hierarchies; and complicated relationships that help shape New York City's housing landscape.
Mamdani's campaign astutely identified the Rent Guidelines Board as one important piece of the housing affordability puzzle that would allow him to make the lives of
996,600 households
living in rent-stabilized apartments easier with a rent freeze. Interestingly, Eric Adams's efforts to shield his executive power with a
charter revision commission
and to generate a development frenzy for his second term with the
City of Yes
zoning reforms will make Mamdani's other housing promise—200,000 new, permanently affordable units
over the next decade
—a little bit easier to achieve. And a last-minute lawmaking push by the outgoing City Council may give the incoming administration additional tools, including an
overhauled municipal foreclosure system
and a
legal pathway
for community purchases of some multifamily buildings.
However, all housing policies and programs run into a temporal problem: the promise of an affordable apartment by 2036 is cold comfort if you are struggling to pay rent now. Even Mamdani's signature proposal to freeze stabilized rents would not go into effect until October 2026 (or
October 2027
, if Mayor Adams and
First Deputy Mayor Randy Mastro
's final "fuck you" to the city's tenants
comes to pass
).
Framework Computer had worked to keep their memory prices lower than other laptop vendors amid the ongoing memory shortages throughout the industry worldwide. But today they've finally had to cave in and increase the prices of their DDR5 memory modules for the Framework Laptop DIY Editions by 50%.
Due to ongoing price hikes for system memory and shortages throughout the supply chain, Framework raised the prices of their DDR5 memory options by 50% today for the Framework Laptop DIY Edition. Framework Computer is keeping the prior prices for existing pre-orders and is also forgoing any price changes for their pre-built laptops and the Framework Desktop. Framework also lets you order DIY laptops without any memory at all, for re-using existing modules or in case you score a deal elsewhere.
Because their memory pricing is said to be below market rates, they have also adjusted their return policy to prevent scalpers from purchasing DIY Edition laptops with memory and then returning just the laptops: the DDR5 must now be returned with any DIY laptop order return.
More details on Framework Computer needing to begin raising system memory prices can be found via the
Framework Blog
.
OpenAI Releases GPT-5.2
Daring Fireball
openai.com
2025-12-12 15:53:32
OpenAI:
In ChatGPT, GPT‑5.2 Instant, Thinking, and Pro will begin rolling
out today, starting with paid plans. In the API, they are
available now to all developers.
Overall, GPT‑5.2 brings significant improvements in general
intelligence, long-context understanding, agentic tool-calling,
and vi...
When I started building
Fedify
, an ActivityPub server framework, I ran into a problem that surprised me: I couldn't figure out how to add logging.
Not because logging is hard—there are dozens of mature logging libraries for JavaScript. The problem was that they're primarily designed for
applications
, not for libraries that want to stay unobtrusive.
I wrote about this
a few months ago
, and the response was modest—some interest, some skepticism, and quite a bit of debate about whether the post was AI-generated. I'll be honest: English isn't my first language, so I use LLMs to polish my writing. But the ideas and technical content are mine.
Several readers wanted to see a real-world example rather than theory.
The problem: existing loggers assume you're building an app
Fedify helps developers build federated social applications using the ActivityPub protocol. If you've ever worked with federation, you know debugging can be painful. When an activity fails to deliver, you need to answer questions like:
Did the HTTP request actually go out?
Was the signature generated correctly?
Did the remote server reject it? Why?
Was there a problem parsing the response?
These questions span multiple subsystems: HTTP handling, cryptographic signatures, JSON-LD processing, queue management, and more. Without good logging, debugging turns into guesswork.
But here's the dilemma I faced as a library author: if I add verbose logging to help with debugging, I risk annoying users who don't want their console cluttered with Fedify's internal chatter. If I stay silent, users struggle to diagnose issues.
I looked at the existing options. With winston or Pino, I would have to either:
Configure a logger inside Fedify (imposing my choices on users), or
Ask users to pass a logger instance to Fedify (adding boilerplate)
There's also
debug
, which is designed for this use case. But it doesn't give you structured, level-based logs that ops teams expect—and it relies on environment variables, which some runtimes like Deno restrict by default for security reasons.
None of these felt right. So I built
LogTape
—a logging library designed from the ground up for library authors. And Fedify became its first real user.
The solution: hierarchical categories with zero default output
The key insight was simple: a library should be able to log without producing any output unless the
application
developer explicitly enables it.
Fedify uses LogTape's hierarchical category system to give users fine-grained control over what they see. Here's how the categories are organized:
Category: What it logs
["fedify"]: Everything from the library
["fedify", "federation", "inbox"]: Incoming activities
["fedify", "federation", "outbox"]: Outgoing activities
["fedify", "federation", "http"]: HTTP requests and responses
["fedify", "sig", "http"]: HTTP Signature operations
["fedify", "sig", "ld"]: Linked Data Signature operations
["fedify", "sig", "key"]: Key generation and retrieval
["fedify", "runtime", "docloader"]: JSON-LD document loading
["fedify", "webfinger", "lookup"]: WebFinger resource lookups
…and about a dozen more. Each category corresponds to a distinct subsystem.
This means a user can configure logging like this:
await configure({
  sinks: { console: getConsoleSink() },
  loggers: [
    // Show errors from all of Fedify
    { category: "fedify", sinks: ["console"], lowestLevel: "error" },
    // But show debug info for inbox processing specifically
    {
      category: ["fedify", "federation", "inbox"],
      sinks: ["console"],
      lowestLevel: "debug",
    },
  ],
});
When something goes wrong with incoming activities, they get detailed logs for that subsystem while keeping everything else quiet. No code changes required—just configuration.
Request tracing with implicit contexts
The hierarchical categories solved the filtering problem, but there was another challenge: correlating logs across async boundaries.
In a federated system, a single user action might trigger a cascade of operations: fetch a remote actor, verify their signature, process the activity, fan out to followers, and so on. When something fails, you need to correlate all the log entries for that specific request.
Fedify uses LogTape's implicit context feature to automatically tag every log entry with a requestId.
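A rough sketch of what that wiring could look like is below. It is not Fedify's actual source; the withContext helper and the contextLocalStorage option are assumptions based on my reading of the LogTape documentation, so check the docs for the exact API before relying on it.

// Hedged sketch (not Fedify's actual code): tag every log entry made while
// handling a request with a requestId, using LogTape's implicit contexts.
import { AsyncLocalStorage } from "node:async_hooks";
import { configure, getConsoleSink, getLogger, withContext } from "@logtape/logtape";

await configure({
  sinks: { console: getConsoleSink() },
  loggers: [{ category: "fedify", sinks: ["console"], lowestLevel: "info" }],
  // Assumption: implicit contexts need async-local storage so the context
  // survives across await boundaries.
  contextLocalStorage: new AsyncLocalStorage(),
});

async function handleInbound(headers: Headers, body: unknown): Promise<void> {
  // Prefer a request ID the caller already set; otherwise mint one.
  const requestId = headers.get("X-Request-Id") ?? crypto.randomUUID();
  await withContext({ requestId }, async () => {
    // Every log call made (directly or indirectly) inside this callback
    // carries the requestId property without passing it around manually.
    getLogger(["fedify", "federation", "inbox"]).info("Processing incoming activity");
    // ... verify signature, dispatch handlers, enqueue fan-out, etc.
  });
}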
With this configuration, every log entry automatically includes a requestId property. When you need to debug a specific request, you can filter your logs by that requestId.
And you'll see every log entry from that request—across all subsystems, all in order. No manual correlation needed.
The
requestId
is derived from standard headers when available (
X-Request-Id
,
Traceparent
, etc.), so it integrates naturally with existing observability infrastructure.
What users actually see
So what does all this configuration actually mean for someone using Fedify?
If a Fedify user doesn't configure LogTape at all, they see nothing. No warnings about missing configuration, no default output, and minimal performance overhead—the logging calls are essentially no-ops.
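For comparison, here is roughly what the library side looks like. This is an illustrative sketch rather than Fedify's actual code, and it assumes LogTape's getLogger API and message-template style of structured logging:

// The library logs unconditionally; these calls stay cheap no-ops until the
// application opts in by configuring a sink for a matching category.
import { getLogger } from "@logtape/logtape";

const logger = getLogger(["fedify", "federation", "outbox"]);

export function deliverActivity(activityId: string, inboxUrl: string): void {
  // Structured properties rather than string concatenation, so users can
  // filter and correlate these entries later.
  logger.debug("Delivering activity {activityId} to {inboxUrl}", {
    activityId,
    inboxUrl,
  });
  // ... actual delivery logic would go here ...
}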
For basic visibility, they can enable error-level logging for all of Fedify with three lines of configuration. When debugging a specific issue, they can enable debug-level logging for just the relevant subsystem.
And if they're running in production with serious observability requirements, they can pipe structured JSON logs to their monitoring system with request correlation built in.
The same library code supports all these scenarios—whether the user is running on Node.js, Deno, Bun, or edge functions, without extra polyfills or shims. The user decides what they need.
Lessons learned
Building Fedify with LogTape taught me a few things:
Design your categories early.
The hierarchical structure should reflect how users will actually want to filter logs. I organized Fedify's categories around subsystems that users might need to debug independently.
Use structured logging.
Properties like
requestId
,
activityId
, and
actorId
are far more useful than string interpolation when you need to analyze logs programmatically.
Implicit contexts turned out to be more useful than I expected.
Being able to correlate logs across async boundaries without passing context manually made debugging distributed operations much easier. When a user reports that activity delivery failed, I can give them a single
jq
command to extract everything relevant.
Trust your users.
Some library authors worry about exposing too much internal detail through logs. I've found the opposite—users appreciate being able to see what's happening when they need to. The key is making it opt-in.
Try it yourself
If you're building a library and struggling with the logging question—how much to log, how to give users control, how to avoid being noisy—I'd encourage you to look at how Fedify does it.
LogTape
isn't trying to replace winston or Pino for application developers who are happy with those tools. It fills a different gap: logging for libraries that want to stay out of the way until users need them. If that's what you're looking for, it might be a better fit than the usual app-centric loggers.
Berlin Approves New Expansion of Police Surveillance Powers
Berlin’s regional parliament has passed a far-reaching overhaul of its “security” law, giving police new authority to conduct both digital and physical surveillance.
The CDU-SPD coalition, supported by AfD votes, approved the reform of the
General Security and Public Order Act (ASOG)
, changing the limits that once protected Berliners from intrusive policing.
Interior Senator Iris Spranger (SPD) argued that the legislation modernizes police work for an era of encrypted communication, terrorism, and cybercrime. But it undermines core civil liberties and reshapes the relationship between citizens and the state.
One of the most controversial elements is the expansion of police powers under paragraphs 26a and 26b. These allow investigators to hack into computers and smartphones under the banner of “source telecommunications surveillance” and “online searches.”
Police may now install state-developed spyware, known as trojans, on personal devices to intercept messages before or after encryption.
If the software cannot be deployed remotely, the law authorizes officers to secretly enter a person’s home to gain access.
This enables police to install surveillance programs directly on hardware without the occupant’s knowledge. Berlin had previously resisted such practices, but now joins other federal states that permit physical entry to install digital monitoring tools.
IT security experts caution that maintaining hidden system vulnerabilities for state use exposes everyone to greater cyber risk. They also question the constitutional legitimacy of combining digital espionage with physical intrusion into private homes.
The revised law also changes how police use body cameras. Paragraph 24c permits activation of bodycams inside private homes when officers believe there is a risk to life or limb.
The government presents this as a measure for officer safety, but many view it as an open door to video surveillance within citizens’ most private settings.
Paragraph 26e expands “cell tower queries,” allowing police to obtain data on every mobile phone connected to a specific tower during a chosen timeframe.
This form of data collection can identify the movements of thousands of uninvolved individuals, including people who might simply have attended a protest.
Under paragraph 24d, automatic license plate recognition systems will be used to record and cross-check vehicle plates with databases. Paragraph 24h also grants police the ability to neutralize or even take control of drones in certain situations.
Paragraph 28a introduces biometric face and voice matching, using publicly available information from the internet.
This gives Berlin’s police the ability to compare surveillance footage with images posted on social media platforms. This is a major step toward automated identification of individuals in public life.
A further innovation, paragraph 42d, authorizes the use of real investigative data, such as photos, videos, and text messages, for “training and testing” artificial intelligence systems.
This breaks the principle that data collected for one purpose cannot later be reused. Because AI models can reveal patterns from the original material, this clause risks turning police archives into training sets for machine learning systems.
The law also lengthens preventive detention periods. Under paragraph 33, individuals may now be held for up to five days, or up to seven in terrorism-related cases.
Lawmakers discussed this provision in connection with protests by the environmental group “Last Generation,” whose civil resistance actions have triggered repeated detentions.
The group NoASOG denounced the law as an attack on civil society, while the Society for Civil Rights (GFF) announced plans to prepare a constitutional complaint.
Berlin’s data protection commissioner, Meike Kamp, had already warned that approving the state trojan amounts to “a frontal attack on the IT security of all citizens.” She said the overall framework creates “a constitutionally highly questionable density of surveillance.”
Berlin now joins the list of German states that have widened police authority in recent years, but the scope of this legislation stands out. It links physical home entry, digital interception, and artificial intelligence analysis under one legal structure, reducing the barriers between policing and private life.
The range of new powers granted to police shifts the balance decisively toward state control of personal information.
Berlin is a city once known for strong privacy traditions and the ASOG reform marks a decisive moment. Whether it withstands constitutional review will determine how far Germany’s commitment to individual privacy can bend in the name of security.
Kali Linux 2025.4 released with 3 new tools, desktop updates
Bleeping Computer
www.bleepingcomputer.com
2025-12-12 15:27:16
Kali Linux has released version 2025.4, its final update of the year, introducing three new tools, desktop environment improvements, and enhanced Wayland support.
Kali Linux is a distribution designed for cybersecurity professionals and ethical hackers to perform red-teaming, penetration testing, security assessments, and network research.
The distribution is available as an installable operating system or a live environment and supports a wide range of hardware, including Raspberry Pi devices and compatible Android phones through Kali NetHunter.
New tools added to Kali Linux 2025.4
Every new Kali release brings a few fresh tools to play with, and this update is no exception.
evil-winrm-py
- Python-based tool for executing commands on remote Windows machines using the WinRM (Windows Remote Management) protocol
hexstrike-ai
- MCP server that lets AI agents autonomously run tools
Desktop environment updates
Kali Linux 2025.4 brings many new updates to its desktop environments, including Gnome 49, KDE Plasma, and Xfce.
GNOME 49 includes refreshed themes, a new Showtime video player, reorganized tool folders in the app grid, and new shortcuts for quickly opening a terminal. GNOME also entirely removes X11 support in this release, now running solely on Wayland.
The developers also added support for keyboard shortcuts to open a terminal quickly.
"Another quality-of-life improvement is the addition of a shortcut to quickly open a terminal (finally!), using Ctrl+Alt+T or Win+T - just like in our other desktops," explains the
Kali Linux 2025.4 announcement
.
Gnome app grid layout
Source: Kali
KDE Plasma has been updated to version 6.5, introducing improved window tiling, an enhanced screenshot tool, easier clipboard access, and more flexible fuzzy search in KRunner.
Xfce now supports color themes that offer functionality similar to that already available in GNOME and KDE, allowing users to adjust icons and interface colors more easily.
With GNOME now running entirely on Wayland, the Kali Linux team has added full VM guest utilities support for VirtualBox, VMware, and QEMU.
Kali NetHunter updates
Kali NetHunter received new updates with this release, including expanded device support for Android 16 on the Samsung Galaxy S10 and the OnePlus Nord, and Android 15 on Xiaomi Mi 9.
The NetHunter Terminal has also been restored with updated compatibility for Magisk versions that use interactive mode. This prevents terminal sessions from closing when pressing CTRL+C.
Wifipumpkin3 also sees enhancements, including updated phishing templates and the addition of a preview tab in the NetHunter app.
Other changes
This release also includes additional updates and improvements, including:
The Kali Live image is now distributed only via BitTorrent, as its size has grown too large for traditional HTTP downloads.
Three new community mirrors have been added in Asia and one in the United States to improve download availability.
Kali Cloud and the Kali WSL app received several behind-the-scenes improvements and reliability fixes.
This little guy doesn't have an HDMI port, Ethernet, or even USB. It's a special version of the 'Compute Module' line of boards. Little Raspberry Pi 'System on Modules' (SoMs), they're called.
Compute Modules are entire Linux computers about the size of a regular desktop CPU that you 'plug in' to another board, to give it life.
Compute modules are everywhere, in kiosks, signage,
3D printers
, and even the new
Ableton Move
. If you just need a little bit of Linux for networking and remote control, these are perfect for that.
And the CM0 is now the smallest version, a little bigger than a postage stamp.
But unlike all the other Compute Modules, the CM0 has castellated edges like a Pico. That way, a company integrating this into their product can just pick and place it and solder it onto their main PCB, instead of working with more delicate board-to-board connectors.
But why is this only in China? I'll get to that, but first I wanted to thank EDAtec for sending a CM0 and their CM0NANO dev board for testing. Without them, I don't think I'd ever be able to show these Pis to you.
Video
I posted this story to my YouTube channel, but if you're on the blog already, chances are you favor reading over video, so scroll on!
ED-CM0NANO
EDAtec's CM0NANO seems to be the official IO board for the CM0. It breaks out every feature on the RP3A0 chip at the heart of the Pi Zero 2 and CM0.
There's 10/100 Ethernet through a little USB to Ethernet chip (CoreChips SR9900A), two USB 2.0 ports, full-size HDMI, and USB-C for power and flashing the eMMC. Then there are display and camera connectors, GPIO, and a few more headers.
To flash the onboard eMMC, I had to switch the RPI_BOOT_SW switch towards the RTC battery slot, then use rpiboot to mount it on my Mac. Then I used Raspberry Pi Imager to flash Pi OS 13 on it.
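If you'd rather script that flow than click through Imager, it looks roughly like the sketch below. This is only an illustration: it assumes rpiboot (from Raspberry Pi's usbboot repo) is already installed, it uses GNU dd flags that won't exist on macOS, and the image filename and /dev/sdX device path are placeholders you'd have to replace.

```python
# Rough sketch of the eMMC flashing flow, assuming rpiboot is installed and the
# CM0's boot switch is set so the eMMC can enumerate as USB mass storage.
# IMAGE and DEVICE are placeholders -- verify the block device with lsblk
# before writing anything!
import subprocess

IMAGE = "raspios-arm64.img"   # placeholder image filename
DEVICE = "/dev/sdX"           # placeholder block device -- double-check this

# 1. Run rpiboot so the host exposes the CM0's eMMC as a mass-storage device.
subprocess.run(["sudo", "rpiboot"], check=True)

# 2. Write the OS image to the exposed device (what Raspberry Pi Imager does,
#    minus its download and verification conveniences).
subprocess.run(
    ["sudo", "dd", f"if={IMAGE}", f"of={DEVICE}", "bs=4M",
     "status=progress", "conv=fsync"],
    check=True,
)
```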
The eMMC on here is very slow compared to what I'm used to with the Pi 5 generation, like on the CM5. Its top speed seems to be around 19-20 MB/sec.
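If you want to sanity-check a number like that on your own board, a few lines of Python give a rough sequential-read figure. The device path below is an assumption (check lsblk first), the script needs root, and it ignores caching effects, so treat the result as a ballpark only.

```python
# Crude sequential-read benchmark for the eMMC (ballpark only). Assumes the
# eMMC shows up as /dev/mmcblk0 -- check with lsblk -- and must run as root.
# Run on a cold cache (or use O_DIRECT) for meaningful repeated measurements.
import time

DEVICE = "/dev/mmcblk0"         # assumed device path
CHUNK = 4 * 1024 * 1024         # 4 MiB per read
TOTAL = 256 * 1024 * 1024       # read 256 MiB in total

read_bytes = 0
start = time.monotonic()
with open(DEVICE, "rb", buffering=0) as dev:
    while read_bytes < TOTAL:
        block = dev.read(CHUNK)
        if not block:
            break
        read_bytes += len(block)
elapsed = time.monotonic() - start

print(f"Read {read_bytes / 1_000_000:.0f} MB in {elapsed:.1f} s "
      f"({read_bytes / elapsed / 1_000_000:.1f} MB/s)")
```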
Once it's flashed, it's a full Linux computer, complete with Raspberry Pi's desktop environment.
EDAtec has a firmware support package you can install from their package repository, and once that's done, I did what nobody should do on this small of a computer: fired up Chromium.
Browsing the web on here is almost completely out of the question, since it only has 512 Megs of RAM, which is so little that Chromium pops up a warning saying it should only be used with 1 GB or more of RAM!
I did try browsing this website, and it took something like a minute just to quit the browser after I'd clicked the X to close the tab over and over again!
But with WiFi, Ethernet, USB, HDMI, and everything else the Pi ecosystem has to offer, companies that just want to slap a well-supported Linux environment on top of their product (instead of integrating an SoC, memory, storage, and a wireless chip themselves) now have an option.
Global distribution possibilities
Do I think companies and makers here in the US and over in other parts of the world would also benefit from the CM0? Yes. Do I think it'll happen? Doubtful.
The Zero 2 W and CM0 share something in common, besides their entire architecture: "No plans to make it available outside China at the moment, but we'll see how we get on."
That was back before the RAM shortages got bad.
I followed up asking a Pi engineer about it, and it sounds like one big problem is the RP3A0 chip that integrates an LPDDR2 RAM chip stacked on top of the Pi's SoC.
He said the CM0 would compete with Pi Zero 2 for LPDDR2 memory, which is in shorter supply these days (it's not being produced anymore, so stocks will only become more limited over time), and they want to make sure the popular Zero 2 W can stay in stock for makers and education.
The CM0 is targeted squarely at the lower-end market, integrated into products built on assembly lines. Because of that, it's anyone's guess whether the CM0 will ever make it out of China.
I'm not doing a full review of the board here, because:
In a shocking twist, Keir Starmer’s TikToks are borderline competent
Guardian
www.theguardian.com
2025-12-12 15:08:05
The PM’s social media sortie has not been a total embarrassment, which may be a shame for him.
The scene opens on the interior of an aeroplane. A suited man in a luxurious seat looks pensively out the window, his face partially obscured, his chin delicately resting on his hand.
Dreamy synths reverberate as the camera pans to show a fighter jet, hovering above the clouds just past the plane’s wing.
It turns and flies away, its dark shadow set against the warm yellow sunset.
“I’d explain, but it’s classified,” the TikTok video’s caption reads, the username above revealing the identity of the mystery man: Keir Starmer.
In the comment section, one user puts a voice to the question on a thousand lips.
“Why is our prime minister aura farming?”
When the UK prime minister launched his TikTok account earlier this week, I assumed we’d get the same slate of cringeworthy content that so many elected officials have given us before.
Stiff line delivery, policy talking points awkwardly shoehorned into already outdated memes, and the general feeling a PR person is holding them at gunpoint just out of shot.
Alas, no. In a shocking twist, Starmer’s TikToks are borderline competent.
The majority of the videos seem to be attempts at ultra short-form cinéma vérité: a camera operator following the prime minister around, catching snippets of him saying good morning to security guards, questioning where chief mouser, Larry the cat, is and greeting the Ukrainian president, Volodymyr Zelenskyy.
The “peek behind the curtain” style is clearly designed to make the prime minister feel more relatable to young UK voters, and while there’s definitely potential here, all his videos share the same fatal flaw.
Starmer cares about looking cool.
“Aura farming” is an internet term for someone posting content trying to seem effortlessly suave, handsome or charismatic.
And look, for my own sanity, I have to assume Starmer’s team was being tongue-in-cheek when they captioned that plane video: “I’d explain, but it’s classified” – that they were poking fun at people trying to seem cool on the internet.
But the more I looked through his TikToks, with every shot so carefully curated to make Starmer seem competent and in control, the more I began to feel the accusation of “aura farming” fitted.
And on a platform such as TikTok, which trades off vulnerability and intimacy, being caught trying to seem aloof is a crime worse than murder. (Or at least worse than the “millennial pause”, and that’s pretty bad.)
There are some rare examples of politicians feeling authentically at home on the app: in the US, Alexandria Ocasio-Cortez has found great success speaking frankly to her iPhone camera from her living room couch. Even a lower-profile politician such as the Australian MP Julian Hill has cultivated a dedicated following by sharing his frustrations with the opposition from his cluttered parliamentary office.
The politicians that truly succeed on TikTok are the ones where you can suspend your disbelief just enough to believe they’re actually hitting “post” themselves. Where a little part of you is holding out hope that they might actually reply to your comment.
But Starmer never gets within a metre of the camera lens, let alone a comment section keyboard. It’s a style no doubt influenced by the fact that TikTok is technically banned on government phones due to data security concerns, and his team is presumably terrified to imply he might actually have the app downloaded.
Numbers-wise, the videos are doing well, with two of them already cracking 1m views, but that lack of intimacy comes at a cost. The comments under any politician’s post are going to be filled with far more vitriol than praise – that’s just how the internet works. What’s notable about Starmer’s is just how generic the comments are.
It’s all “get this clown out”, “vote reform” and the occasional “best prime minister ever”, but barely any mention of the actual content at hand.
Because ultimately, the videos don’t have any content – besides a fleeting sense of novelty, there’s no reason I would ever send them to friends, let alone bring them up at the pub. These videos only exist to prove what a cool guy Starmer is. And he isn’t.
So no, the UK prime minister’s first foray into the world of TikTok hasn’t been an utter embarrassment. But it might have been better if it was.
Like most politicians, Starmer is an innately dorky man and if he is really serious about winning the hearts (and votes) of young people, his TikTok needs to embrace and celebrate that, not unconvincingly hide it away.
I have some ideas for what he could do, and I would explain, but hey, it’s classified.