From profiling to kernel patch: the journey to an eBPF performance fix

Lobsters
www.rovarma.com
2025-12-13 22:02:23
Comments...
Original Article

For the Linux version of Superluminal (a CPU profiler) we make heavy use of eBPF to capture performance data. This is the story about how an innocent profiling session led to a change to the Linux kernel that makes eBPF map-in-map updates much faster.

What is eBPF

eBPF (originally “extended Berkeley Packet Filter”, though now used as a standalone term) is a powerful system in the Linux kernel that allows you to safely run custom programs directly inside the kernel. These programs can be attached to various hooks in the kernel, such as tracepoints, kprobes, or perf events. You can think of an eBPF program as C code that executes whenever a specific kernel event occurs. An example of this is the sched_switch tracepoint, which triggers on every thread context switch.
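
As a minimal sketch (not Superluminal’s actual code, and using hypothetical names), an eBPF program attached to that tracepoint could look like this with libbpf conventions:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("tracepoint/sched/sched_switch")
int handle_sched_switch(struct trace_event_raw_sched_switch *ctx)
{
	// Runs on every context switch; just log the previous and next pid.
	bpf_printk("cswitch %d -> %d", ctx->prev_pid, ctx->next_pid);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";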

Superluminal uses eBPF to collect performance data such as context switches and sampling events.

eBPF maps

Data exchange between a kernelspace eBPF program and the userspace controlling program (in our case, Superluminal) goes through eBPF “maps”. An eBPF map is a shared memory structure that acts as a bridge between kernel and userspace. Each map represents an underlying data structure; examples of map types are arrays, hash maps, ring buffers, and many more.

eBPF programs running in kernelspace can update maps to send data back to userspace. For example, Superluminal’s eBPF backend uses the ring buffer map type to output performance events (such as context switches and samples) from the eBPF program to userspace. The controlling program can also update maps from userspace to make data available for use in the kernelspace eBPF program.
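
As a hedged sketch of the kernel-to-user direction, an eBPF program might reserve space in a ring buffer map and submit an event like this (the event layout and map name are invented for illustration):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

// Ring buffer map used to stream events from the eBPF program to userspace.
struct {
	__uint(type, BPF_MAP_TYPE_RINGBUF);
	__uint(max_entries, 1 << 20); // 1 MiB of buffer space
} events SEC(".maps");

struct cswitch_event {
	__u32 prev_pid;
	__u32 next_pid;
	__u64 timestamp_ns;
};

SEC("tracepoint/sched/sched_switch")
int emit_cswitch(struct trace_event_raw_sched_switch *ctx)
{
	struct cswitch_event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
	if (!e)
		return 0; // buffer full: drop the event

	e->prev_pid = ctx->prev_pid;
	e->next_pid = ctx->next_pid;
	e->timestamp_ns = bpf_ktime_get_ns();
	bpf_ringbuf_submit(e, 0);
	return 0;
}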

As explained in a previous article, Superluminal makes use of .eh_frame data in a binary to retrieve stack backtraces when sampling. Since sampling happens in kernelspace through an eBPF program as described above, we need to upload the .eh_frame data to the eBPF program from userspace for each relevant binary so that the eBPF program can make use of the data.

The .eh_frame data is stored in an eBPF map of type BPF_MAP_TYPE_ARRAY_OF_MAPS, which essentially represents a 2D array. In C++, you could express this as a std::vector<std::vector<UnwindRow>>, where there is one entry in the outer vector per unique binary loaded in the profiled process(es) and the inner vector holds the actual unwind data for that binary.
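
A rough sketch of what such a map declaration can look like in libbpf’s BTF map syntax is shown below; the names, sizes, and inner value type are placeholders rather than Superluminal’s actual layout:

// Inner map template: the unwind rows for one binary (think std::vector<UnwindRow>).
struct inner_unwind_map {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1); // template only; actual inner maps are created from userspace
	__type(key, __u32);
	__type(value, __u64);   // placeholder for an unwind row entry
};

// Outer map: one slot per unique binary (think the outer std::vector).
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 4096);
	__type(key, __u32);
	__array(values, struct inner_unwind_map);
} unwindData SEC(".maps");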

The process to go from a binary to unwind data being available for use in eBPF is as follows:

  1. The unwind data is extracted from the .eh_frame section. This is described in the linked article, and is already very efficient.
  2. The unwind data is converted to our internal format that’s highly optimized for speed & memory efficiency.
  3. The converted unwind data is uploaded to eBPF through the bpf_map_update_elem userspace function, which inserts the unwind data for each unique binary into the outer array.

From there on, the eBPF programs can make use of the unwind data.
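
As a hedged sketch of step 3 from the userspace side (using libbpf; the helper name, key scheme, and value layout are hypothetical):

#include <bpf/bpf.h>
#include <stdint.h>

// Create an inner array for one binary, fill it with converted unwind rows,
// and insert that inner map into the outer BPF_MAP_TYPE_ARRAY_OF_MAPS.
int upload_unwind_data(int outer_map_fd, uint32_t binary_index,
                       const uint64_t *rows, uint32_t row_count)
{
	int inner_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, "unwind_rows",
	                              sizeof(uint32_t), sizeof(uint64_t),
	                              row_count, NULL);
	if (inner_fd < 0)
		return inner_fd;

	for (uint32_t i = 0; i < row_count; i++)
		bpf_map_update_elem(inner_fd, &i, &rows[i], BPF_ANY);

	// For map-in-map types the "value" is the inner map's file descriptor.
	// This is the call whose cost the rest of this article investigates.
	return bpf_map_update_elem(outer_map_fd, &binary_index, &inner_fd, BPF_ANY);
}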

Performance problems are never where you think they are

It is important that the unwind data is made available to eBPF as soon as possible, since the eBPF code won’t be able to unwind callstacks before the unwind data has been uploaded. To lower this latency as far as possible, we use various mechanisms, one of which is precaching the unwind data before profiling starts. This is done by enumerating the needed binaries (i.e. the main executable, and shared libraries it depends on) for each relevant process and then extracting, converting and uploading the unwind data for each binary to eBPF.

We saw in the previous article that the extract step took much longer than expected, which caused this precaching step to take much longer than we wanted. After optimizing that part, the precache step was much faster, but still much slower than we’d expected it to be.

Fortunately, we happen to be developing a CPU profiler, and what’s the point of that if you’re not going to use it? So let’s profile the profiler to see what’s going on.

A profile of this part of the capturing process looks like this:

If you’re not familiar with Superluminal, this shows the wall-clock timeline for each thread in the process. A green color means the thread is executing code at that point; any other color means it’s waiting on something (e.g. a lock, IO, network, etc.).

In this test, there are about 1400 binaries that need to be precached, and the profile shows that this step takes ~830ms end-to-end. The actual work of precaching is spread over the available CPUs using our job scheduler: a job is started for each binary, where each job does the extract/convert/upload for that binary, and then inserts the uploaded data into the outer map.

I’m testing on a machine with 32 logical CPUs, so while 830ms may seem like it’s not worth worrying about, it actually represents ~25 seconds of work spread across those 31 cores (32 minus 1 for the thread that starts the jobs). That feels like it’s way too long for what this is doing, especially with the optimizations we previously made to the unwind data extraction.

We would expect most time to be taken up by the conversion process, since that does the actual work, whereas the upload should just be copying memory from user to kernelspace, and the insert into the outer map should be very fast. But looking at the timeline for the various JobScheduler threads we see surprisingly little actual work happening (i.e. green colors), some minor blips here and there, and a whole lot of waiting (i.e. red colors) instead.

Expanding one of the threads that’s spending all its time waiting and zooming in a bit, we can see what it’s doing in detail:

This is very unexpected.

Just at a glance you can immediately see that all time is being taken up by bpf_map_update_elem, highlighted in white. This function is responsible for inserting the unwind data into the outer eBPF map as described above. While there might reasonably be some overhead involved in copying data across the user/kernel boundary, this is excessive.

The function statistics show that there’s a total of 25 seconds in this function alone across all job scheduler threads, with each call taking ~18ms on average:

Function Name:    bpf_map_update_elem
Number of calls:  1363
Total time:       25s 188ms 912µs
Maximum:          82ms 870µs
Top Quartile:     18ms 940µs
Average:          18ms 480µs
Median:           17ms 708µs
Bottom Quartile:  15ms 932µs
Minimum:          123µs

We can also see that when the thread is executing this function, it is in a wait state: the thread overview at the top of the thread shows the red color. This means the function is not actually doing any work: it’s waiting on something. By clicking on the corresponding wait state (i.e. one of the red areas), we can see the callstack that caused that thread to block. In this case the stack that caused the wait looks like this, with the relevant frames highlighted:

So it looks like the bpf_map_update_elem userspace function results in a map_update_elem syscall in the kernel, which calls synchronize_rcu_normal, which is what eventually causes the thread to switch out. This is where you’d normally reach the limit of what you can do with regard to optimization, since this is all happening in kernelspace.

Linux, however, is open source, which means we can dig into the kernel source to better understand what’s going on here.

Down the rabbit hole

Let’s look at map_update_elem first. This is the implementation of the syscall that bpf_map_update_elem eventually results in. Most of the function is not that interesting, just sanity checking inputs. The actual work the function is doing looks like this:

err = bpf_map_update_value(map, fd_file(f), key, value, attr->flags);
if (!err)
    maybe_wait_bpf_programs(map);

The bpf_map_update_value function being called here is a helper function that actually updates the value for the specified key. We can see that there is no direct call to the synchronize_rcu_normal function we’re looking for, but we do see a call to maybe_wait_bpf_programs when bpf_map_update_value succeeds.

Let’s look at the code for it:

static void maybe_wait_bpf_programs(struct bpf_map *map)
{
	/* Wait for any running non-sleepable BPF programs to complete so that
	 * userspace, when we return to it, knows that all non-sleepable
	 * programs that could be running use the new map value. For sleepable
	 * BPF programs, synchronize_rcu_tasks_trace() should be used to wait
	 * for the completions of these programs, but considering the waiting
	 * time can be very long and userspace may think it will hang forever,
	 * so don't handle sleepable BPF programs now.
	 */
	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
		synchronize_rcu();
}

So we found our call to synchronize_rcu. There are a few things of note here. First of all, this call only happens when the map being updated is of type BPF_MAP_TYPE_HASH_OF_MAPS or BPF_MAP_TYPE_ARRAY_OF_MAPS. These map types are also known as “map-in-map” types. And it so happens that we’re indeed updating a map of type BPF_MAP_TYPE_ARRAY_OF_MAPS as described earlier.

It is very interesting that the call to synchronize_rcu is conditional on the type of the map being updated. If the call were unconditional, it would probably be there for a very good reason that applies to all maps. But the fact that it’s conditional means that there are code paths where this expensive call isn’t needed (i.e. for regular map types), and that might be an indication we could do something about this.

There is also a comment that explains what this code aims to achieve, though it’s hard to understand the comment without more knowledge of how eBPF works, and in particular how synchronization between userspace & kernelspace works when it comes to data structures like eBPF maps.

So let’s unpack that first.

Synchronization without waiting

As we described earlier, eBPF maps are used for bi-directional data exchange between kernel & userspace. Let’s assume we have an eBPF program that looks like this (pseudocode-ish):

// Equivalent to std::vector<std::vector<UnwindRow>> as described earlier
BPF_MAP_TYPE_ARRAY_OF_MAPS unwindData;

void ContextSwitchHandler()
{
    int key = 10; // some key uniquely identifying a particular binary

    // find the inner array for the key; equivalent to std::vector<UnwindRow>
    void* binaryUnwindData = bpf_map_lookup_elem(&unwindData, &key);

    // do something with binaryUnwindData, for example, unwind the stack
}

The question is: what would you expect to happen when the value for a key in a map (in this case 10) is updated from userspace (via bpf_map_update_elem), while there are still eBPF programs running in kernelspace that are using the “previous” value for that key (in this case binaryUnwindData)?

This kind of concurrent access to a shared data structure (in this case the eBPF map) requires some kind of synchronization between the reader (the eBPF program) and the writer (the userspace program) to prevent the reader from getting its data pulled out from under it. Without synchronization, you have the problem that when the value is updated and the old value is deleted, any readers of that old value may be left with a dangling pointer.

The way the eBPF system (and indeed, the kernel in general) deals with these kinds of synchronization issues is quite elegant.

The key insight is that the synchronization problem here isn’t that the value is updated; the problem is that the old value is deleted. Taking the example of our eBPF program above, this program could continue working with binaryUnwindData just fine, even if the value for key 10 in the map is replaced with a new value, as long as it’s guaranteed that the memory containing binaryUnwindData is not freed until after the eBPF program finishes executing.

The way the kernel makes this guarantee is in essence quite simple. Instead of deleting the old value immediately after an update, the deletion of the old value is queued on a special kernel thread. This kernel thread, typically called rcu_sched or rcu_preempt , waits for the system to reach a state where it is guaranteed that no readers are still accessing any old data. This state is called the “quiescent state”, and the time it takes for the system to reach this state is called the “grace period”. Once the system reaches this state, the kernel thread deletes any queued old values via their associated callback.

The Linux kernel calls this the Read-Copy-Update, or RCU, system. The reality of how it works is of course much more complicated than this (extremely) simplified description. For example, the way the kernel determines that the system has reached the quiescent state is quite involved.

The full details on how this system works are outside the scope of this article, but if you’re curious, see the official RCU documentation or this excellent article.

An important observation about this system is that it’s non-blocking: since the deletion is deferred, the writer doesn’t have to wait for the deletion to complete. In our case, the writer is map_update_elem (via bpf_map_update_elem), and for non-map-in-map types it returns immediately after updating the value, while the kernel handles freeing the old value at some later point in time.
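
As a generic, simplified illustration of this pattern (a kernel-style sketch, not the actual eBPF map code):

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct config {
	int value;
	struct rcu_head rcu;
};

static struct config __rcu *current_config; // assumed to be published before use

/* Reader: never blocks the writer and never sees freed memory. */
static int read_value(void)
{
	int v;

	rcu_read_lock();
	v = rcu_dereference(current_config)->value;
	rcu_read_unlock();
	return v;
}

/* Writer: publish the new version, then defer freeing the old one until
 * after the grace period. The writer itself does not block. */
static void update_value(struct config *new_cfg)
{
	struct config *old = rcu_replace_pointer(current_config, new_cfg, true);

	if (old)
		kfree_rcu(old, rcu); /* freed once no readers can still see it */
}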

Armed with this knowledge we can attempt to understand the comment in maybe_wait_bpf_programs again. The relevant part of the comment is this, stripped of the parts that aren’t relevant to understanding this issue:

Wait for any running BPF programs to complete so that userspace, when we return to it, knows that all programs that could be running use the new map value

So what this code is trying to achieve is in some ways the opposite of what bpf_map_update_elem does for non-map-in-map types.

As we just saw, for the regular case, any eBPF programs that are running concurrently with bpf_map_update_elem will continue running with whatever value they retrieved from the map, while bpf_map_update_elem immediately returns to the caller after updating the map. There is therefore no guarantee which “version” of the value for the updated key is in use at any given point in time: it could be the old value, the new value, or a mix of the two.

However, per the comment, for map-in-map types it is apparently important to guarantee that after bpf_map_update_elem returns, the old value is no longer in use: any running eBPF programs should be using the new value. But, since it is not possible to “update” (i.e. patch) already-running eBPF programs to use the new value, there is only one way for bpf_map_update_elem to achieve that guarantee, and that is by waiting for the system to reach the quiescent state we described in the previous section.

That’s exactly what synchronize_rcu does: it blocks until the system reaches that state, turning the normally asynchronous bpf_map_update_elem into a blocking operation. It is essentially a global synchronization point.

That also explains the performance issue we’re seeing. The blocking wait for the system to reach the quiescent state can take an indeterminate amount of time, and is dependent on the state of the system. This can potentially take many milliseconds (we’ve measured 8-20ms across different systems), and we’re calling it across 31 threads.

What’s happening is that we read and convert the unwind data across our job scheduler threads. This runs in parallel and takes very little time, due to previously made optimizations. All jobs then attempt to upload the unwind data they just converted at approximately the same time, and they all hit this blocking wait in bpf_map_update_elem simultaneously. The blocking waits via synchronize_rcu then finish in sequence, which serializes the upload, making the upload step effectively single threaded. After that’s done, the process repeats.

But why

So that’s the what of the performance issue we’re seeing: we’re hitting an expensive synchronization point on every update. But to determine what (if anything) we can do about this, we also need to understand the why:

  1. Why is this guarantee about the new value of the map important?
  2. Why is it apparently only important for these two types of maps, and not the many other map types?

To answer these questions, let’s look at the commit that introduced this code:

The map-in-map frequently serves as a mechanism for atomic snapshotting of state that a BPF program might record. The current implementation is dangerous to use in this way, however, since userspace has no way of knowing when all programs that might have retrieved the “old” value of the map may have completed.

This change ensures that map update operations on map-in-map map types always wait for all references to the old map to drop before returning to userspace.

…that didn’t really help. Fortunately, development on the Linux kernel happens mostly in the open, and each patch has a corresponding mailing list discussion associated with it. In this case, that discussion can be found here. You can read it if you’re interested, but the summary is that this code was added to support the following scenario.

Let’s say you have an eBPF program that looks something like this (pseudocode):

// The statistics we're interested in tracking
enum EStatistics
{
	EStatistics_Duration,
	// ...
};

// Record various EStatistics for context switches. Equivalent to std::unordered_map<EStatistics, std::vector<uint64>>
BPF_MAP_TYPE_HASH_OF_MAPS recordedCSwitchStatistics;

void ContextSwitchHandler()
{
	__u64 start = bpf_ktime_get_ns();

	// ... perform potentially expensive work here ...

	__u64 duration = bpf_ktime_get_ns() - start;

	// find the inner array for the key; equivalent to std::vector<uint64>
	int key = EStatistics_Duration;
	void* durationStatistics = bpf_map_lookup_elem(&recordedCSwitchStatistics, &key);

	// add the duration of this event to the array; equivalent to durationStatistics.push_back(duration)
	bpf_map_update_elem(durationStatistics, nextIndex++, duration);
}

So this is an eBPF program that runs on every context switch. It does some work to handle the context switch, and it wants to report how long it took back to userspace. To do so, there is a BPF_MAP_TYPE_HASH_OF_MAPS containing statistics. In this case there’s just EStatistics_Duration, but there could be others.

On every run of this program, it records the start & end timestamps of the work it’s doing to calculate the duration. Then it adds that duration to the statistics map. The inner map in this case is a list of all individual durations.

Now, the goal here is for the userspace controlling program to periodically read out the statistics that have been logged so far. Again in pseudocode, this could look like this:

void readStatisticsFromEBPF()
{
	// get the current inner array with the statistics
	int key = EStatistics_Duration;
	void* currentDurationStatistics = bpf_map_lookup_elem(&recordedCSwitchStatistics, &key);

	// do something with the statistics
}

The problem is that there’s now unsynchronized concurrent access to currentDurationStatistics: while userspace is reading the values from the map, the eBPF program can still be writing statistics to it. For this inner map type (BPF_MAP_TYPE_ARRAY), concurrent reads and writes aren’t automatically synchronized: it’s essentially shared memory without built-in locking. This is a race because userspace could read a partially updated array or read while eBPF is writing to it, leading to inconsistent data.

We can attempt to solve this by having two arrays: one that userspace is reading from, and one that eBPF is writing to, essentially double buffering:

void readStatisticsFromEBPF()
{
	// get the current inner array with the statistics
	int key = EStatistics_Duration;
	void* oldDurationStatistics = bpf_map_lookup_elem(&recordedCSwitchStatistics, &key);

	// replace (swap) the array in the map with a new one so that eBPF starts writing to that one
	void* newDurationStatistics = create_array(1024);
	bpf_map_update_elem(&recordedCSwitchStatistics, &key, newDurationStatistics);	

	// do something with the statistics
}

This almost works, but the problem is that bpf_map_update_elem is not atomic: as we saw before, it updates the value for the key (in this case EStatistics_Duration) and then returns before all readers have finished. This means that after it returns, there may still be eBPF programs running that are making use of oldDurationStatistics.

So this is still a race, and it is this race that the commit fixes: with the added synchronize_rcu call, bpf_map_update_elem is effectively atomic for map-in-map types. After it returns, it is guaranteed that the old value of the key (in this case oldDurationStatistics) is no longer in use by any eBPF programs, so it’s safe to do with it whatever you want.

Reading the discussion, before ending up at the final commit, the patch went through several iterations.

It started out as a new BPF_SYNCHRONIZE_MAP_TO_MAP_REFERENCES command (syscall) in eBPF that could be issued from userspace as an explicit synchronization point where needed. The maintainers felt that this was exposing too many eBPF implementation details to userspace, and that it would be hard for users to understand exactly what the new command does and when it should be used.

Instead, they suggested just always doing this sync in bpf_map_update_elem for map-in-map types:

I believe the only issue being discussed is user space doesn’t know when it’s ok to start draining the inner map when it was replaced by bpf_map_update syscall command with another map, right? If we agree on that, should bpf_map_update handle it then? Wouldn’t it be much easier to understand and use from user pov?

The original submitter responded that it didn’t seem right to force this synchronization on all users, given the relatively niche usecase:

Maybe with a new BPF_SYNCHRONIZE flag for BPF_MAP_UPDATE_ELEM and BPF_MAP_DELETE_ELEM. Otherwise, it seems wrong to make every user of these commands pay for synchronization that only a few will need.

The maintainers still felt that it would be a good idea, as the cost was anticipated to be small:

I don’t think extra flag is needed. Extra sync_rcu() for map-in-map is useful for all users. I would consider it a bugfix, since users that examine deleted map have this race today and removing the race is always a good thing especially since the cost is small.

As we’ve seen, however, the cost of this is far from small, but that’s hindsight for you.

Optimizing it

Now that we thoroughly understand the code and problem, we can start thinking about ways to resolve it. Let’s consider our options, starting from the most direct approach.

The most obvious fix would be to remove this sync point from bpf_map_update_elem for map-in-map types and make it an optional sync via an opt-in flag instead, as originally suggested on the mailing list. Unfortunately, this behavior has been in the kernel since 2018. That makes it impossible to change, since any modifications might break existing programs that (perhaps unknowingly) depend on this behavior 1, and as we all know, “WE DO NOT BREAK USERSPACE” 2. So that’s not a real option.

The next most obvious fix would be to make use of batched eBPF map updates. Right now, the problem is that we’re uploading the unwind data for each binary individually using separate bpf_map_update_elem calls, which means we’re hitting this sync point for each upload. The eBPF API also offers bpf_map_update_batch (available since kernel 5.6), which can update multiple elements in a single call. Using this function would mean the sync point is hit only once per batch.

For the precache step this would be a perfect fit. We know up front how many binaries we need to upload, so we can quite simply divide them into batches, which are then all uploaded at the same time. This might still hit the sync point across multiple threads as before, but due to the batching, the number of sync points is much lower. For example, with a batch size of 100 we would only hit the sync point 14 times instead of roughly 1400 times (once per job). That would be a massive improvement.
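
A hedged sketch of what such a batched upload could look like with libbpf is shown below; the helper and its parameters are hypothetical, and for map-in-map types the values are the inner maps’ file descriptors:

#include <bpf/bpf.h>

// Update many slots of the outer array-of-maps in one syscall, so the
// map-in-map sync point is hit once per batch instead of once per element.
int upload_unwind_batch(int outer_map_fd, const __u32 *keys,
                        const int *inner_map_fds, __u32 count)
{
	LIBBPF_OPTS(bpf_map_batch_opts, opts);
	__u32 n = count; // on return, the number of elements actually updated

	return bpf_map_update_batch(outer_map_fd, keys, inner_map_fds, &n, &opts);
}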

That being said, the precache step is not the only time where we upload unwind data to eBPF. When a program is running, it might load in (many) additional shared libraries. For example, some applications we’ve tested against dynamically load hundreds of shared libraries at startup. When a shared library is loaded, we also need to upload the corresponding unwind data.

In that case we don’t want to batch uploads, because that increases the latency between the time a library is loaded and the time the unwind data is made available for unwinding to eBPF. This means that when the rate of shared library loading is high, you would still run into this perf issue. We needed a more general solution, so let’s see what other options there are.

Opting out

As we saw, in the original discussion on the mailing list, it was suggested that this explicit sync point should be a flag instead of the default behavior. The patch went the other way, but now that it’s the default, we can also consider adding an opt-out flag to the eBPF API to disable this behavior for cases (like ours) where you know that this is not the behavior you want.

Adding such an opt-out flag is exactly what we suggested on the eBPF kernel mailing list. The discussion around this was productive, initially leaning towards acceptance. But then somebody asked whether modifying the kernel to use synchronize_rcu_expedited instead of synchronize_rcu in this case made any difference to performance.

We weren’t aware of that function beforehand, but reading up on it, synchronize_rcu_expedited is a version of synchronize_rcu that’s supposed to reach the quiescent state of the system much faster. It was a good suggestion to at least try out, since it would be a less invasive change than adding an entirely new userspace flag would be. If this suggestion worked, it would mean the performance of bpf_map_update_elem would just transparently improve for all users, without needing to be aware of a new flag.
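
In code, the tested change amounts to something like the following sketch of maybe_wait_bpf_programs (the patch that was eventually accepted may differ in detail):

static void maybe_wait_bpf_programs(struct bpf_map *map)
{
	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
		synchronize_rcu_expedited(); /* expedited grace period: much shorter wait */
}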

This required compiling our own kernel, which took some doing, but we were able to test this change when we got that running. Did it make a difference? See for yourself, and note that this screenshot is taken at the same zoom level as the original:

It makes a huge difference. The precache step now takes ~26ms instead of the ~830ms it previously did, or 31x faster. The function statistics for bpf_map_update_elem show that the average time in this function is now 59 microseconds instead of the 18ms it was before, or 305x faster, for a total of 80ms across the same 31 threads. That is much more reasonable compared to where we started.

While adding an opt-out flag would get this down even further, at this point we felt it was not worth adding that flag anymore, given the other concerns around exposing a new userspace flag.

Why wasn’t this found before

It’s interesting to think about why this bottleneck wasn’t found before, given that this code was introduced in 2018.

When you read articles about profiling on Linux, you’ll often encounter the terms “on-cpu” vs “off-cpu” profiling. On-cpu analysis involves figuring out what the code that’s actually running is doing, and is typically what a sampling profiler covers. Off-cpu analysis, in contrast, is about figuring out what threads that aren’t currently running are doing, i.e. investigating what they’re waiting on (a lock, network, etc.).

These two kinds of analyses are often described as things you look at separately, with “on-cpu” being seen as the thing you look at primarily, and “off-cpu” as something you look at occasionally when you need to. This is reflected in the defaults of tools such as perf: when you record using a default command line such as perf record -o ./perf.data --call-graph dwarf --all-cpus, only sampling data (i.e. “on-cpu”) will be recorded. It is possible to perform off-cpu analysis with perf, but it requires being aware of the difference, and of the specific command-line arguments needed to enable it.

In contrast, in Superluminal we take the view that the distinction between the two is irrelevant: when profiling you’re always interested in where your time is going. It doesn’t matter whether your program is spending its time actively executing code (on-cpu) or whether it’s waiting for something (off-cpu). Both things contribute to the total time taken by your program and in today’s multi-core world, off-cpu analysis is as important as on-cpu analysis to understand the performance of software. We therefore always collect both on-cpu and off-cpu data by default to give you the complete picture.

This article hopefully demonstrates why: the bottleneck we found went undetected for 8 years because most performance analysis on Linux is done using purely sampling profilers. In a sampling profiler this bottleneck is invisible, because the root problem is that bpf_map_update_elem enters a wait state via synchronize_rcu, so it’s not executing any code. As a test, now that we know what the issue is, we tried using perf in sampling-only mode to find the same bottleneck, and as expected, perf reported bpf_map_update_elem as taking almost no time at all.

An instrumenting profiler would have done slightly better: if you’d thought to mark up bpf_map_update_elem (which you most likely wouldn’t have), you’d at least be able to see that the function had high wall-clock time. But it wouldn’t be able to give you any information about why the function takes so long, since you can only instrument your own code, and not the kernel itself.

Because Superluminal shows both sampling and wait information on a wall-clock timeline with full kernel visibility, however, the bottleneck was immediately obvious, which allowed us to find and fix it.

Wrapping up

What started out as a regular profiling session of our own code ended up as a trip down the kernel rabbit hole, where we discovered and fixed an 8-year-old bottleneck affecting all eBPF map-in-map users. bpf_map_update_elem is now much faster for these map types, resulting in a 31x speedup of capture startup time on our end.

We submitted a patch with this change, which was accepted and will be shipped in the Linux 6.19 kernel update. If you’re using BPF_MAP_TYPE_ARRAY_OF_MAPS or BPF_MAP_TYPE_HASH_OF_MAPS in eBPF, your program will transparently get much faster from 6.19.

So! I guess we’re kernel contributors now.

  1. Hyrum’s law : with a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.

  2. This is from the kernel’s point of view. On Linux, the job of breaking userspace is left to glibc instead, which is more than happy to do so. But that’s another story .

Some surprising things about DuckDuckGo you probably don't know

Hacker News
gabrielweinberg.com
2025-12-13 21:57:28
Comments...
Original Article
  1. We have hundreds of easter-egg logos (featuring our friendly mascot Dax Brown) that surface when you make certain queries on our search engine . Our subreddit is trying to catch ‘em all . They’ve certainly caught a lot, currently 504, but we keep adding more so it’s a moving target. The total as of this post is 594. I’m the one personally adding them in my spare time just for fun and I recently did a Duck Tales episode (our new podcast ) with more details on the process. This incarnation of specialty logos is relatively new, so if you are a long-term user and haven’t noticed them, that’s probably why (aside from of course that you’d have to search one of these queries and notice the subtle change in logo). And, no promises, but I am taking requests.

  1. There is a rumor continuously circulating that we’re owned by Google, which of course couldn’t be farther from the truth . I was actually a witness in the U.S. v. Google trial for the DOJ. I think this rumor started because Google used to own the domain duck.com and was pointing it at Google search for several years. After my public and private complaining for those same years, in 2018 we finally convinced Google to give us the duck.com domain , which we now use for our email protection service, but the rumor still persists.

  2. We’ve been blocked in China since 2014 , and are on-and-off blocked in several other countries too like Indonesia and India because we don’t censor search results .

  3. We’ve been an independent company since our founding in 2008 and been working on our own search indexes for as many years. For over fifteen years now (that whole time) we’ve been doing our own knowledge graph index (like answers from Wikipedia), over ten years for local and other instant-answer indexes (like businesses), and in the past few years we’ve been ramping up our wider web index to support our Search Assist and Duck.ai features. DuckDuckGo began with me crawling the web in my basement, and in the early days, the FBI actually showed up at my front door since I had crawled one of their honeypots .

  4. The plurality of our search traffic now comes from our own browsers. Yes, we have our own browsers with our search engine built in along with a ton of other protections. How do they compare to other popular browsers and extensions, you ask? We made a comparison page so you can see the differences. Our mobile browsers on iOS & Android launched back in 2018 (wow, that’s seven years ago), and our desktop browsers on Mac and Windows in 2022/23. Our iOS browser market share continues to climb and we’re now #3 in the U.S. (behind Safari and Chrome) and #4 on Android (behind Chrome, Samsung, and Firefox). People appreciate all the protections and the front-and-center (now customizable) fire button that quickly clears tabs and data in an (also customizable) animation of fire.

  5. About 13% of U.S. adults self-report as a “current user” of DuckDuckGo. That’s way more than most people think. Our search market share is lower since all of those users don’t use us on all of their devices, especially on Android where Google makes it especially hard. Once you realize that then it is less surprising that we have the highest search market share on Mac at about 4% in the U.S., followed by iOS at about 3%. I’m talking about the U.S. here since about 44% of our searches are from the U.S., and no other country is double digits, but rounding out the top ten countries are Germany, the United Kingdom, France, Canada, India, the Netherlands, Indonesia, Australia, and Japan.

  6. Our approach to AI differs from most other companies trying to shove it down your throat in that we are dedicated to making all AI features private, useful, and optional . If you like AI, we offer private AI search answers at duckduckgo.com and private chat at duck.ai , which are built-into our browsers . If you don’t like or don’t want AI, that’s cool with us too. You can easily turn all of these features off. In fact, we made a noai.duckduckgo.com search domain that automatically sets those settings for you, including a recent setting we added that allows you to hide many AI-generated images within image search. Another related thing you might find surprising is search traffic has continued to grow steadily even since the rise of ChatGPT (with Duck.ai traffic growing even faster).

  7. If you didn’t know we have a browser, you probably also don’t know we have a DuckDuckGo Subscription (launched last year), that includes our VPN , more advanced AI models in Duck.ai, and in the U.S., Personal Information Removal and Identity Theft Restoration . It’s now available in 30 countries with a similar VPN footprint and our VPN is run by us (see latest security audit and free trials ).

  8. Speaking of lots of countries, our team has been completely distributed from the beginning, now at over 300 across about 30 countries as well, with less than half in the U.S. And we’re still hiring . We have a unique work culture that, among other things, avoids standing meetings on Wednesdays and Thursdays. We get the whole company together for a week once a year.

  9. We played a critical role in the Global Privacy Control standard and the creation of search preference menus . I have a graduate degree in Technology and Public Policy and so we’ve done more of this kind of thing than one might expect, even going so far to draft our own Do Not Track legislation before we got GPC going. We also donate yearly to like-minded organizations ( here’s our 2025 announcement ), with our cumulative donations now at over $8 million. Check our donations page for details going back to 2011. We can do this since we’ve been profitable for about that long, and more recently have even started investing in related startups as well.

If this hodge-podge of stuff makes you think of anything, please let me know. I’m not only taking requests for easter-egg logo ideas, but also for stuff to write about.

Share

Recovering Anthony Bourdain's (really) lost Li.st's

Hacker News
sandyuraz.com
2025-12-13 21:18:01
Comments...
Original Article

🍇 At least 1 day ago

Loved reading through GReg TeChnoLogY’s Anthony Bourdain’s Lost Li.st’s, and seeing the list of lost Anthony Bourdain li.st’s made me think about whether we could recover at least some of them.

Having worked in the security and crawling space for the majority of my career—I don’t have the access nor the permission to use the proprietary storages—I thought we might be able to find something in publicly available crawl archives.

Common Crawl

If the Internet Archive had the partial list that Greg published, what about the Common Crawl? Reading through their documentation, it seems straightforward enough to get a prefix index for Tony’s lists and grep for any sub-paths.

Putting something up with the help of Claude to prove my theory, we have commoncrawl_search.py, which makes a single index request to a specific dataset and, if any hits are discovered, retrieves them from the public S3 bucket—since they are small, straight-up HTML documents, this seemed even more feasible than I had initially thought.

Simply have a Python version around 3.14.2 and install the dependencies from requirements.txt. Run the below and we are in business. Below, you’ll find the command I ran and then some manual archeological effort to prettify the findings.

NOTE

Images have been lost . Other avenues had struck no luck. I’ll try again later.

Any and all emphasis, missing punctuation, and cool grammar are all by Anthony Bourdain. The only modifications I have made are to the layout, to represent li.st as closely as possible with no changes to the content.

NOTE

If you see these blocks, that’s me commenting if pictures have been lost.

Recovering what we lost

From Greg’s page, let’s go and try each entry one by one. I’ll put up a table of what I wasn’t able to find in Common Crawl but would assume exists elsewhere—I’d be happy to take another look. And no, none of the above has been written by AI, only the code, since I don’t really care about warcio encoding or writing the same python requests method for the Nth time. Enjoy!

Things I No Longer Have Time or Patience For

  1. Cocaine

  2. True Detective

  3. Scripps Howard

  4. Dinners where it takes the waiter longer to describe my food than it takes me to eat it.

  5. Beer nerds

Nice Views

I admit it: my life doesn’t suck. Some recent views I’ve enjoyed

  1. Montana at sunset : There’s pheasant cooking behind the camera somewhere. To the best of my recollection some very nice bourbon. And it IS a big sky .

  2. Puerto Rico: Thank you Jose Andres for inviting me to this beautiful beach!

  3. Naxos: drinking ouzo and looking at this. Not a bad day at the office .

  4. LA: My chosen final resting place . Exact coordinates .

  5. Istanbul: raki and grilled lamb and this ..

  6. Borneo: The air is thick with hints of durian, sambal, coconut..

  7. Chicago: up early to go train #Redzovic

If I Were Trapped on a Desert Island With Only Three Tv Series

  1. The Wire

  2. Tinker, Tailor, Soldier, Spy (and its sequel : Smiley’s People)

  3. Edge of Darkness (with Bob Peck and Joe Don Baker )

The Film Nobody Ever Made

Dreamcasting across time with the living and the dead, this untitled, yet to be written masterwork of cinema, shot, no doubt, by Christopher Doyle, lives only in my imagination.

  1. This guy

  2. And this guy

  3. All great films need:

  4. The Oscar goes to..

  5. And

NOTE

Sorry, each item had a picture attached, they’re gone.

I Want Them Back

If you bought these vinyls from an emaciated looking dude with an eager, somewhat distracted expression on his face somewhere on upper Broadway sometime in the mid 80’s, that was me . I’d like them back. In a sentimental mood.

NOTE

There were 11 images here.

Objects of Desire

material things I feel a strange, possibly unnatural attraction to and will buy (if I can) if I stumble across them in my travels. I am not a paid spokesperson for any of this stuff .

  1. Vintage Persol sunglasses : This is pretty obvious. I wear them a lot. I collect them when I can. Even my production team have taken to wearing them.

  2. 19th century trepanning instruments: I don’t know what explains my fascination with these devices, designed to drill drain-sized holes into the skull often for purposes of relieving "pressure" or "bad humours". But I can’t get enough of them. Tip: don’t get a prolonged headache around me and ask if I have anything for it. I do.

  3. Montagnard bracelets: I only have one of these but the few that find their way onto the market have so much history. Often given to the indigenous mountain people ’s Special Forces advisors during the very early days of America’s involvement in Vietnam .

  4. Jiu Jitsi Gi’s: Yeah. When it comes to high end BJJ wear, I am a total whore. You know those people who collect limited edition Nikes ? I’m like that but with Shoyoroll . In my defense, I don’t keep them in plastic bags in a display case. I wear that shit.

  5. Voiture: You know those old school, silver plated (or solid silver) blimp like carts they roll out into the dining room to carve and serve your roast? No. Probably not. So few places do that anymore. House of Prime Rib does it. Danny Bowein does it at Mission Chinese. I don’t have one of these. And I likely never will. But I can dream.

  6. Kramer knives: I don’t own one. I can’t afford one . And I’d likely have to wait for years even if I could afford one. There’s a long waiting list for these individually hand crafted beauties. But I want one. Badly. http://www.kramerknives.com/gallery/

  7. R. CRUMB : All of it. The collected works. These Taschen volumes to start. I wanted to draw brilliant, beautiful, filthy comix like Crumb until I was 13 or 14 and it became clear that I just didn’t have that kind of talent. As a responsible father of an 8 year old girl, I just can’t have this stuff in the house. Too dark, hateful, twisted. Sigh...

  8. THE MAGNIFICENT AMBERSONS : THE UNCUT, ORIGINAL ORSON WELLES VERSION: It doesn’t exist. Which is why I want it. The Holy Grail for film nerds, Welles’ follow up to CITIZEN KANE shoulda, coulda been an even greater masterpiece . But the studio butchered it and re-shot a bullshit ending. I want the original. I also want a magical pony.

NOTE

Each bulleted point had an image too.

Four Spy Novels by Real Spies and One Not by a Spy

I like good spy novels. I prefer them to be realistic . I prefer them to be written by real spies. If the main character carries a gun, I’m already losing interest. Spy novels should be about betrayal.

  1. Ashenden–Somerset Maugham
    Somerset wrote this bleak, darkly funny, deeply cynical novel in the early part of the 20th century. It was apparently close enough to the reality of his espionage career that MI6 insisted on major excisions. Remarkably ahead of its time in its atmosphere of futility and betrayal.

  2. The Man Who Lost the War–WT Tyler
    WT Tyler is a pseudonym for a former "foreign service" officer who could really really write. This one takes place in post-war Berlin and elsewhere and was, in my opinion, wildly under appreciated. See also his Ants of God.

  3. The Human Factor–Graham Greene
    Was Greene thinking of his old colleague Kim Philby when he wrote this? Maybe. Probably. See also Our Man In Havana.

  4. The Tears of Autumn -Charles McCarry
    A clever take on the JFK assassination with a Vietnamese angle. See also The Miernik Dossier and The Last Supper

  5. Agents of Innocence–David Ignatius
    Ignatius is a journalist not a spook, but this one, set in Beirut, hewed all too closely to still not officially acknowledged events. Great stuff.

Hotel Slut (That’s Me)

I wake up in a lot of hotels, so I am fiercely loyal to the ones I love. A hotel where I know immediately wher I am when I open my eyes in the morning is a rare joy. Here are some of my favorites

  1. CHATEAU MARMONT ( LA) : if I have to die in a hotel room, let it be here. I will work in LA just to stay at the Chateau.

  2. CHILTERN FIREHOUSE (London): Same owner as the Chateau. An amazing Victorian firehouse turned hotel. Pretty much perfection

  3. THE RALEIGH (Miami): The pool. The pool!

  4. LE CONTINENTAL (Saigon): For the history.

  5. HOTEL OLOFSSON (Port au Prince): Sagging, creaky and leaky but awesome .

  6. PARK HYATT (Tokyo): Because I’m a film geek.

  7. EDGEWATER INN (Seattle): kind of a lumber theme going on...ships slide right by your window. And the Led Zep "Mudshark incident".

  8. THE METROPOLE (Hanoi): there’s a theme developing: if Graham Greene stayed at a hotel, chances are I will too.

  9. GRAND HOTEL D'ANGKOR (Siem Reap): I’m a sucker for grand, colonial era hotels in Asia.

  10. THE MURRAY (Livingston,Montana): You want the Peckinpah suite

Steaming Hot Porn

from my phone

  1. Bun Bo Hue

  2. Kuching Laksa

  3. Pot au Feu

  4. Jamon

  5. Linguine

  6. Meat

  7. Dessert

  8. Light Lunch

  9. Meat on a Stick

  10. Oily Little Fish

  11. Snack

  12. Soup

  13. Homage

NOTE

Pictures in each have not been recovered.

5 Photos on My Phone, Chosen at Random

Not TOO random

  1. Madeline

  2. Beirut

  3. Musubi

  4. BudaeJiggae

  5. Dinner

NOTE

Shame, indeed, no pictures, there was one for each.

People I’d Like to Be for a Day

  1. Bootsy Collins

  2. Bill Murray

I’m Hungry and Would Be Very Happy to Eat Any of This Right Now

  1. Spaghetti a la bottarga . I would really, really like some of this. Al dente, lots of chili flakes

  2. A big, greasy double cheeseburger. No lettuce. No tomato. Potato bun.

  3. A street fair sausage and pepper hero would be nice. Though shitting like a mink is an inevitable and near immediate outcome

  4. Some uni. Fuck it. I’ll smear it on an English muffin at this point.

  5. I wonder if that cheese is still good?

Observations From a Beach

In which my Greek idyll is Suddenly invaded by professional nudists

  1. Endemic FUPA. Apparently a prerequisite for joining this outfit.

  2. Pistachio dick

  3. 70’s bush

  4. T-shirt and no pants. Leading one to the obvious question : why bother?

Guilty Pleasures

  1. Popeye’s Mac and Cheese

  2. The cheesy crust on the side of the bowl of Onion Soup Gratinee

  3. Macaroons . Not macarons . Macaroons

  4. Captain Crunch

  5. Double Double Animal Style

  6. Spam Musubi

  7. Aerosmith

Some New York Sandwiches

Before he died, Warren Zevon dropped this wisdom bomb: "Enjoy every sandwich". These are a few locals I’ve particularly enjoyed:

  1. PASTRAMI QUEEN: (1125 Lexington Ave. ) Pastrami Sandwich. Also the turkey with Russian dressing is not bad. Also the brisket.

  2. EISENBERG'S SANDWICH SHOP: ( 174 5th Ave.) Tuna salad on white with lettuce. I’d suggest drinking a lime Rickey or an Arnold Palmer with that.

  3. THE JOHN DORY OYSTER BAR: (1196 Broadway) the Carta di Musica with Bottarga and Chili is amazing. Is it a sandwich? Yes. Yes it is.

  4. RANDOM STREET FAIRS: (Anywhere tube socks and stale spices are sold. ) New York street fairs suck. The same dreary vendors, same bad food. But those nasty sausage and pepper hero sandwiches are a siren song, luring me, always towards the rocks. Shitting like a mink almost immediately after is guaranteed but who cares?

  5. BARNEY GREENGRASS : ( 541 Amsterdam Ave.) Chopped Liver on rye. The best chopped liver in NYC.

Great Dead Bars of New York

A work in progress

  1. SIBERIA in any of its iterations. The one on the subway being the best

  2. LADY ANNES FULL MOON SALOON a bar so nasty I’d bring out of town visitors there just to scare them

  3. THE LION'S HEAD old school newspaper hang out

  4. KELLY'S on 43rd and Lex. Notable for 25 cent drafts and regularly and reliably serving me when I was 15

  5. THE TERMINAL BAR legendary dive across from port authority

  6. BILLY'S TOPLESS (later, Billy’s Stopless) an atmospheric, working class place, perfect for late afternoon drinking where nobody hustled you for money and everybody knew everybody. Great all-hair metal jukebox . Naked breasts were not really the point.

  7. THE BAR AT HAWAII KAI. tucked away in a giant tiki themed nightclub in Times Square with a midget doorman and a floor show. Best place to drop acid EVER.

  8. THE NURSERY after hours bar decorated like a pediatrician’s office. Only the nursery rhyme characters were punk rockers of the day.

Lost page

It was surprising to see that only one page was not recoverable from the common crawl.

What’s next?

I’ve enjoyed this little project tremendously—a little archeology project. Can we declare victory for at least this endeavor? Hopefully we will be able to find the images too, but that’s a little tougher, since that era’s CloudFront is fully gone.

What else can we work on restoring, and can we set up some sort of public archive to store it? I made this a git repository for the sole purpose of letting anyone interested contribute their interest and passion for these kinds of projects.

Thank you and until next time! ◼︎

Workday project at Washington University hits $266M

Hacker News
www.theregister.com
2025-12-13 20:58:50
Comments...
Original Article

The total cost of a Workday implementation project at Washington University in St. Louis is set to hit almost $266 million, it was revealed after the project was the subject of protests from students.

In late October, students demonstrated outside the Faculty Senate demanding the University’s leadership reveal more details about its finances, including its spending on Workday, amid concerns about job losses at the institution.

In an email to Student Life , the institution’s independent student newspaper, David Gray, executive vice chancellor for finance and chief financial officer (CFO), said the total cost of the project was set to reach upwards of $265 million over at least seven years, roughly $16,000 per student.

The student newspaper said the Workday project was broken down into $81 million for financial and human resources services (HCM), $98.9 million for the student application called Sunrise, and $56.5 million for planning, data integration, and financial aid. Meanwhile $23.8 million in the 2026 financial year is for support and $5.7 million for annual licensing.

The project started with HCM in 2018, which went live in 2021. The student application started planning in 2020 and went live in 2024 and 2025.

“The legacy student information system was in its last phase of life. It was a 1990s era set of fragile, homegrown applications including WebSTAC, WebFAC, SIS Admin and other platforms. With the transition, the University replaced nearly 80 separate student systems with Workday,” Gray told the newspaper.

We contacted both the University and Workday for comment and will update this article if we hear back.

Washington University in St. Louis is a private research university in Missouri. It is not to be confused with the University of Washington, a public university in Washington State.

Coincidentally, the latter has also implemented Workday in a project which similarly attracted criticism. In March last year, hundreds of research grants were stuck in processing limbo, as the institution grappled with the $340 million implementation.

The US West Coast university spent more than five years shifting to a centralized cloud-based SaaS finance and HR system. At the time, it said it had made significant progress with its workstreams, but there was still more to do.

In late 2024, Workday CEO Carl Eschenbach told The Register that more than 90 percent of the SaaS HR and finance application vendor's rollouts were a success, putting aside the company's high-profile difficulties in Maine and Iowa state-level projects. ®

Purdue University Approves New AI Requirement for All Undergrads

Hacker News
www.forbes.com
2025-12-13 20:54:32
Comments...
Original Article
Purdue University in West Lafayette, IN.

As part of its larger AI strategy, Purdue University will require all undergraduates to demonstrate basic AI competency, beginning next year.

getty

Purdue University will begin requiring that all of its undergraduate students demonstrate basic competency in artificial intelligence starting with freshmen who enter the university in 2026.

The new “AI working competency” graduation requirement was approved by the university’s Board of Trustees at its meeting on December 12. It’s part of a broader AI@Purdue strategy that spans five areas: Learning with AI, Learning about AI, Research AI, Using AI and Partnering in AI.

“The reach and pace of AI’s impact to society, including many dimensions of higher education, means that we at Purdue must lean in and lean forward and do so across different functions at the university,” said Purdue President Mung Chiang in a news release. “AI@Purdue strategic actions are part of the Purdue Computes strategic initiative, and will continue to be refreshed to advance the missions and impact of our university.”

The requirement will be embedded into every undergraduate program at Purdue, but it won’t be done in a “one-size-fits-all” manner. Instead, the Board is delegating authority to the provost, who will work with the deans of all the academic colleges to develop discipline-specific criteria and proficiency standards for the new campus-wide requirement. Chiang said students will have to demonstrate a working competence through projects that are tailored to the goals of individual programs. The intent is to not require students to take more credit hours, but to integrate the new AI expectation into existing academic requirements.

Although the requirement doesn’t officially kick in until next fall, some of the underlying educational resources and innovations will be made available to currently enrolled students as soon as the spring semester.

While the news release claimed that Purdue may be the first school to establish such a requirement, at least one other university has introduced its own institution-wide expectation that all its graduates acquire basic AI skills. Earlier this year, The Ohio State University launched an AI Fluency initiative , infusing basic AI education into core undergraduate requirements and majors, with the goal of helping students understand and use AI tools— no matter their major.

Purdue wants its new initiative to help graduates:

  • Understand and use the latest AI tools effectively in their chosen fields, including being able to identify the key strengths and limits of AI technologies;
  • Recognize and communicate clearly about AI, including developing and defending decisions informed by AI, as well as recognizing the influence and consequences of AI in decision-making;
  • Adapt to and work with future AI developments effectively.

Purdue Provost Patrick Wolfe said that it was “absolutely imperative that a requirement like this is well informed by continual input from industry partners and employers more broadly,” and therefore he has “asked that each of our academic colleges establishes a standing industry advisory board focusing on employers’ AI competency needs and that these boards are used to help ensure a continual, annual refresh of our AI curriculum and requirements to ensure that we keep our discipline-specific criteria continually current.”

Purdue already has BA and BS degree programs in AI, and it offers a Masters of Science in Artificial Intelligence as well. Recently, it has taken major steps to develop its AI research capacity in areas such as agriculture and food systems, manufacturing, transportation and logistics, and health sciences, and it has equipped faculty and staff with additional AI resources like Microsoft 365 Copilot .

In November, Purdue and Google announced plans to strengthen their educational and research partnership, and the university has collaborated with Apple to launch a Spatial Computing Hub on campus. You can learn more about Purdue's overall AI resources and strategy here.

As nearly every business sector adopts artificial intelligence into its core operations, creating a growing demand for workers with basic AI skills, look for more colleges and universities to place a new emphasis on how best to educate students about artificial intelligence tools. New AI majors and minors are being introduced, interdisciplinary AI centers are being formed, and faculty and students are using AI tools to advance research in a wide range of fields.

Not too long ago, colleges’ main concern about AI was how to prevent students from using it to cheat on assignments, short-changing their learning in the process. Now, that apprehension is being replaced by a new priority — preparing students for the demands of a workforce rapidly being transformed by artificial intelligence technologies.

Lucas de Groot, Designer of Calibri, on the State Department’s Switch Back to Times New Roman

Daring Fireball
news.ycombinator.com
2025-12-13 20:38:18
From the LucasFonts account, in a comment on Hacker News: Professional typography can be achieved with both serif and sans-serif fonts. However, Times New Roman — a typeface older than the current president — presents unique challenges. Originally crafted in Great Britain for newspaper printing,...
Original Article

Our studio, LucasFonts, designed Calibri. Here are our CEO Luc(as) de Groot’s thoughts on the matter:

The decision to abandon Calibri on the grounds of it being a so-called “wasteful diversity font” is both amusing and regrettable. Calibri was specifically designed to enhance readability on modern computer screens and was selected by Microsoft in 2007 to replace Times New Roman as the default font in the Office suite. There were sound reasons for moving away from Times: Calibri performs exceptionally well at small sizes and on standard office monitors, whereas serif fonts like Times New Roman tend to appear more distorted. While serif fonts are well-suited to high-resolution displays, such as those found on modern smartphones, on typical office screens the serifs introduce unnecessary visual noise and can be particularly problematic for users with impaired vision, such as older adults.

Professional typography can be achieved with both serif and sans-serif fonts. However, Times New Roman—a typeface older than the current president—presents unique challenges. Originally crafted in Great Britain for newspaper printing, Times was optimised for paper, with each letterform meticulously cut and tested for specific sizes. In the digital era, larger size drawings were repurposed as models, resulting in a typeface that appears too thin and sharp when printed at high quality.

Serif fonts are often perceived as more traditional, but they are also more demanding to use effectively. While a skilled typographer can, in theory, produce excellent results with Times, using it in its default digital form is not considered professional practice.

Calibri, by contrast, incorporates extensive spacing adjustments and language-specific refinements. The digital version of Times New Roman, developed in the early days of computing, offers only minimal kerning and letter-pair adjustments. This is especially evident in words set in all capitals—such as “CHICAGO”—where the spacing is inconsistent: the letters “HIC” are tightly packed, while “CAG” are spaced too far apart. Microsoft cannot rectify these issues without altering the appearance of existing documents.


Faster double-to-string conversion

Lobsters
vitaut.net
2025-12-13 20:00:03
Comments...
Original Article

There comes a time in every software engineer’s life when they come up with a new binary-to-decimal floating-point conversion method. I guess my time has come. I just wrote one, mostly over a weekend: https://github.com/vitaut/zmij .

It incorporates lessons learned from implementing Dragon4 , Grisu and Schubfach along with a few new ideas from myself and others. The main guiding principle is Alexandrescu’s “no work is less work than some work” so a number of improvements come from removing things from Schubfach (conditional branches, computations and even candidate numbers).

Performance

Here is how it performs on dtoa-benchmark :

  • ~68% faster than Dragonbox , the previous leader (at least among algorithms with correctness proofs)
  • ~2 times faster than my Schubfach implementation that closely follows the original paper
  • ~3.5 times faster than std::to_chars from libc++ (Ryu?)
  • ~6.8 times faster than Google’s double-conversion (Grisu3)
  • ~59 times (not percent!) faster than sprintf on macOS (Dragon4?)

Converting a single double takes about 10 to 20 ns on Apple M1.

What are the improvements?

Here is a list of improvements compared to Schubfach:

  • Selection from 1-3 candidates instead of 2-4
  • Fewer integer multiplications in the shorter case
  • Faster logarithm approximations
  • Faster division and modulo
  • Fewer conditional branches
  • More efficient significand and exponent output

Let’s take a look at some of them.

The first small improvement is having a single branch to quickly check for special cases: NaN, infinity, zero or subnormals. There are still additional checks within that path but the common case is more streamlined.
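
As a generic illustration of that kind of combined check (not zmij's actual code; the function name is made up): the biased exponent field is zero for zeros and subnormals and all ones for infinities and NaNs, so a single unsigned comparison covers all four cases.

#include <cstdint>
#include <cstring>

// Returns true for NaN, infinity, zero and subnormals with one branch.
auto is_special(double value) noexcept -> bool {
  uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));     // bit-cast the double
  uint64_t biased_exp = (bits >> 52) & 0x7ff;   // 11-bit exponent field
  // biased_exp == 0     -> zero or subnormal
  // biased_exp == 0x7ff -> infinity or NaN
  // Unsigned wrap-around folds both tests into a single comparison.
  return biased_exp - 1 >= 0x7fe;
}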

Another improvement is using faster fixed-point logarithm approximations. Schubfach does the following:

// log10_2_sig = round(log10(2) * 2**log10_2_exp)
constexpr int64_t log10_2_sig = 661'971'961'083;
constexpr int log10_2_exp = 41;

// Computes floor(log10(pow(2, e))) for e <= 5456721.
auto floor_log10_pow2(int e) noexcept -> int {
  return e * log10_2_sig >> log10_2_exp;
}

which uses 64-bit multiplication:

floor_log10_pow2(int):
        movsxd  rcx, edi
        movabs  rax, 661971961083
        imul    rax, rcx
        sar     rax, 41
        ret

However, for the range of inputs (exponents) we could use 32-bit approximations:

// log10_2_sig = round(log10(2) * 2**log10_2_exp)
constexpr int log10_2_sig = 315'653;
constexpr int log10_2_exp = 20;

auto floor_log10_pow2(int e) noexcept -> int {
  return e * log10_2_sig >> log10_2_exp;
}

resulting in

floor_log10_pow2(int):
        imul    eax, edi, 315653
        sar     eax, 20
        ret

Dragonbox also uses 32-bit approximations with slightly different constants.

Similarly, we can replace some integer divisions with integer multiplications. Compilers already know how to do this, but we can do better when we know that the range of inputs is small:

// Returns {value / 100, value % 100} correct for values of up to 4 digits.
inline auto divmod100(uint32_t value) noexcept -> divmod_result {
  assert(value < 10'000);
  constexpr int exp = 19;  // 19 is faster or equal to 12 even for 3 digits.
  constexpr int sig = (1 << exp) / 100 + 1;
  uint32_t div = (value * sig) >> exp;  // value / 100
  return {div, value - div * 100};
}

Another optimization and simplification is branchless handling of irregular rounding intervals. I wrote about rounding intervals in my earlier blog post, but for the purposes of this post it is sufficient to know that a rounding interval for a floating-point number is an interval that contains all real numbers that round back to that number. Normally the intervals are symmetric, except when there is a jump in the exponent (the irregular case).

Most algorithms handle irregular intervals via a completely separate path or at least some branching. This is not terrible, because irregular cases are rare for random floating-point numbers. However, it is possible to handle it cheaply and branchlessly, avoiding extra complexity, which is what I did.
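
A minimal sketch of the general idea (my own illustration with made-up names, not zmij's actual code): in the irregular case the gap below the value is only a quarter of an ulp instead of half, so with everything scaled by 4 the lower bound can be derived arithmetically from a boolean instead of through a separate code path.

#include <cstdint>

// Scaled lower bound of the rounding interval (everything multiplied by 4).
// Regular case: 4 * bin_sig - 2. Irregular case: 4 * bin_sig - 1.
auto scaled_lower_bound(uint64_t bin_sig, int bin_exp, uint64_t min_sig,
                        int min_exp) noexcept -> uint64_t {
  // 0 or 1, computed without branching.
  uint64_t irregular = (bin_sig == min_sig) & (bin_exp > min_exp);
  return (bin_sig << 2) - 2 + irregular;
}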

A more interesting improvement comes from a talk by Cassio Neri Fast Conversion From Floating Point Numbers . In Schubfach, we look at four candidate numbers. The first two, of which at most one is in the rounding interval, correspond to a larger decimal exponent. The other two, of which at least one is in the rounding interval, correspond to the smaller exponent. Cassio’s insight is that we can directly construct a single candidate from the upper bound in the first case.

This improvement has a nice effect: it allows us to avoid scaling the value itself by a power of 10, because we only need the lower and upper bounds. This saves two 64-bit integer multiplications in the shorter case.

Unfortunately, this does not help in the longer case, but there are improvements to be made there as well. Classic Schubfach first checks whether there is only one candidate from the second set in the rounding interval and returns early in that case. We can combine this check with the closedness check. This seems counterintuitive, because we do more work (sorry, Andrei), but it eliminates a poorly predicted conditional branch and also simplifies the code.

So we go from this:

uint64_t dec_sig_over = dec_sig_under + 1;
bool under_in = lower + bin_sig_lsb <= (dec_sig_under << 2);
bool over_in = (dec_sig_over << 2) + bin_sig_lsb <= upper;
if (under_in != over_in) {
  // Only one of dec_sig_under or dec_sig_over are in the rounding interval.
  return write(buffer, under_in ? dec_sig_under : dec_sig_over, dec_exp);
}

// Both dec_sig_under and dec_sig_over are in the interval - pick the closest.
int cmp = scaled_sig - ((dec_sig_under + dec_sig_over) << 1);
bool under_closer = cmp < 0 || cmp == 0 && (dec_sig_under & 1) == 0;
return write(buffer, under_closer ? dec_sig_under : dec_sig_over, dec_exp);

to this:

// Pick the closest of dec_sig_under and dec_sig_over and check if it's in
// the rounding interval.
int64_t cmp = int64_t(scaled_sig - ((dec_sig_under + dec_sig_over) << 1));
bool under_closer = cmp < 0 || (cmp == 0 && (dec_sig_under & 1) == 0);
bool under_in = (dec_sig_under << 2) >= lower;
write(buffer, (under_closer & under_in) ? dec_sig_under : dec_sig_over,
      dec_exp);

There are also many improvements in significand and exponent output. The simplest one, which has been used for many years in {fmt} and which I learned from Alexandrescu’s talk “Three Optimization Tips for C++”, is using a lookup table to output pairs of decimal digits. This alone halves the number of integer multiplications and is particularly important here, because the significand is often 16–17 digits long.
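
For illustration, here is a generic version of that technique (not zmij's actual output routine): a table stores the two characters of every value from 00 to 99, so each loop iteration emits two digits per division by 100.

#include <cstdint>
#include <cstring>

static const char digits2[] =
    "0001020304050607080910111213141516171819"
    "2021222324252627282930313233343536373839"
    "4041424344454647484950515253545556575859"
    "6061626364656667686970717273747576777879"
    "8081828384858687888990919293949596979899";

// Writes the decimal digits of value into buf and returns a pointer past the
// last digit written; buf must have room for up to 20 characters.
auto write_decimal(char* buf, uint64_t value) noexcept -> char* {
  char tmp[20];
  char* end = tmp + sizeof(tmp);
  char* p = end;
  while (value >= 100) {
    unsigned index = unsigned(value % 100) * 2;
    value /= 100;
    *--p = digits2[index + 1];
    *--p = digits2[index];
  }
  if (value < 10) {
    *--p = char('0' + value);
  } else {
    unsigned index = unsigned(value) * 2;
    *--p = digits2[index + 1];
    *--p = digits2[index];
  }
  std::memcpy(buf, p, end - p);
  return buf + (end - p);
}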

Another trick is branchless removal of trailing zeros using another small lookup table, which I believe comes from the excellent Drachennest project by Alexander Bolz. There are ideas for improving this further and potentially getting rid of the lookup table entirely.

Is this a new algorithm?

Does it deserve to be called a new algorithm, or is it just an optimization of Schubfach?

I am not sure, but at the very least it deserves a separate GitHub project =).

Where this can be used

This method, or at least some elements of it, will be used in {fmt}, and it is also a good candidate for JSON serialization in Thrift and elsewhere. If you have other applications that could benefit from faster floating-point formatting, feel free to check it out now, or wait until it is integrated into {fmt}.

Thanks to my ISO C++ paper P2587 “to_string or not to_string”, std::to_string will also be able to use this or a similar method. This will make this standard API both performant and actually useful.

Current limitations

Despite the name, the implementation is not fully polished yet. In particular, it currently supports only exponential, also known as scientific, format, although adding fixed format should be straightforward.

“Fun” fact

My former colleague David Gay wrote an early dtoa implementation back at Bell Labs, and it was widely used for many years.


Last modified on 2025-12-13

Go is portable, until it isn't

Lobsters
simpleobservability.com
2025-12-13 19:53:23
Comments...
Original Article

We thought Go would give us a single, portable agent binary for every Linux distro. Turns out… not exactly. But also, kind of yes.

This post kicks off a series about the traps we fell into while building a cross-platform server monitoring agent.

First, some theory. simob is our open source server monitoring agent that powers the Simple Observability platform. We like to think of it as a passive sensor, not a long-running program or daemon. Because in the real world a passive sensor does not come with a long list of requirements. It's small, self-contained and can fit inside the existing system. That is the same goal we have for simob: a lightweight standalone binary with no prerequisites or external dependencies.

The same idea also applies to how we wanted to ship it. We wanted a project that you can compile from source on your development machine and run anywhere across your infrastructure. No complicated pipelines. No third party build services. Just a simple build that produces a portable binary.

Why we chose Go

In the observability world, if you're building an agent for metrics and logs, you're probably writing it in Go. Promtail , Telegraf , Grafana Alloy and many others are all written in Go.

And there are good reasons for that. First it’s compiled. A whole class of runtime errors gets caught before you even run the binary.

Then there is the garbage collector. For something that’s constantly ingesting and forwarding data, not having to manage memory is a massive advantage.

Goroutines are also an excellent abstraction. We knew our agent would need to manage a lot of parallel tasks: tailing log files, reading from input plugins, and sending data upstream. We could write clear, sequential-looking code for each task and let the runtime handle the concurrency.

And of course, we chose it because we thought we could compile it for any platform. "Just set GOOS and GOARCH at compile time and you're done."

The simple stuff

Most of the early work was simple. The Go ecosystem is more than a decade old and very rich. For core metrics collection we relied on gopsutil, a Go port of Python’s psutil . It gives you CPU, memory, network and disk metrics with a pretty clean API. It supports a wide range of operating systems and CPU architectures, removing the need for system specific code that we would otherwise have to write ourselves.

When it starts getting hard: the case of the journal collector

Things became more complex once users asked for systemd journal log support. Journal logs are not stored in plain text. They use a binary format and live in /var/log/journal or /run/log/journal (depending on whether persistent logging is enabled). The format is structured, indexed and can include inline compression.

We had two options. The first was to write our own parser. The file format is documented and the systemd source is available.

Tools like Kaitai Struct could help us generate the parser code. It was not impossible. But it required time and careful reading of both the spec and the real implementation.

"Note that the actual implementation in the systemd codebase is the only ultimately authoritative description of the format, so if this document and the code disagree, the code is right"

— A comforting note from the systemd journal documentation. Nothing says "stable, well-documented binary format" like the docs telling you they might be wrong.

Our real concern was compatibility. We wanted a binary that works everywhere. That means support for past, current and future versions of the journal format. We did not want to spend time maintaining a backward compatible parser or doing code archaeology. So this option was discarded.

The second option was to use the C API provided by systemd for reading the journal. A Go wrapper already exists. It exposes the journald C API directly. On paper this looked like the right solution, so this is what we chose.

Once we started using it, Go added some constraints. Because the wrapper calls the C API directly, the systemd library is dynamically linked. It must be present on the target machine at runtime. That part is fine. A machine without systemd has no journal logs to collect anyway. It does, however, introduce new build problems.

The first problem is that the build breaks on non-systemd systems such as macOS. Since libsystemd is not available there, you cannot build or cross compile the Linux agent from such a machine. You must build on a Linux system.

This affects both release builds and development builds. You cannot even run go run locally on a non-systemd machine because the compiler cannot find the systemd library. Thankfully Go has build tags to tell the compiler what to include on each platform.

The //go:build linux line below instructs the Go compiler to only build that file on Linux systems.

It does add some code bloat, since a stub file is required for other systems so the package still compiles.


    // myfunc_linux.go
    //go:build linux

    package mypkg

    func MyFunc() string {
      // real Linux implementation
      return "real Linux implementation"
    }

    // myfunc_stub.go
    //go:build !linux

    package mypkg

    func MyFunc() string {
      // stub so the package still compiles on other systems
      return "stub for other systems"
    }
            

Separate files with build tags let you provide a real implementation for Linux while keeping a stub so the package still compiles elsewhere.

The second problem is that libsystemd differs between architectures. You need an amd64 version to build an amd64 binary and an arm64 version to build an arm64 binary. You cannot simply set GOARCH to produce every target from one worker. Each architecture build must run on a worker that has the matching libsystemd.

The glibc problem

There is another issue that shows up and is much harder to spot at first.

Go has a build flag called CGO_ENABLED . When it is enabled, the Go compiler links any C dependencies dynamically. This includes explicit C wrappers, like the sdjournal package, but also indirect calls inside the Go standard library. A common example is DNS resolution, which relies on glibc on Linux systems. With CGO_ENABLED set to 1, the final binary links to libc at runtime.

The default value depends on the environment. It is enabled by default when building natively on a system that supports cgo. It is disabled when cross compiling or when the C compiler is not available on the PATH . These defaults usually make sense. You generally do not want to enable cgo for cross compilation or for targets where glibc does not exist, such as Windows.

The problem is that a dynamically linked libc does not work on all Linux systems. Some Linux distributions do not use glibc, most notably Alpine Linux, which uses musl. This means a binary built for Linux with CGO_ENABLED set to 1 will work on Ubuntu or Debian but will fail at runtime on Alpine.


  /bin/sh: ./simob: Permission denied
            

Don't get fooled by the "Permission denied". On Alpine and other musl systems, this error, even when the permissions are clearly correct, almost always means the kernel can't find the required glibc dynamic linker.

This forces you to build a separate version of the agent for non glibc systems.
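
For illustration, the two flavors differ in a single environment variable (the output name here is made up, and a cgo-enabled cross compile would additionally need a matching C toolchain):

    # Native build, dynamically linked against glibc: fine on Debian or Ubuntu,
    # broken on musl-based Alpine.
    CGO_ENABLED=1 go build -o simob .

    # Cross compile with cgo disabled: a fully static binary that also runs on
    # musl systems, but cgo-only features such as the journal collector must be
    # stubbed out via build tags.
    CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o simob .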

So, is Go the problem?

Not really. Go behaved exactly as documented. We were the ones assuming that "portable" meant "effortless". Once we pulled in low-level C libraries and started targeting a mix of glibc and non-glibc systems, the simple story fell apart. None of it is dramatic, just a set of constraints you only notice once you trip over them.

Our initial idea of building everything on a laptop and shipping the same binary everywhere did not survive for long. We now rely on GitHub Actions with the right runners for each architecture. It is more moving parts than we wanted, but it works and it stays out of the critical path.

Local builds are still possible with containers or emulation, although a bit more clunky than we hoped.

In the end the build pipeline is more complicated than we imagined, but the binaries we ship remain small and self-contained. That was the original goal, and we managed to keep that part intact.

what is a build system, anyway?

Lobsters
jyn.dev
2025-12-13 19:40:53
Comments...
Original Article

Andrew Nesbitt recently wrote a post titled What is a Package Manager? This post attempts to do the same for build systems.

big picture

At a high level, build systems are tools or libraries that provide a way to define and execute a series of transformations from input data to output data that are memoized by caching them in an object store .

Transformations are called steps or rules 1 and define how to execute a task that generates zero or more outputs from zero or more inputs. A rule is usually the unit of caching; i.e. the cache points are the outputs of a rule, and cache invalidations must happen on the inputs of a rule. Rules can have dependencies on previous outputs, forming a directed graph called a dependency graph. Dependencies that form a cyclic graph are called circular dependencies and are usually banned. 2

Outputs that are only used by other rules, but not “interesting” to the end-user, are called intermediate outputs .

An output is outdated, dirty, or stale if one of its dependencies is modified, or, transitively, if one of its dependencies is outdated. Stale outputs invalidate the cache and require the outputs to be rebuilt. An output that is cached and not dirty is up-to-date. Rules are outdated if any of their outputs are outdated. If a rule has no outputs, it is always outdated.

Each invocation of the build tool is called a build . A full build or clean build occurs when the cache is empty and all transformations are executed as a batch job . A cache is full if all its rules are up-to-date. An incremental build occurs when the cache is partially full but some outputs are outdated and need to be rebuilt. Deleting the cache is called cleaning .

A build is correct or sound if all possible incremental builds have the same result as a full build. 3 A build is minimal (occasionally optimal ) if rules are rerun at most once per build, and only run if necessary for soundness ( Build Systems à la Carte , Pluto ).

In order for a build to be sound, all possible cache invalidations must be tracked as dependencies.

A build system without caching is called a task runner or batch compiler . Note that task runners still often support dependencies even if they don't support caching. Build systems with caching can emulate a task runner by only defining tasks with zero outputs, but they are usually not designed for this use case. 4

Some examples of build systems: make, docker build, rustc. Some examples of task runners: just, shell scripts, gcc.

specifying dependencies

A build can be either inter-process , in which case the task is usually a single process execution and its input and output files, or intra-process , in which case a task is usually a single function call and its arguments and return values.

In order to track dependencies, either all inputs and outputs must be declared in source code ahead of time, or it must be possible to infer them from the execution of a task.

Build systems that track changes to a rule definition are called self-tracking . Past versions of the rule are called its history ( Build Systems à la Carte ).

The act of inferring dependencies from runtime behavior is called tracing . If a traced rule depends on a dependency that hasn’t been built yet, the build system may either error, suspend the task and resume it later once the dependency is built, or abort the task and restart it later once the dependency is built ( Build Systems à la Carte ).

Inter-process builds often declare their inputs and outputs, and intra-process builds often infer them, but this is not inherent to the definition. 5

Some examples of intra-process builds include spreadsheets, the wild linker, and memoization libraries such as python's functools.cache.

applicative and monadic structure

A build graph is applicative if all inputs, outputs, and rules are declared ahead of time. We say in this case the graph is statically known . Very few build systems are purely applicative, almost all have an escape hatch.

The graph is monadic if not all outputs are known ahead of time, or if rules can generate other rules dynamically at runtime. Inputs that aren’t known ahead of time are called dynamic dependencies . Dynamic dependencies are weaker than a fully monadic build system, in the sense that they can express fewer build graphs. 6

Build systems that do not require declaring build rules are always monadic.

Some examples of monadic build systems include Shake , ninja dyndeps , and Cargo build scripts.

Some examples of applicative build systems include make (with recursive make disallowed), Bazel (excluding native rules), and map/reduce libraries with memoization, such as this unison program .

early cutoff

If a dirty rule R has an outdated output, reruns, and creates a new output that matches the old one, the build system has an opportunity to avoid running later rules that depend on R. Taking advantage of that opportunity is called early cutoff .

See the rustc-dev-guide for much more information about early cutoff. 7

rebuild detection

In unsound build systems, it's possible that the build system does not accurately detect that it needs to rebuild. Such systems sometimes offer a way to force-rerun a target: keeping the existing cache, but rerunning a single rule. For inter-process build systems, this often involves touching a file to set its modification date to the current time.

the executor

A build executor runs tasks and is responsible for scheduling tasks in an order that respects all dependencies, often using heuristics such as dependency depth or the time taken by the task on the last run. They also detect whether rule inputs have been modified, making the rule outdated; this is called rebuild detection . The build executor is responsible for restarting or suspending tasks in build systems that support it.

Executors often provide progress reporting , and sometimes allow querying the dependency graph. Occasionally they trace the inputs used by the task to enforce they match the declared dependencies, or to automatically add them to an internal dependency graph.

inter-process builds

In the context of inter-process builds, an artifact is an output file generated by a rule. 8 A source file is an input file that is specific to the current project 9 (sometimes repository or workspace ) as opposed to a system dependency that is reused across multiple projects. A project is loosely defined but generally refers to the set of all input and output files that the build system knows about, usually contained in a single directory. Source files can be generated , which means they are an output of a previous rule.

Build files contain rule definitions, including (but not limited to) task definitions, input and output declarations, and metadata such as a human-readable description of the rule. Inputs are usually split into explicit inputs passed to the spawned process, implicit inputs that are tracked by the build system but not used in the task definition, and order-only inputs that must exist before the rule can execute, but do not invalidate the cache when modified.

Process executions have more inputs than just files, such as the rule itself, environment variables, the current time, the current working directory, and occasionally network services or local daemons 10 .

The set of all inputs that are not source files or command line arguments is called the environment . Processes can be sandboxed to prevent them from depending on the network, a daemon, or occasionally system dependencies; this is sometimes called a sandboxed environment or isolated environment .

System dependencies are more expansive than I think they are often understood to be. They include compilers, linkers, programming language libraries 11 , and static and dynamically linked object files , but also the dynamic loader, language runtime, and various system configuration files. The subset of these dependencies needed for building a minimal program in a given language, along with various tools for inspecting and modifying the outputs at runtime, are called a toolchain . Toolchains are inherently specific to a given language, but sometimes (e.g. in GCC) a single compiler will support multiple languages as inputs.

A build is hermetic (rarely, self-contained or isolated 12 ) if it uses no system dependencies and instead defines all its dependencies in the project ( Bazel ). Sandboxing and hermeticity are orthogonal axes; neither one implies the other. For example, docker builds are sandboxed but not hermetic, and nix shells are hermetic but not sandboxed.

Compilers and linkers sometimes have their own incremental caches. Reusing the cache requires you to trust the compiler to be sound when incrementally rebuilding. This is usually implicit, but hermetic or sandboxed builds require an opt-in to reuse the cache. Bazel calls this kind of reuse a persistent worker.

determinism

A build is deterministic if it creates the same output every time in some specific environment. A build is reproducible if it is deterministic and also has the same output in any environment , as long as the system dependencies remain the same.

remote caching

Caching can be remote or local. Remote caching is almost always unsound unless the build is both hermetic and reproducible (i.e. its only environment dependencies are controlled by the build system).

Downloading files from the remote cache is called materializing them. Most build systems with remote caching defer materialization as long as possible, since in large build graphs the cache is often too large to fit on disk. Builds where the cache is never fully materialized are called shallow builds ( Build Systems à la Carte ).

Remote caching usually, but not necessarily, uses content addressed hashing in a key-value store to identify which artifact to download.

Some example build systems that use remote caching: Bazel, Buck2, nix, docker build .

interface

Build systems usually have a way to run a subset of the build. The identifier used to specify which part of the build you want to run is called a target . 13 Targets are usually the filenames of an artifact, but can also be abstract names of one or more rules. Bazel-descended build systems call these names labels . Make-descended build systems call these phony targets . Some build systems, such as cargo, do not use target identifiers but instead only have subcommands with arguments; the combination of arguments together specifies a set of targets.

Some example targets:

  • make all
  • cargo build --test http_integration
  • buck2 build :main

meta-build systems

Inter-process build systems are often divided into a configuration step and a build step. A build system that only runs the configuration step, and requires another tool for the build step, is called a meta-build system .

Usually this meta-build system discovers the rules that need to be executed (often through file globbing or some other programmatic way to describe dependencies), then serializes these rules into an action graph , which can be stored either in-memory or on-disk. On-disk serialized action graphs are usually themselves build files, in the sense that you can write them by hand but you wouldn't want to.

Configuration steps usually allow the developer to choose a set of configuration flags (occasionally, build flags ) that affect the generated rules.

Some build systems also integrate directly with the package manager , but this is uncommon, and usually the build system expects all packages to be pre-downloaded into a known location.

Some examples of meta-build systems are CMake, meson, and autotools.

VFS

Advanced build systems can integrate with a virtual file system (VFS) to check-out source control files on-demand, rather than eagerly ( EdenFS ).

intra-process builds

The equivalent of system dependencies within a process is non-local state , including environment variables, globals, thread-locals, and class member fields (for languages where this is passed implicitly). Especially tricky are function calls that do inter-process communication (IPC), which are basically never sound to cache. Tracing intra-process builds is very very hard since it’s easy to call a function that depends on global state without you knowing. 14

In this intra-process context, most object stores are in-memory caches . A build system that supports saving ( persisting ) the cache to disk is said to have persistence . The system for persisting the cache is sometimes called a database , even if it is not a general-purpose database in the sense the term is normally used ( Salsa ).

tracing

Tracing intra-process build systems are sometimes called a query system . 15 They work similarly to their inter-process equivalents: the interface looks like normal function calls, and the build system tracks which functions call which other functions, so it knows which to rerun later.

Some examples of tools with tracing intra-process build systems: salsa , the rustc query system .

FRP

Intra-process build systems that allow you to explicitly declare dependencies usually come from the background of functional reactive programming (FRP). FRP is most often used in UI and frontend design, but many of the ideas are the same as the build systems used for compiling programs.

Unlike any of the build systems we've talked about so far, FRP libraries let you look at past versions of your outputs, which is sometimes called remembering state ( React ). To make this easier to reason about, rules can be written as event handlers .

Some examples of libraries with dependency declarations: React .

so, what counts as a build system?

A build system is pretty much anything that lets you specify dependencies on a previous artifact 😄 Some more weird examples of build systems:

  • Github Actions (jobs and workflows)
  • Static site generators
  • Docker-compose files
  • Systemd unit files
  • Excel

Hopefully this post has given you both a vocabulary to talk about build systems and a context to compare them!

bibliography

  1. Nearly all build systems are inconsistent about whether a rule refers to an abstract description of how to build an output (i.e., can be reused for multiple sets of inputs and outputs), or a concrete instantiation of that description for a specific set of inputs and outputs. We have to live with the ambiguity, unfortunately.

  2. Weird things can happen here though; for example early cutoff can allow circular dependencies. This sometimes comes up for generated build.ninja files.

  3. The pluto paper defines this as “after a build, generated files consistently reflect the latest source files”. Neither my definition nor pluto's definition are particularly well-defined if the build is non-deterministic. Defining this formally would probably require constructing an isomorphism between all programs with the same runtime behavior; but “runtime behavior” is not well-defined for a general-purpose build system that can output artifacts that are not programs.

  4. As we'll see later, the reverse is also true: a common design for build systems is to automatically inject cache points into an existing task runner, or to design the rule file to look as similar to a shell script or function call as possible.

  5. In particular, nearly all modern inter-process build systems have a limited form of tracing where they ask the compiler to generate "dep-info" files 16 that show which files were used (usually through imports) by a given source file. Note that this dep-info is not available until after the first time a build has run, and that this only works if the compiler supports it.

  6. For more information about the spectrum of designs between applicative and monadic, see the post-modern build system .

  7. Note that the dev-guide assumes that tasks are expensive relative to the cost of constructing the graph. This is true in the context of rustc, where LLVM codegen 17 normally dominates compilation time, but it isn't true for e.g. spreadsheets .

  8. It's possible for tasks to create files that aren't tracked by the build system, but these aren't called artifacts. I don't know a good word for these; "byproducts" is the closest but some build systems use that to mean any intermediate artifacts.

  9. I'm not super happy with this definition because it conflicts with how compilers use the term, but I do think it describes how most build systems think about files.

  10. Poorly written rules can also depend on which other rules are executing at the same time, which is called a race condition . Note this does not require the rule to be unsound, only for it to use intermediate files the build system doesn’t know about.

  11. for C, header files; for other languages, usually source files or intermediate representations.

  12. Yes, this overlaps with the term for sandboxing. Try to avoid the word "isolated" if possible.

  13. This has no relation to a target platform, which is related to cross-compiling. I wish we had better names for these things.

  14. I would actually describe this as much harder than tracing an inter-process build system, since there aren't very good systems for tracking memory access . See this post about unstable fingerprints for an idea of what bugs this causes in practice.

  15. This actually has very strong analogies to the way "query" is used in a database context: just like a tracing query system, a database has to be able to restart a query's transaction if the data it's trying to access has been modified.

  16. What is a dep-info file? Good question! It's a makefile. It's literally a makefile. Don't you just love proving backslashes by induction ?

  17. Or, more rarely, type-checking, borrow-checking, or coherence checking.

Standards for a Responsible AI Future: Reflections on the Seoul Statement

Internet Exchange
internet.exchangepoint.tech
2025-12-11 18:02:32
The statement comes at a time when principles of justice, dignity, and human rights are increasingly politicized, questioned, or treated as negotiable....
Original Article

The statement comes at a time when principles of justice, dignity, and human rights are increasingly politicized, questioned, or treated as negotiable.


By Jacobo Castellanos, Coordinator of the Technology, Threats, and Opportunities team at WITNESS

On December 2, the ITU, ISO and IEC issued their Seoul Statement: a vision for AI standards that account for global contexts, rights impacts, and real-world harms.

The Seoul Statement includes four core commitments:

  • Integrating socio-technical perspectives into standards: ensuring AI standards address not just algorithms and data, but real-world impacts on people, societies, and the environment.
  • Embedding human rights and universal values: protecting dignity, privacy, fairness, and non-discrimination throughout AI design and governance.
  • Building an inclusive, multi-stakeholder community: enabling governments, industry, researchers, and civil society to shape global AI norms together.
  • Strengthening public–private collaboration and capacity-building: reducing global inequalities so all countries and communities can meaningfully benefit from AI.

This vision is not only welcome; it is a meaningful signal of hope.

It comes at a time when principles of justice, dignity, and human rights—once a shared reference point for international cooperation and for civil society’s engagement with governments and companies—are increasingly politicized, questioned, or treated as negotiable.

Why this matters

Standards, like regulation, form the structural base of the AI stack. By committing to explicitly consider human rights and real-world impact into standards development, the ITU, ISO, and IEC can help effectively steer AI’s impact toward protecting human rights, strengthening the information ecosystem, and fostering responsible innovation.

Human rights and civil society groups have been calling for this shift for years (see for example OHCHR’s latest report ). Standards alone won’t solve every AI concern, but they can create a pathway, together with regulation and tooling, that will lead towards rights protections and limiting misuse. At WITNESS, we work at the intersection of technology and human rights, and we have seen this firsthand in our work with the Coalition for Content Provenance and Authenticity (C2PA ), where a harm assessment continues to shape both the design of the standard and the ecosystem around it. By developing Content Credentials, a form of tamper-evident metadata that travels with an image, video, or audio file to show when, where, and how it was created or modified, C2PA offers a practical example of how standards can embed rights considerations from the ground up.

From Promise to Practice

While this vision is promising, a pressing question remains: How will these commitments be translated into action?

The Seoul Statement was presented during a two-day summit held in Seoul, but concrete plans for its implementation were not shared. Representatives from the ITU, ISO, and IEC did not publicly outline how this vision would be realized, and no details were provided regarding budgets, mechanisms, timelines, or accountability measures.

Standards work is inherently slow and resource-intensive. Incorporating socio-technical and human rights considerations adds another layer of complexity that requires significant investment in expertise, time and financial support. Without such investment, the Seoul Statement risks becoming a symbolic gesture rather than a meaningful turning point.

A notable concern was the limited presence of civil society at the Seoul summit. Multistakeholder participation was frequently mentioned, yet only a few human rights groups attended. Government and industry voices were far more visible, which is too narrow a basis for defining future AI norms. For the SDOs’ vision to carry real weight, civil society must be involved consistently, adequately resourced, and included from the beginning, not added in as an afterthought.

A Call to Stay Engaged

Still, there is reason for cautious optimism. The Seoul Statement represents an important first step, formally issued by institutions that will play a fundamental role in shaping the future of AI. By acknowledging that AI standards cannot be “just technical” and must be grounded in human rights and societal wellbeing, it creates a platform to push for meaningful change.

At WITNESS, we will continue to be actively involved in the C2PA, where we co-chair its Threats and Harms Task Force , and we will engage with the World Standards Cooperation’s AI and Multimedia Authenticity Standards Collaboration (ITU, IEC and ISO) as it positions AI standards as a powerful tool for regulation development and enforcement.

We call on civil society, researchers, regulators and funders to remain engaged, not only when milestones are announced, but through the long, technical, often opaque process of drafting, reviewing and implementing standards. We must also hold the ITU, ISO, and IEC accountable to their own vision, while working to extend this commitment to other national and international SDOs, and to the remaining building blocks that sit atop the foundations of regulation and standards in the AI ecosystem.



Pondering Middle East Petrostates as American Media Owners

Daring Fireball
www.businessinsider.com
2025-12-13 17:16:28
Peter Kafka, writing at Business Insider: And last: It’s possible that Middle Eastern countries are investing in an American media conglomerate solely for a financial return, and would have zero interest in the content that conglomerate makes and distributes. But that’s an assertion that many fo...
Original Article



Donald Trump welcomed Saudi Crown Prince Mohammed bin Salman at the White House in November. Now, bin Salman's country is reportedly backing Larry and David Ellison's bid for Warner Bros. Discovery Win McNamee/Getty Images
  • Paramount owners Larry and David Ellison want to buy Warner Bros. Discovery.
  • They are reportedly getting help from the governments of Saudi Arabia, Qatar, and Abu Dhabi.
  • Foreign investors have put money into American media companies before. But this seems meaningfully different.

A deal to combine Paramount and Warner Bros. Discovery would create a media behemoth.

And that behemoth could be partially owned by the governments of Saudi Arabia, Qatar, and Abu Dhabi.

So says Variety , reporting that David and Larry Ellison, who own Paramount and are bidding to buy WBD, are using money from those countries' sovereign wealth funds to finance their proposed deal.

If that story sounds familiar, there's a good reason: In November, Variety reported more or less the same thing — which prompted Paramount to call the story "categorically inaccurate".

But even at the time, it didn't seem implausible that Middle Eastern oil money would be used to help the Ellisons buy WBD . And other outlets had suggested the Ellisons might partner with the Saudis, among others.

Now Variety is doubling down on its initial report. Bloomberg also reports that "Middle East funds" are involved in the Ellisons' bid . A Paramount rep declined to comment. I also reached out to the sovereign wealth funds. The Abu Dhabi Investment Authority declined to comment. I haven't heard back from the others.

But as I noted before, the fact that it's even possible that Middle Eastern petrostates could have ownership stakes in a giant American media conglomerate — one that would control major movie studios, streaming networks, and news outlets — tells us a lot about 2025. A few years ago, this would have seemed like a non-starter; now it seems quite close to happening.


David and Larry Ellison are charging ahead in their bid to buy Warner Bros Discovery. Eric Charbonneau/Getty Images for The Hollywood Reporter

That's because Paramount, which is competing with Netflix and Comcast in the WBD bidding, still seems like the most likely WBD owner when all of this is done. That's partially because Paramount is offering to buy all of WBD, while Comcast and Netflix only want part of it. And partially because the Ellisons — Larry Ellison in particular — are close to Donald Trump , and we live in a world where people close to Donald Trump often get what they want.

In the absence of anyone involved in the deal talking to me on the record, I can imagine some arguments why a petrostate-backed mega-media conglomerate makes sense:

  • The funds would presumably have minority stakes in a combined Paramount/WBD , and it would presumably remain controlled by Americans.
  • Foreign investors have frequently owned some or all of big, American-based media companies: See, for instance, Japan's Sony, which owns a major movie studio and music label. And Saudi investor Prince Alwaleed bin Talal was a longtime minority investor in Rupert Murdoch's Fox empire; now he has a stake in the company formerly known as Twitter.
  • The Saudi sovereign wealth fund is already set to own almost all of video game giant Electronic Arts , and no one seems to have an issue with that.

A Middle East-financed deal for WBD could raise some eyebrows

All true! But I still think that there are differences that will certainly raise eyebrows, and maybe more forceful pushback, if a combined Ellison/Middle East deal goes forward.

One obvious point: It's one thing to have a private company or investor from another company taking a stake in an American media giant; it's another to have one that's directly controlled by a foreign government.

Another one: As media companies continue to consolidate, the power of the remaining ones gets amplified. On their own for instance, CBS News and CNN have dwindling influence and financial power; a company that combines the two, though, might have more meaningful sway. You can argue that the Saudis owning one of the world's biggest video game companies is also meaningful, but the video game industry never gets the attention it deserves, and that seems likely to continue in this case.

And last: It's possible that Middle Eastern countries are investing in an American media conglomerate solely for a financial return, and would have zero interest in the content that conglomerate makes and distributes. But that's an assertion that many folks would have a hard time taking at face value. And while lots of American companies have sought Middle Eastern funding for years, there was a pause after 2018, following the murder and dismemberment of Washington Post contributor Jamal Khashoggi — a shocking act the CIA concluded was ordered by Saudi Arabia's Crown Prince Mohammed bin Salman himself . (He has denied involvement.)

Now bin Salman might end up owning a piece of major American news outlets and other media arms. How's that going to go over?


Want to sway an election? Here’s how much fake online accounts cost

Hacker News
www.science.org
2025-12-13 20:48:09
Comments...

Why Twilio Segment moved from microservices back to a monolith

Hacker News
www.twilio.com
2025-12-13 20:30:34
Comments...
Original Article

Goodbye Microservices: From 100s of problem children to 1 superstar

Microservices is a service-oriented software architecture in which server-side applications are constructed by combining many single-purpose, low-footprint network services. The touted benefits are improved modularity, reduced testing burden, better functional composition, environmental isolation, and development team autonomy. The opposite is a Monolithic architecture, where a large amount of functionality lives in a single service which is tested, deployed, and scaled as a single unit.

Twilio Segment adopted this as a best practice early-on, which served us well in some cases, and, as you’ll soon learn, not so well in others.

In the early days of Twilio Segment, we reached a tipping point with a core piece of Twilio Segment’s product . It seemed as if we were falling from the microservices tree, hitting every branch on the way down. Instead of enabling us to move faster, the small team found themselves mired in exploding complexity. Essential benefits of this architecture became burdens. As our velocity plummeted, our defect rate exploded.

Eventually, the team found themselves unable to make headway, with 3 full-time engineers spending most of their time just keeping the system alive. Something had to change. This post is the story of how we took a step back and embraced an approach that aligned well with our product requirements and needs of the team.

Why Microservices worked

Twilio Segment's customer data infrastructure ingests hundreds of thousands of events per second and forwards them to partner APIs, what we refer to as server-side destinations. There are over one hundred types of these destinations, such as Google Analytics, Optimizely, or a custom webhook.

Years back, when the product initially launched, the architecture was simple. There was an API that ingested events and forwarded them to a distributed message queue. An event, in this case, is a JSON object generated by a web or mobile app containing information about users and their actions. A sample payload looks like the following:
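
(The exact fields vary by call type; this track call is just an illustration.)

{
  "type": "track",
  "event": "Song Played",
  "userId": "user-123",
  "properties": {
    "name": "Bohemian Rhapsody",
    "artist": "Queen"
  },
  "timestamp": "2025-12-13T20:30:00Z"
}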

As events were consumed from the queue, customer-managed settings were checked to decide which destinations should receive the event. The event was then sent to each destination’s API, one after another, which was useful because developers only need to send their event to a single endpoint, Twilio Segment’s API, instead of building potentially dozens of integrations. Twilio Segment handles making the request to every destination endpoint.

If one of the requests to a destination fails, sometimes we’ll try sending that event again at a later time. Some failures are safe to retry while others are not. Retry-able errors are those that could potentially be accepted by the destination with no changes. For example, HTTP 500s, rate limits, and timeouts. Non-retry-able errors are requests that we can be sure will never be accepted by the destination. For example, requests which have invalid credentials or are missing required fields.

A flow diagram illustrating a request processing system with components including an API, a queue, a destination worker, and multiple output endpoints such as google-analytics.com, api.optimizely.com, and api.mixpanel.com.

At this point, a single queue contained both the newest events as well as those which may have had several retry attempts, across all destinations, which resulted in head-of-line blocking . Meaning in this particular case, if one destination slowed or went down, retries would flood the queue, resulting in delays across all our destinations.

Imagine destination X is experiencing a temporary issue and every request errors with a timeout. Now, not only does this create a large backlog of requests which have yet to reach destination X, but also every failed event is put back to retry in the queue. While our systems would automatically scale in response to increased load, the sudden increase in queue depth would outpace our ability to scale up, resulting in delays for the newest events. Delivery times for all destinations would increase because destination X had a momentary outage. Customers rely on the timeliness of this delivery, so we can’t afford increases in wait times anywhere in our pipeline.

A flow diagram illustrating a data processing pipeline with components including an API, a queue, a destination worker, and multiple output endpoints such as google-analytics.com, api.optimizely.com, and api.mixpanel.com. It also includes annotations indicating "backpressure" and "retry events."

To solve the head-of-line blocking problem, the team created a separate service and queue for each destination. This new architecture consisted of an additional router process that receives the inbound events and distributes a copy of the event to each selected destination. Now if one destination experienced problems, only it’s queue would back up and no other destinations would be impacted. This microservice-style architecture isolated the destinations from one another, which was crucial when one destination experienced issues as they often do.

A flow diagram illustrating a request processing system with components including an API, a router, queues, and multiple output endpoints such as google-analytics.com, api.optimizely.com, and api.mixpanel.com, with flow annotations.

The Case for Individual Repos

Each destination API uses a different request format, requiring custom code to translate the event to match that format. A basic example: destination X requires the birthday to be sent as traits.dob in the payload, whereas our API accepts it as traits.birthday. The transformation code in destination X would look something like this:
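
(A minimal, illustrative sketch; the real destination modules contain far more than this.)

// Copy Segment's traits.birthday into the field destination X expects.
function transform(event) {
  const traits = Object.assign({}, event.traits);
  if (traits.birthday !== undefined) {
    traits.dob = traits.birthday;
    delete traits.birthday;
  }
  return Object.assign({}, event, { traits });
}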

Many modern destination endpoints have adopted Twilio Segment’s request format, making some transforms relatively simple. However, these transforms can be very complex depending on the structure of the destination’s API. For example, for some of the older and most sprawling destinations, we find ourselves shoving values into hand-crafted XML payloads.

Initially, when the destinations were divided into separate services, all of the code lived in one repo. A huge point of frustration was that a single broken test caused tests to fail across all destinations. When we wanted to deploy a change, we had to spend time fixing the broken test even if the failure had nothing to do with the change we were making. In response to this problem, we decided to break the code for each destination out into its own repo. All the destinations were already broken out into their own services, so the transition was natural.

The split to separate repos allowed us to isolate the destination test suites easily. This isolation allowed the development team to move quickly when maintaining destinations.

Scaling Microservices and Repos

As time went on, we added over 50 new destinations, and that meant 50 new repos. To ease the burden of developing and maintaining these codebases, we created shared libraries to make common transforms and functionality, such as HTTP request handling, easier and more uniform across our destinations.

For example, if we want the name of a user from an event, event.name() can be called in any destination’s code. The shared library checks the event for the property key name and Name. If those don’t exist, it checks for a first name, checking the properties firstName, first_name, and FirstName. It does the same for the last name, checking the cases and combining the two to form the full name.
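A rough sketch of that lookup order (the real shared library was not Python; the last-name property variants are assumed by analogy with the first-name ones described above):

# Illustrative sketch of the shared-library name lookup described above.
def event_name(properties):
    for key in ("name", "Name"):                 # direct name property
        if key in properties:
            return properties[key]
    first = next((properties[k] for k in ("firstName", "first_name", "FirstName")
                  if k in properties), None)
    last = next((properties[k] for k in ("lastName", "last_name", "LastName")
                 if k in properties), None)      # assumed variants
    if first and last:
        return first + " " + last                # combine into the full name
    return first or last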

The shared libraries made building new destinations quick. The familiarity brought by a uniform set of shared functionality made maintenance less of a headache.

However, a new problem began to arise. Testing and deploying changes to these shared libraries impacted all of our destinations. It began to require considerable time and effort to maintain. Making changes to improve our libraries, knowing we’d have to test and deploy dozens of services, was a risky proposition. When pressed for time, engineers would only include the updated versions of these libraries on a single destination’s codebase.

Over time, the versions of these shared libraries began to diverge across the different destination codebases. The great benefit we once had of reduced customization between each destination codebase started to reverse. Eventually, all of them were using different versions of these shared libraries. We could’ve built tools to automate rolling out changes, but at this point, not only was developer productivity suffering but we began to encounter other issues with the microservice architecture.

An additional problem was that each service had a distinct load pattern. Some services would handle a handful of events per day while others handled thousands of events per second. For destinations that handled a small number of events, an operator would have to manually scale the service up to meet demand whenever there was an unexpected spike in load.

While we did have auto-scaling implemented, each service had a distinct blend of required CPU and memory resources, which made tuning the auto-scaling configuration more art than science.

The number of destinations continued to grow rapidly, with the team adding three destinations per month on average, which meant more repos, more queues, and more services. With our microservice architecture, our operational overhead increased linearly with each added destination. Therefore, we decided to take a step back and rethink the entire pipeline.

Ditching Microservices and Queues

The first item on the list was to consolidate the now over 140 services into a single service. The overhead from managing all of these services was a huge tax on our team. We were literally losing sleep over it since it was common for the on-call engineer to get paged to deal with load spikes.

However, the architecture at the time would have made moving to a single service challenging. With a separate queue per destination, each worker would have to check every queue for work, which would have added a layer of complexity to the destination service with which we weren’t comfortable. This was the main inspiration for Centrifuge. Centrifuge would replace all our individual queues and be responsible for sending events to the single monolithic service. (Note that Centrifuge became the back-end infrastructure for Connections .)

A flow diagram showing an API request passing through a router, then a concentrator, and finally reaching multiple destination services such as google-analytics.com, api.optimizely.com, and api.mixpanel.com.

Moving to a Monorepo

Given that there would only be one service, it made sense to move all the destination code into one repo, which meant merging all the different dependencies and tests into a single repo. We knew this was going to be messy.

For each of the 120 unique dependencies, we committed to having one version for all our destinations. As we moved each destination over, we’d check the dependencies it was using and update them to the latest versions. We fixed anything in the destinations that broke with the newer versions.

With this transition, we no longer needed to keep track of the differences between dependency versions. All our destinations were using the same version, which significantly reduced the complexity across the codebase. Maintaining destinations now became less time consuming and less risky.

We also wanted a test suite that allowed us to quickly and easily run all our destination tests. Running all the tests was one of the main blockers when making updates to the shared libraries we discussed earlier.

Fortunately, the destination tests all had a similar structure. They had basic unit tests to verify our custom transform logic was correct and would execute HTTP requests to the partner’s endpoint to verify that events showed up in the destination as expected.

Recall that the original motivation for separating each destination codebase into its own repo was to isolate test failures. However, it turned out this was a false advantage. Tests that made HTTP requests were still failing with some frequency. With destinations separated into their own repos, there was little motivation to clean up failing tests. This poor hygiene led to a constant source of frustrating technical debt. Often a small change that should have only taken an hour or two would end up requiring a couple of days to a week to complete.

Building a Resilient Test Suite

The outbound HTTP requests to destination endpoints during the test run were the primary cause of failing tests. Unrelated issues like expired credentials shouldn’t fail tests. We also knew from experience that some destination endpoints were much slower than others. Some destinations took up to 5 minutes to run their tests. With over 140 destinations, our test suite could take up to an hour to run.

To solve for both of these, we created Traffic Recorder. Traffic Recorder is built on top of yakbak, and is responsible for recording and saving destinations’ test traffic. Whenever a test runs for the first time, any requests and their corresponding responses are recorded to a file. On subsequent test runs, the request and response in the file are played back instead of requesting the destination’s endpoint. These files are checked into the repo so that the tests are consistent across every change. Now that the test suite was no longer dependent on HTTP requests over the internet, our tests became significantly more resilient, a must-have for the migration to a single repo.
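A minimal sketch of the record-and-replay idea, assuming a one-file-per-request tape layout (this is neither yakbak’s API nor Traffic Recorder’s actual code, just the general technique):

import hashlib, json, os, urllib.request

TAPE_DIR = "tapes"  # hypothetical directory of recorded responses, checked into the repo

def request_with_tape(url, body):
    # Key each request by a hash of its URL and body.
    key = hashlib.sha256(url.encode() + body).hexdigest()
    tape = os.path.join(TAPE_DIR, key + ".json")
    if os.path.exists(tape):                                # replay on later runs
        with open(tape) as f:
            return json.load(f)["response"].encode()
    with urllib.request.urlopen(url, data=body) as resp:    # record on the first run
        payload = resp.read()
    os.makedirs(TAPE_DIR, exist_ok=True)
    with open(tape, "w") as f:
        json.dump({"url": url, "response": payload.decode()}, f)
    return payload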

After we integrated Traffic Recorder, it took milliseconds to run the tests for all 140+ of our destinations. In the past, a single destination could have taken a couple of minutes to complete. It felt like magic.

Why a Monolith works

Once the code for all destinations lived in a single repo, they could be merged into a single service. With every destination living in one service, our developer productivity substantially improved. We no longer had to deploy 140+ services for a change to one of the shared libraries. One engineer can deploy the service in a matter of minutes.

The proof was in the improved velocity. When our microservice architecture was still in place, we made 32 improvements to our shared libraries. One year later, we’ve made 46 improvements.

The change also benefited our operational story. With every destination living in one service, we had a good mix of CPU and memory-intense destinations, which made scaling the service to meet demand significantly easier. The large worker pool can absorb spikes in load, so we no longer get paged for destinations that process small amounts of load.

Trade Offs

Moving from our microservice architecture to a monolith was, overall, a huge improvement; however, there are trade-offs:

  1. Fault isolation is difficult. With everything running in a monolith, if a bug is introduced in one destination that causes the service to crash, the service will crash for all destinations. We have comprehensive automated testing in place, but tests can only get you so far. We are currently working on a much more robust way to prevent one destination from taking down the entire service while still keeping all the destinations in a monolith.

  2. In-memory caching is less effective. Previously, with one service per destination, our low traffic destinations only had a handful of processes, which meant their in-memory caches of control plane data would stay hot. Now that cache is spread thinly across 3000+ processes so it’s much less likely to be hit. We could use something like Redis to solve for this, but then that’s another point of scaling for which we’d have to account. In the end, we accepted this loss of efficiency given the substantial operational benefits.

  3. Updating the version of a dependency may break multiple destinations. While moving everything to one repo solved the previous dependency mess we were in, it means that if we want to use the newest version of a library, we’ll potentially have to update other destinations to work with the newer version. In our opinion though, the simplicity of this approach is worth the trade-off. And with our comprehensive automated test suite, we can quickly see what breaks with a newer dependency version.

Conclusion

Our initial microservice architecture worked for a time, solving the immediate performance issues in our pipeline by isolating the destinations from each other. However, we weren’t set up to scale. We lacked the proper tooling for testing and deploying the microservices when bulk updates were needed. As a result, our developer productivity quickly declined.

Moving to a monolith allowed us to rid our pipeline of operational issues while significantly increasing developer productivity. We didn’t make this transition lightly though and knew there were things we had to consider if it was going to work.

  1. We needed a rock solid testing suite to put everything into one repo. Without this, we would have been in the same situation as when we originally decided to break them apart. Constant failing tests hurt our productivity in the past, and we didn’t want that happening again.
  2. We accepted the trade-offs inherent in a monolithic architecture and made sure we had a good story around each. We had to be comfortable with some of the sacrifices that came with this change.

When deciding between microservices or a monolith, there are different factors to consider with each. In some parts of our infrastructure, microservices work well but our server-side destinations were a perfect example of how this popular trend can actually hurt productivity and performance. It turns out, the solution for us was a monolith.

Acknowledgements

The transition to a monolith was made possible by Stephen Mathieson , Rick Branson , Achille Roussel , Tom Holmes , and many more.

Special thanks to Rick Branson for helping review and edit this post at every stage.


I fed 24 years of my blog posts to a Markov model

Hacker News
susam.net
2025-12-13 20:19:53
Comments...
Original Article

By Susam Pal on 13 Dec 2025

Yesterday I shared a little program called the Mark V. Shaney Junior at github.com/susam/mvs . It is a minimal implementation of a Markov text generator inspired by the legendary Mark V. Shaney program from the 1980s. If you don't know about Mark V. Shaney, read more about it on the Wikipedia article Mark V. Shaney .

It is a very small program that favours simplicity over efficiency. As a hobby, I often engage in exploratory programming where I write computer programs not to solve a specific problem but simply to explore a particular idea or topic for the sole purpose of recreation. I must have written small programs to explore Markov chains for various kinds of state spaces over a dozen times by now. Every time, I just pick my last experimental code and edit it to encode the new state space I am exploring. That's usually my general approach to such one-off programs. I have hundreds of tiny little experimental programs lying on my disk at any given time.

Once in a while, I get the itch to take one of those exploratory programs, give it some finishing touches, wrap it up in a nice Git repo along with a README.md , CHANGES.md and the whole shebang and share it on github.com/susam and codeberg.org/susam . The Mark V. Shaney Jnr program that I shared yesterday happened to be one such exercise.

If you scroll down the README of this project, you'll find some nice examples of the gibberish produced by this program. The first few examples there are the result of training the model on A Christmas Carol by Charles Dickens, one of my favourite authors. It is often said that Dickens never used fewer words when more would suffice. So I thought there couldn't be a better piece of text when it comes to testing out my tiny Markov model. I'll not reproduce the generated text examples here for the sake of brevity. If you are interested to take a look, just head over to the Gibberish Examples section of the README.

Soon after sharing the project, I wondered what kind of gibberish it would produce if I fed all 24 years of my blog posts and pages into the program. Well, here's one of the results:

$ python3 mvs.py < susam.txt
while a query replace operation is approved by the user. The above
variable defines the build job. It can be incredibly useful while
working on assembly language and machine code. In fact, all internal
resources like the result to refine the search prompt changes from
bck-i-search: to fwd-i-search:. Now type C-SPC (i.e. ctrl+space) to
set a mark causes Emacs to use 32-bit registers like EBP, ESP,
etc. Thus the behaviour is undefined. Such code may behave differently
when compiled with the readily available GNU tools like the shape
of 8. Flipping "P" horizontally makes it a proper quine: cat $0

This is the text that comes out after the program consumes over 200 posts consisting of about 200,000 words. Here is another example:

$ python3 mvs.py < susam.txt
enjoy asking "what happens if" and then type M-x zap-up-to-char RET
b. The buffer for this specific video, the actual fare for 8.3 km and
11 are all written from scratch. No prior knowledge is expected to
slow down in future. For now, I will add a statement like x =
0.999... , the inner corner square as discussed in the code segment
into the REPL window. Unlike Slimv, Vlime can work with and debug
executable files, it can be more convenient. M-x: Execute Extended
Command The key sequence M-q invokes the command cat and type TAB to
indent the

Here is a particularly incoherent but amusing one:

$ python3 mvs.py < susam.txt
Then open a new Lisp source file and the exact answer could harm
students' self-esteem. Scientists have arbitrarily assumed that an
integral domain. However, the string and comment text. To demonstrate
how a build job can trigger itself, pass input to standard output or
standard error), Eshell automatically runs the following command in
Vim and Emacs will copy the message length limit of 512 characters,
etc. For example, while learning to play the game between normal mode
to move the point is on an old dictionary lying around our house and
that is moving to the small and supportive community

No, I have never written anywhere that opening a Lisp source file could harm anyone's self-esteem. The text generator has picked up the 'Lisp source file' phrase from my Lisp in Vim post and the 'self-esteem' bit from the From Perl to Pi post.

By default, this program looks at trigrams (all sequences of three adjacent words) and creates a map where the first two words of the trigram are inserted as the key and the third word is appended to its list value. This map is the model. In this way, the model captures each pair of adjacent words along with the words that immediately follow each pair. The text generator then chooses a key (a pair of words) at random and looks for a word which follows. If there are multiple followers, it picks one at random. That is pretty much the whole algorithm. There isn't much more to it. It is as simple as it gets. For that reason, I often describe a simple Markov model like this as the 'hello, world' of language modelling.
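A minimal sketch of that algorithm in Python (not the actual mvs.py source, just the idea with the default order of 2):

import random
from collections import defaultdict

def build_model(words, order=2):
    # Map each pair of adjacent words to the list of words that follow it.
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, count=100):
    # Pick a random key, then repeatedly pick a random follower and
    # slide the key window forward by one word.
    key = random.choice(list(model))
    out = list(key)
    for _ in range(count):
        followers = model.get(key)
        if not followers:
            key = random.choice(list(model))   # dead end: restart at random
            continue
        word = random.choice(followers)
        out.append(word)
        key = key[1:] + (word,)
    return " ".join(out)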

The number of words in the key of the map can be set via command line arguments. By default, it is 2 as described above. This value is also known as the order of the model. So by default the order is 2. If we increase it to, say, 3 or 4, the generated text becomes a little more coherent. Here is one such example:

$ python3 mvs.py 4 < susam.txt
It is also possible to search for channels by channel names. For
example, on Libera Chat, to search for all channels with "python" in
its name, enter the IRC command: /msg alis list python. Although I
have used Libera Chat in the examples above, there are plenty of
infinite fields, so they must all be integral domains too. Consider
the field of rational numbers Q . Another quick way to arrive at this
fact is to observe that when one knight is placed on a type D square,
only two positions for the second knight such that the two knights

Except for a couple of abrupt transitions, the text is mostly coherent. We need to be careful about not increasing the order too much. In fact, if we increase the order of the model to 5, the generated text becomes very dry and factual because it begins to quote large portions of the blog posts verbatim. Not much fun can be had like that.

Before I end this post, let me present one final example where I ask it to generate text from an initial prompt:

$ python3 mvs.py 2 100 'Finally we'
Finally we divide this number by a feed aggregrator for Emacs-related
blogs. The following complete key sequences describe the effects of
previous evaluations shall have taken a simple and small to contain
bad content. This provides an interactive byte-compiled Lisp function
in MATLAB and GNU bash 5.1.4 on Debian is easily reproducible in
Windows XP. Older versions might be able to run that server for me it
played a significant burden on me as soon as possible. C-u F: Visit
the marked files or directories in the sense that it was already
initiated and we were to complete the proof.

Apparently, this is how I would sound if I ever took up speaking gibberish!

The Rise of Computer Games, Part I: Adventure

Hacker News
technicshistory.com
2025-12-13 20:19:10
Comments...
Original Article

Author’s note: I originally intended for this post to cover adventure games, computer role-playing games, wargames and other simulations, a brief look at the home video game market, and finally the rise of hybrids that fused home video game systems with personal computers. In the grand scheme of the story about personal computers that I am trying to tell, it probably does not make sense to lavish nearly 7,000 words on early adventure games alone, but it’s a topic of personal interest to me and the tale grew in the telling.

Play was central to the formation of personal computer culture. For the early hobbyists who were fascinated by the guts of the machine, the computer was a plaything in and of itself. Many of those who joined the hobby in 1975 or 1976 did so because of games: they had experience with the extensive BASIC game culture that circulated in the time-sharing systems of universities, high schools, and even corporations, and wanted to keep playing at home.

Even after the rise of commercial personal computer software, when the first truly useful applications began appearing, games remained by far the most popular software category (counting by number of titles produced and number of units sold, although not by dollar value). One 1980 catalog of Apple II software, for example, lists 265 titles, of which roughly two-thirds are games, from Ack-Ack (an anti-aircraft target shooter) to Wipe Off (a Breakout clone). The rest of the catalog comprises demos, educational programs, and a smattering of business software. Whatever they might say about the practical value of the personal computer, buyers had an evident hunger for games. [1]

The Early Games and Their Market

Computer owners got their hands on games in one of three ways. In the early years, the most common means would be simply copying a paper or cassette tape from a friend or colleague, whether with the permission of the original author or not. In the early years, most hobbyists treated game software as a commons to be freely shared, just as it had been in the time-sharing culture through cooperatives like DECUS. This peer-to-peer copying would never entirely go away, despite the commercialization of game software and various schemes by publishers to try to prevent it.

Many magazines and books also published “type-ins,” complete computer programs (almost always written in BASIC) intended to be manually entered at the keyboard (and then saved to tape or disk), and these, too, were most often games. Dave Ahl’s BASIC Computer Games (first published in 1973 by Digital Equipment Corporation), a collection of over 100 type-ins, reputedly sold one million copies by 1979. Though type-in publication continued through the 1980s, the inherent limits on the length of such programs (only the most dedicated would tackle a type-in that was more than a few hundred lines long) and their reliance on the universality of BASIC (rather than more performant compiled languages) meant that their significance waned as the sophistication of the game market increased. They could serve as fun demos or educational tools for learning to code, but could not compare to similar games available commercially. [2]

Selection from a state capital guessing game type-in, from the first issue of Softside (October 1978). The median type-in was a simplistic game or graphical demo like this.
A selection from a type-in for a much more complex adventure game, published in the September 1980 Softside . This goes on for two-and-a-half more pages, and is about the limit of what was feasible in the type-in format for all but the most steadfast of typers.

Finally, of course, there were the commercial titles offered by software publishers. The game business began in the same way as the personal computer hardware business: with hobby-entrepreneurs selling their creations to fellow hobbyists. In July 1976, for example, D.E. Hipps of Miami, Florida offered a Star Trek game written for MicroSoft’s Altair BASIC for $10 (no one at this stage of the industry paid any attention to niceties such as licensing agreements for the use of the Star Trek name). No common standard data storage standard existed; hobbyists employed a mix of paper teletype tapes, cassette storage, and (for the most extravagant) floppy disks. So Hipps opted to distribute his game as printed source code: a type-in! SCELBI (creators of one of the early, pre-Altair hobby computers), offered another Star Trek variant called Galaxy in the same form. By the late 1970s, the convergence of the industry on a small number of popular storage standards (with CP/M dominant) resolved this problem, and most games were distributed in plastic baggies containing instructions and a cassette or floppy disk. [3]

Contents of a typical computer game package circa 1980. The instructions, command reference, special instruction sheet, and cassette would have come together  in a plastic baggie.[Ernst Krogtof, retro365.blog]

It didn’t take long for other entrepreneurs to see a business opportunity in making it easier for software authors to publish their games. It took some time for clear business models and market verticals to emerge. No categorial distinction existed between publishers of games and publishers of utility and business software prior to 1980: Personal Software’s first big hit was MicroChess , followed by VisiCalc , followed by (as we’ll soon see) Zork . Programma International’s founder began as a hoarder of Apple II software, much of it acquired from copies unauthorized by the original author, then turned legitimate to sell those authors’ software instead. Softape tried selling bundles of software by subscription, and then started its own newsletter for subscribers, Softalk .

Some magazines went the other way around: Softside magazine (located the next town over from BYTE’s Peterborough, New Hampshire headquarters) created The Software Exchange (TSE), while Dave Ahl’s Creative Computing set up a label called Sensational Software. Type-ins printed in the magazines became a gateway drug to more convenient (and often more complex and interesting) software available for sale on cassette or diskette. [4]

Figure 21: Creative Computing heavily advertised the Sensational Software brand in the pages of the magazine, as in this July 1980 example describing some of their most popular hits and offering a free catalog of their full offering of 400 titles.

The early personal computer game culture imitated what came before it. The boundary between mini- and microcomputer culture was permeated by thousands who used time-sharing systems at work or school and then went home to a hobby computer. Prior to 1977, a game written for a personal computer was almost invariably based on a game drawn from the other side of that boundary.

Barring a few exceptions (such as the PLATO system available at some universities), users interacted with such computer systems through teletypes or video teletypes that alternated sending and receiving text. So, the resulting games were turn-based, purely textual, and relied on strategy and calculation (or pure luck) to win, not timing and reaction speed. These textual games suited the early hobbyists perfectly, since almost all of their computers also had text-only interfaces, whether mechanical teletypes or video displays like the TV Typewriter.

Other than simple quizzes, demos, and guessing games, popular titles included simulations such as Hammurabi, Civil War and Lunar Lander; statistical recreations of sports contests (baseball, basketball, golf, etc.); or classic games or puzzles from the physical world, like checkers, Yahtzee, and the towers of Hanoi. By far the most popular, however, judging by the number of variations published and references in hobby magazines, were descendants of Mike Mayfield’s 1971 Star Trek, a strategic game of galactic war against the Klingons. [5]

An example of a user interaction in a Mike Mayfield-style Star Trek game. Each command entered by the user produces a textual response from the computer. There is no continuous display; the SRS command must be used each time the player wants to see their situation. [The Wargaming Scribe, https://zeitgame.net/archives/1770]

Some early personal computers, however, had video controllers with built-in memory, which allowed for more sophisticated interfaces than the simple back-and-forth exchanges of a teletype. Processor Technology, whose VDM-1 display interface could paint characters at arbitrary points on the screen, sold real-time textual games by Steve Dompier like Trek-80 (released in 1976, despite the name). Its interface (including a galactic sector map made of text characters and readouts of the Enterprise ’s weapon and shield status) updated in real-time in response to player and (simulated) enemy actions, rather than scrolling by one turn at a time. Cromemco, maker of the Dazzler, an Altair-compatible graphics board, offered the only personal computer games to use pixel graphics prior to the Apple II, starting with a version of the seminal Spacewar in early 1977. They followed with a suite of similar games such as Tankwar and Dogfight . [6]

Steve Dompier’s Trek-80 was similar in concept to earlier Trek games, but with alternating commands and responses replaced by a continuously displayed  textual interface. [ Creative Computing (July/August 1977), 44]
Spacewar on the Dazzler couldn’t match the PDP-1 original, but nothing else at the time sported pixel graphics. [ Creative Computing (July/August 1977), 43]

After 1977, when computers with graphical displays became more widely available (especially the full-color Apple II), computer games tapped a new vein of inspiration (and imitation): arcade games. Originally commercialized by Atari and its imitators as standalone arcade cabinets in the early 1970s, then moving into homes by the mid-1970s, these games were typically real-time and focused on action. Relatively cheap and easy-to-make, and relatively disposable to the user (few took more than a few minutes to play a complete game), computer action games proliferated by the hundreds and thousands, many of them direct or near clones of pre-existing arcade or home video games.

By 1980, however, there were major innovations that set personal computer games apart from other game media. In-depth simulations, expansive adventures that took hours to solve, and dungeon crawls teeming with a variety of monsters, treasures, and traps provided immersive experiences that the action-oriented video game consoles did not, and (given their limited memory and storage capacity) could not provide. Once combined with full-color, bitmapped graphics, these games also surpassed anything previously available on their time-sharing predecessors. The era of imitation was definitively over.

Adventure

For several years of my childhood, for reasons that I no longer recall, our family’s Apple IIe computer, equipped with a green-and-black monochrome monitor, resided in my bedroom. Though much of my autobiographical memory is quite hazy, I can clearly remember each of the Apple II games we owned, each with its own 5 ¼-inch-square floppy disk: Syzygy (a space shooter in the vein of Asteroids ), One on One: Dr. J vs. Larry Bird , Winter Games , and Arcticfox (a sci-fi tank simulator with wireframe graphics).

But the game that truly captured my imagination, the game whose opening sequence and imagery remain etched (monochromatically) in my mind, was King’s Quest II: Romancing the Throne , a 1985 title by Sierra On-Line. The forty-nine-screen, hand-drawn fairy tale kingdom that you explore in the game (via your avatar, King Graham) felt like a vast world of endless possibility compared to the cramped half-court of One on One , the endlessly repeating monotony of a biathlon course in Winter Games , or the sterile polygonal landscape of Arcticfox ’s Antarctica. That open-ended feeling was enhanced by the lure of hidden secrets just out of reach, and a freeform text interface that accepted English commands like “THROW APPLE” (though only a tiny subset of the commands you could imagine would actually work). Despite its limitations and many, many frustrations (at age seven or eight, with no hint book and no Internet walkthroughs, I certainly never came close to completing it), it made me feel that I was truly experiencing an adventure.

One of the colorful environments of King’s Quest II. Limited at the time to a monochrome monitor, I never saw it like this.

The adventure game genre originated in a freely shared, text-driven game created in the time-sharing world. The game, which I will call Adventure (it is variously called Colossal Cave Adventure, Colossal Cave, Adventure, or simply ADVENT, after the game’s PDP-10 file name), challenged players to find five treasures within a cave complex by navigating a maze, solving puzzles, and defeating a band of axe-wielding dwarves. Its author was Will Crowther, a programmer at Bolt, Beranek and Newman (BBN), where he had written core infrastructural software for ARPANET, the first nationwide computer network.

The BBN team that worked on the Interface Message Processors (IMPs), minicomputers that routed messages across the ARPANET network. This photo was probably taken circa 1969-1970, when the first IMPs were delivered. Crowther is second from the right.

In 1975, Crowther went through a painful divorce. He had always enjoyed playing games with his school age daughters, so he began crafting a game on the company’s DEC PDP-10 to help him stay connected with them. Crowther copied the physical structure of Adventure ’s cave directly from a portion of the Mammoth complex in Kentucky. (He had met his wife through caving, and they had explored Mammoth together, so the game was also, in a sense, a means of staying connected to his former, married life.) It is probable (though not certain) that Crowther also drew some inspiration from a popular 1973 time-sharing game called Hunt the Wumpus , which required users to use textual clues to find and kill a Wumpus hidden in a system of caves without falling into a pit. But the conceptual structure of Adventure (delving into the earth to find treasure and magical artifacts in the face of devious obstacles and armed foes) came from a new game of pencil, paper, and imagination that Crowther was playing with some of his BBN friends, called Dungeons and Dragons . [7]

In Crowther’s words:

…the caving had stopped, because that had become awkward, so I decided I would fool around and write a program that was a re-creation in fantasy of my caving, and also would be a game for the kids, and perhaps had some aspects of the Dungeons and Dragons that I had been playing. [8]

Just as in the later King’s Quest II , the player used simple verb-noun commands (such as “TAKE LAMP”) to interact with the world, but lacking a graphical screen with a visible avatar, he or she also used text commands to move about the world, from one room of the cave to the next (e.g., “SOUTH” or “EAST”). Crowther showed the game off to his D&D buddies and his daughters, then took a new job in California, and forgot about it. [9]

Time-sharing games had once propagated gradually from computer to computer via collectives like the Digital Equipment Computer Users’ Society or colleagues and friends mailing paper tapes to one another. But BBN was on the ARPANET, and Crowther had put his game on a public directory in the BBN computer. From there, someone copied it across the network to a computer in a Stanford lab, where a graduate student, Don Woods, found it in early 1977.

Crowther-Woods Adventure as someone might have experienced it in the late 1970s, on a DEC video terminal. [Autopilot, CC BY-SA 3.0]

Fascinated by Crowther’s game, Woods contacted him for the FORTRAN source code, and set about expanding it. He increased the scope by adding more rooms, more puzzles, more foes, and more ways to interact with the world; but he also added the ability to save your progress and a point-tracking system with a final objective: to find fifteen treasures and return them to the starting location. Woods’ larger, more polished version of Adventure spread rapidly across the time-sharing world, and became an obsession for some, keeping them at the office past midnight in search of that last treasure. (One of Woods’ additions was a setting to allow admins to disable the game during working hours.) [10]

Adventureland

A frizzy-haired Florida man, Scott Adams, was the first to commercialize a version of Adventure for the personal computer. He had first fallen in love with computers on a time-sharing terminal at his Miami high school in the late 1960s. He went on to earn a computer science degree and by the late 1970s, was working as a programmer at telecom manufacturer Stromberg-Carlson. On the side he had become an avid home computer hobbyist, purchasing a Sphere computer in 1975 and then a TRS-80 in 1977. Shortly thereafter he discovered Adventure on the company time-sharing system and, like many before and after, could not quit playing until he had beaten it.

Adams decided that it would be an interesting challenge to build something similar for the TRS-80. It would have to be much smaller to fit in the sixteen kilobytes of memory he had available. The Crowther-Woods Adventure contained 140 distinct locations and ran to eighty kilobytes of (uncompiled) FORTRAN and fifty-four kilobytes of data for the game text. Adams’ Adventureland was considerably smaller, with fewer than thirty-five locations—not necessarily to the detriment of gameplay; for example, the cutting lopped off most of Adventure ’s huge and torturous mazes. [11]

Adams’ local TRS-80 buddies were impressed enough with his game that he decided to sell it through both The TRS-80 Software Exchange and Creative Computing, who offered it on cassette for $24.95 and $14.95, respectively, in their January 1979 magazine issues. He followed up with a whole series of games, starting with Pirate Adventure , and ported the games from the TRS-80 to other popular computer platforms. His wife Alexis joined the venture as a business manager and game designer, co-authoring Mystery Fun House and Voodoo Castle . [12]

Scott Adams surrounded by his Adventure series. [ Adventure international Microcomputer Software Catalog 2, 5 (1981),4]

The adventure game genre is often criticized for absurd and unfair puzzles, which can be guessed at only through trial-and-error, and tedious mazes or other navigational obfuscations. These early games from circa 1980 are among the worst offenders. In Adventureland, for example, a “very thin” black bear blocks your way, and the only way to get past it is to “yell” at it. Feeding this apparently hungry bear honey will prevent you from completing the game, because the honey is one of the treasures you must collect. You could easily get to a state in these games where you had lost without knowing it. [13]

Yelling at a bear in Adventureland. [https://www.filfre.net/2011/06/adventureland-part-1]

But these criticisms are retrospective: the contemporary press and the buying public lapped up the Adams’ adventures and all of their imitators. We have to remember that the appeal of this genre lay in getting immersed (one might say “lost”) in the game for hours every evening, clawing your way forward towards ultimate triumph for weeks, or even months, on end. In a market full of arcade-like games that offered the convenient but shallow fun of a bag of potato chips, adventure games provided a rich and fulfilling meal for the imagination. As one lover of the genre put it:

Adventure is the product of imagination appealing to imagination. It is not just the puzzle, or the theme, or the nonplayer characters and their personalities. It is a verbal tapestry of interwoven phrases that whisk you away to magical kingdoms of the mind. The computer becomes a tool of reaching that conveys you where it will. You go along eagerly, breathlessly, awaiting what comes next. [14]

The catch was that this delicacy was consumable only once: a solved adventure game was no more interesting to revisit than a solved crossword puzzle. So, they had to provide a challenge: no one wanted to pay $24.95 for a game on the way home from work and then breeze through it before bedtime. A game that was very fair would also risk being seen as a waste of money. Despite improvements in design in future years that would banish some of the worst practices of the genre, adventure games remained trapped on the horns of this dilemma. [15]

Zork

The Adams’ “Adventure” line made them wealthy enough to build a faux-castle outside Orlando, and kicked off one of the most popular computer game genres of the 1980s. By late 1980, half-a-dozen other companies were putting out personal computer adventure games, from The Programmer’s Guild to Mad Hatter Software, as well as a version of Crowther-Woods Adventure put out by Microsoft. But they are overshadowed in the historical record by a competitor that subsequently dominated both sales of and critical attention to text adventure games. It began at MIT. In the spring of 1977, the Crowther-Woods Adventure arrived over the ARPANET at the PDP-10 at the Laboratory for Computer Science (LCS), and sank its claws into its employees. Impressed by what Crowther and Woods had done, but convinced that it could be made even better, a group of LCS staff set out in May 1977 to one-up Adventure. [16]

Dave Lebling, who had already worked on several games (including Maze , the first first-person shooter game), kicked off the project. Lebling played Dungeons and Dragons in the same Cambridge D&D group as Crowther had (though not at the same time), and based the game’s combat system on the tabletop game. Then Marc Blank, Tim Anderson, and Bruce Daniels filled in most of the core structure of the program. They gave it the place-holder name of Zork (a term used as an inside-joke expletive at LCS, as in “why won’t this zorking thing work”), which ended up sticking permanently. The game reached its completed state in early 1979, by which point it greatly exceeded the original Adventure in scale, with 191 rooms and 211 items, occupying a full megabyte of memory. [17]

Coded in a LISP-descendant called MUDDLE or MDL, Zork had an elegant design that encapsulated all the information about the possible interactions of each room and item in a single block of code and data, making it much easier to extend than Adventure. It also had a much richer text interface: both Adventure and AdventureLand accepted only “verb noun” commands, but Zork also allowed for conjunctions and prepositions (for example, “TAKE SWORD AND LANTERN FROM SACK”). Though aping the basic tropes of Adventure (a small overland area leading to an underground treasure-hunt), its more complex architecture allowed for a richer and more clever set of puzzles. [18]

In the spring of 1979, several key staff members of LCS were poised to leave MIT. Their supervisor, Al Vezza, proposed to keep the band together by forming a company. Incorporated in June as Infocom, its new employees and shareholders included Lebling, Blank, and Anderson.

While the various partners mulled what exactly to do with their new business, Blank and a fellow LCS alum, Joel Berez, figured out how to cram Zork onto a microcomputer: they cut the number of rooms and items in the game in half and removed all the features of MDL not needed for the game, creating an interpreter for a simpler language they called Zork Implementation Language (ZIL). The resulting program occupied just seventy-seven kilobytes. To get this to fit into a microcomputer memory half that size, they had one last trick: a virtual memory system built into the interpreter, to swap chunks of the program on and off the disk as needed (typical floppy disk capacities at the time were over 100 kilobytes, and continued to grow). This meant that Zork could only run off of a floppy drive (whose rapidly spinning disk could sync to a new data location in a fraction of a second and supply data at fifteen kilobytes per second), never a cassette (which took a minute or more to fully unwind or rewind and supplied data at 300 bits per second). Or, to put it another way, the growing market prevalence of affordable floppy drives made larger personal computer adventure games feasible: it took about twenty minutes to load a Scott Adams adventure game from tape. [19]

In late 1979, Blank and Berez convinced a reluctant Vezza (who wanted to get into business software) to make a microcomputer Zork Infocom’s first product. They initially published through Personal Software, co-owned by MIT’s own Dan Fylstra, which had just recently released VisiCalc. But after VisiCalc’s smash success, Fylstra no longer wanted to deal in mere games, so Infocom became its own publisher for subsequent games—including Zork II and III , built from the remaining unused material from the original PDP-10 Zork .

Zork became available in December 1980 and sold 10,000 units in 1981, mostly on the Apple II, despite an eye-watering price of $39.95, at a time when most games cost fifteen to twenty-five dollars. Then, astonishingly, in an industry typically characterized by ephemerality and obsolescence, sales continued to grow, year after year. They peaked in 1984 with over 150,000 copies sold. No doubt Zork’s self-referential humor, its restrained but clever marketing, and the high quality of the game itself (certainly the most well-crafted adventure game to date) all helped to sell the game. [20]

The evolution of Zork’s marketing strategy, from the underground-zine feel of this 1981 ad under Personal Software… [ SoftSide (April 1981)]
To this more austere and elegant pitch under Infocom in 1982. [Softline (September 1982)]

But many sales also must have arisen from the startling impression given by sitting down in a store (or at a friend’s house) to interact with this remarkable piece of software. Bob Liddil, reviewing Zork for BYTE magazine, pointed to the fluency of the parser as the element that first pulled him in:

I was eager to test Zork’s biggest selling point, intelligent input (ie: its ability to accept free-form instructions). I typed “OPEN THE BAG AND GET THE LUNCH,” in reference to a brown paper sack inside the house. The computer complied. There was water and food, so I typed “EAT THE LUNCH AND DRINK THE WATER,” to which the computer responded with gratitude for satisfying its hunger and thirst. I was hooked. [21]

The game seemed to understand the user and to have an appropriate answer (or a witty retort) ready for everything they might try, from expletives (“FUCK > SUCH LANGUAGE IN A HIGH-CLASS ESTABLISHMENT LIKE THIS!”) to attempts to outwit the command system (“FIND HANDS > WITHIN SIX FEET OF YOUR HEAD, ASSUMING YOU HAVEN’T LEFT THAT SOMEWHERE.”), to questions about the imaginary world in which the game is played (“WHAT IS A ZORKMID? > THE ZORKMID IS THE UNIT OF CURRENCY OF THE GREAT UNDERGROUND EMPIRE.”) Along with VisiCalc and WordStar , Zork functioned not just as a piece of software that did something, but also as an existence proof (for the owner and for skeptical friends and family) that the microcomputer could be more than merely a toy version of a real computer. [22]

Zork sales finally fell off in the mid-1980s, not because new text adventure games had surpassed it (Infocom continued to rule that particular roost, and Zork remained their flagship), but because of the steady improvement in personal computer graphics and the corresponding ascendancy of graphical games over textual ones.

Mystery House

The first graphical adventure game actually appeared several months before Zork: On-Line Systems’ Mystery House, created by Ken and Roberta Williams. Unlike Scott Adams and most of the early personal computer hobbyists, Ken Williams got into computers for money, not love. Raised in greater Los Angeles in an unhappy home, he was a driven and impatient young man, and graduated high school at just sixteen. Roberta Heuer, a dreamy young woman whom Williams met through a double date, was impressed enough by his intelligence and ambition to give in to his insistence that they marry in 1972, while they were both still teenagers.

With the expectation of children to come, Ken abandoned his physics program at Cal Poly Pomona for a more immediately lucrative career in data processing. His father-in-law helped him get a loan to attend Control Data Corporation’s training school (the Control Data Institute), and from there he went on to a series of positions working on “big iron” batch-processing systems, constantly bouncing from job to job and home to home in search of better opportunities and a fatter pay check. He and Roberta wanted a bigger house and more creature comforts, but most of all they dreamed of an early retirement to a life out-of-doors, far from the city. [23]

Ken and Roberta as newlyweds in 1972.

The Williamses took no notice of microcomputers until Ken and one of his co-workers, Robert Leff, concocted a way to make money off of them: selling fellow programmers a microcomputer implementation of FORTRAN, one of the most popular data processing languages. Not only could this venture make him and Roberta still richer (always a key consideration), it could free them to finally move away from the traffic and grind of Los Angeles and to live out their dream of rural life. Initially Ken planned to write FORTRAN for the TRS-80, but he redirected his energies to the more capable Apple II after he and Roberta got themselves one for a mutual Christmas present.

Meanwhile, Roberta had gotten hooked on adventure games. Ken had an electromechanical teletype terminal in their home for one of his consulting jobs, and connected it to a computer with the Crowther-Woods Adventure available to play. He showed the game off to Roberta. For Ken it was a curiosity, but for Roberta it became an obsession: she would not quit until she had beaten the game, weeks later. Ken brought home a borrowed TRS-80 and cassette tapes for the Scott Adams adventure series, and she flew through those, too. Soon she had an idea for a game of her own: instead of a treasure hunt, it would be a murder mystery; a mix of Clue and Ten Little Indians set in a creepy old Victorian house.

She insisted that Ken help her create it, and, after putting her off several times, he finally relented. Roberta wanted to add pictures of each room as a way to make this new game better than what came before, taking advantage of the Apple II’s 280×192 pixel high-resolution graphics mode. Because storing dozens of bitmapped images on a floppy disk would be impossible, Ken bought a VersaWriter accessory, a tablet with a movable arm that let Roberta capture the (x, y) position of each line endpoint in her pictures and store them into the computer. He wrote code to re-create the pictures from these coordinates by drawing the lines at runtime. [24]

A circa 1983 ad for the Atari version of the VersaWriter.

Like Crowther and Adams, Ken split the data tables apart from the code that interpreted them. This allowed Roberta to work out all of the information about the rooms in the game, the items they contain, and the actions the player can perform, without needing to write any code. This division of labor between programming and design, quite novel to computer game software, came about from the accident of Roberta’s limited technical skills (she had worked briefly as COBOL programmer, at Ken’s insistence) and Ken’s lack of interest in the game: he was still focused on launching Apple FORTRAN. [25]

Then, while visiting local computer stores to pitch his computer language, Ken demoed an early version of Roberta’s game and everyone in the store gathered around to see it. The owners asked when they could have copies to sell. Ken realized he was backing the wrong horse: it was Roberta’s side project that would make them rich, not FORTRAN. Moreover, rather than give up a cut to a publisher like Programma International, they would take all the revenue for themselves, by publishing the game through the company name he had already registered for his never-to-be-released FORTRAN, On-Line Systems. On top of that, they could make even more money by distributing games into the stores they were already visiting on behalf of other software authors, like Scott Adams’ Florida-based Adventure International. Eventually unable to manage both publishing and distribution, he convinced his former colleague and erstwhile FORTRAN partner, Robert Leff, to buy out the distribution business, which grew into the industry behemoth Softsel. [26]

After a month of development on nights and weekends (Ken’s pace was manic: in his memoir he writes that he always strove to be a “Triple-A” player, and his brother called him a “chronic workaholic”), the Williamses started selling Mystery House in May 1980. It required forty-eight kilobytes of memory, but with chip prices falling continuously, this was not so stringent a requirement as it had been even a year before.

The game’s simplistic “mystery” ends with the player gunning down the de facto murderer: the only living character to be found in a houseful of victims. The puzzles are among the more poorly clued and arbitrary to be found in a genre full of such frustrations. But for adventure-starved gamers of the time it was enchanting: not only could they witness the virtual world which they were navigating, it actually changed in response to their actions (picking up an object, like the taunting note that kicked off the murders, would remove it from the scene). Roberta’s drawings, crude and child-like as they certainly are, gave the game a visual appeal that drew in new buyers, and more than justified its price of $24.95. [27]

The entry room to Mystery House. The green and purple colors are artifacts of how the Apple II high-res graphics mode works. [https://www.filfre.net/2011/10/mystery-house-part-2]

That summer Ken and Roberta were pulling in $30,000 a month and shopping for a house far from Los Angeles, in Coarsegold, California, nestled in the foothills of the Sierra Nevada near Yosemite National Park. On-Line Systems became Sierra On-Line. A few months later a second “High-Res Adventure” followed, The Wizard and the Princess, which added visually-stunning color to Mystery House’s line drawings: Ken used dithering techniques to make the six colors available in high-res mode appear like twenty-one. Roberta’s King’s Quest series, which I encountered on my Apple II, did not begin until 1984. It became Sierra’s best seller: by 1987, the first three installments of the series had sold a combined 500,000 copies, at least according to Sierra’s own marketing. [28]

A scene from The Wizard and the Princess. Nothing like this had been seen on microcomputers before. The dark blue of the man’s shirt is composed of dithered blue and black, and the light blue of his pants from blue and white, etc. The odd colors at the borders between different-colored regions were once again artifacts of the Apple II high-res color system.

It stands out, in a story populated almost entirely with male characters, that two of the earliest adventure game designers (Alexis Adams and Roberta Williams) were women. The scope of Alexis’ contributions isn’t entirely clear, but Roberta was arguably the most successful adventure game designer of all time. There was an appeal in the adventure game genre, which had more in common with a mystery novel or a logic puzzle than an arcade game and typically eschewed violence (the summary execution of Mystery House’s killer notwithstanding), that attracted some women to an otherwise almost entirely masculine industry. [29]

Ken and Roberta in 1989 after winning a Software Publisher’s Association Award for Best Role-Playing or Adventure Game for King’s Quest IV. [ Sierra News Magazine (Summer 1990) 8]

In a world where multiple discovery and parallel invention are the norm, it is also remarkable that all of the games we have discussed (and indeed all the computer adventure games ever made) can trace their ancestry to the Crowther-Woods Adventure. In the meantime, though, many other computer game authors had drawn inspiration from Dungeons and Dragons, spawning an entirely different genre of computer games, more in tune with D&D’s wargaming roots.


[1] Programma International, “Spring 1980 Catalog” (Spring 1980), 3-5 ( https://ia903201.us.archive.org/12/items/Programma_Catalog_Spring_1980_for_APPLE_II/Programma_Catalog_Spring_1980_for_APPLE_II.pdf ).

[2] J.J. Anderson, “Dave tells Ahl—The History of Creative Computing,” Creative Computing (November 1984), 72.

[3] “A Star Trek Product,” BYTE (July 1976), 92; “Scelbi Software,” BYTE (July 1976), 17.

[4] Alexander Smith, They Create Worlds: The Story of the People and Companies That Shaped the Video Game Industry, Vol. I: 1971-1982 (Boca Raton: CRC Press, 2020), 366-368; Jimmy Maher, “Adventureland, Part 2,” The Digital Antiquarian (June 24, 2011) ( https://www.filfre.net/2011/06/adventureland-part-2 ); David H. Ahl, “The First Decade of Personal Computing,” Creative Computing (November 1984), 30.

[5] Smith, They Create Worlds , 266-267; David H. Ahl, ed., 101 BASIC Computer Games: Microcomputer Edition (New York: Workman Publishing, 1977).

[6] The Wargaming Scribe, “The beginning of home computer gaming: the VDM-1 and the SOL-20” (August 16, 2023) ( https://zeitgame.net/archives/10450 ); “Cromemco Dazzler Games” (Mountain View: Cromemco, 1977); Steve North, “Two Space Games (With Graphics!) For Your Home Computer,” Creative Computing (July/August 1977) 43-44; “Spacewar Available for the Cromemco Dazzler,” Cromemco News (January 1977).

[7] Smith, They Create Worlds , 383-384; Katie Hafner, “Will Crowther Interview,” (March 1994), 1-5 ( https://archive.org/details/WillCrowtherInterview/mode/1up ). I read into

[8] Quoted in Dale Peterson, Genesis II: Creation and Recreation with Computers (Reston: Prentice-Hall, 1983), 188.

[9] Smith, They Create Worlds , 384-385; Dennis G. Jerz, “Somewhere Nearby is Colossal Cave,” (2007), 83 ( https://jerz.setonhill.edu/resources/preprint/SNiCC.pdf ).

[10] Smith, They Create Worlds , 384-385; Jerz, “Somewhere Nearby is Colossal Cave,” 13; Jimmy Maher, “The Completed Adventure, Part 1” The Digital Antiquarian (June 2, 2011) ( https://www.filfre.net/2011/06/the-completed-adventure-part-1/ ); Tracy Kidder, The Soul of A New Machine (New York: Little, Brown, 2000 [1981]), 86-89.

[11] IF Archive Adventure zip ( https://unbox.ifarchive.org/?url=/if-archive/games/source/adv350-pdp10.tar.gz ); “Adventureland map” ( https://www.solutionarchive.com/file/id%2C3/ ); Maher, “Adventureland, Part 1,” The Digital Antiquarian (June 22, 2011) ( https://www.filfre.net/2011/06/adventureland-part-1 ).

[12] Robert Levering, Michael Katz, and Milton Moskowitz, The Computer Entrepreneurs: Who’s Making It Big and How in America’s Upstart Industry (New York: NAL Books, 1984), 114-118; Smith, They Create Worlds , 388.

[13] Jimmy Maher, “Adventureland, Part 1,” The Digital Antiquarian (June 22, 2011) ( https://www.filfre.net/2011/06/adventureland-part-1 ).

[14] Bob Liddil, “On the Road to Adventure,” BYTE (December 1980), 170.

[15] The 1990 LucasArts adventure, Loom , for example, though it is an artistic masterpiece, was criticized by reviewers for being too short and too easy. Scorpia, “Scorpion’s View: ‘Conquests of Camelot’ and ‘Loom’,” Computer Gaming World (July-August 1990), 51, 63. Simply making the games larger, with more puzzles, was technically infeasible in the early years (we have already seen that Adventureland had to be much smaller than Adventure to fit on a microcomputer); later, as the costs of game production went up, it became financially infeasible. There is an expert dissection of the sins of one early adventure game in Jimmy Maher, “The Wizard and the Princess, Part 2,” The Digital Antiquarian (October 21, 2011) ( https://www.filfre.net/2011/10/the-wizard-and-the-princess-part-2 ).

[16] Bob Liddil, “On the Road to Adventure,” BYTE (December 1980), 162.

[17] Jimmy Maher, “The Roots of Infocom,” Digital Antiquarian (January 1, 2012) ( https://www.filfre.net/2012/01/the-roots-of-infocom ); Jimmy Maher, “Zork on the PDP-10,” Digital Antiquarian (January 3, 2012) ( https://www.filfre.net/2012/01/zork-on-the-pdp-10 ); Stephen Granade and Philip Jong, “David Lebling Interview,” Brass Lantern (undated, ca. 2000) ( http://brasslantern.org/community/interviews/lebling.html ); Nick Montfort, Twisty Little Passages: An Approach to Interactive Fiction (Cambridge: MIT Press, 2003), 86. Eric Roberts, Crowther and Lebling’s dungeon master, ran a variant of D&D he called Mirkwood Tales . Jon Peterson, Playing at the World (San Diego: Unreason Press, 2012), 617-618, 622.

[18] P. David Lebling, “ Zork and the Future of Computerized Fantasy Simulations,” BYTE (December 1980), 172-182.

[19] Jimmy Maher, “ZIL and the Z-Machine,” The Digital Antiquarian ( https://www.filfre.net/2012/01/zil-and-the-z-machine ); Maya Posch, “Zork And The Z-Machine: Bringing The Mainframe To 8-bit Home Computers,” Hackaday (May 22, 2019) ( https://hackaday.com/2019/05/22/zork-and-the-z-machine-bringing-the-mainframe-to-8-bit-home-computers ); Scott Adams “Pirate’s Adventure,” BYTE (December 1980), 212. Virtual memory was a well-established technique in minicomputer and mainframe operating systems, but no widely used personal computer OS offered virtual memory until the release of Windows 3.0 in 1990.

[20] “InfoCom Shipments By Title and Year” ( https://www.flickr.com/photos/textfiles/2419969220 ); Bob Liddil, “Zork, The Great Underground Empire,” BYTE (February 1981), 262.

[21] Liddil, “Zork, The Great Underground Empire,” 262.

[22] Jimmy Maher, “Parser Games,” Digital Antiquarian (January 16, 2012) ( https://www.filfre.net/2012/01/parser-games ).

[23] Levy, Hackers , 293-297, 302-303; Ken Williams, Not All Fairy Tales Have Happy Endings: The Rise and Fall of Sierra On-Line (Ken Williams, 2020), 12-24, 22-24; Jimmy Maher, “Ken and Roberta,” The Digital Antiquarian (October 2, 2011) ( https://www.filfre.net/2011/10/ken-and-roberta ).

[24] Williams, Not All Fairy Tales , 55-56, 66-68, 88; Levy, Hackers , 303-304; Ken Williams, “Introduction to The Roberta Williams Anthology” (1996) ( https://wiki.sierrahelp.com/index.php/Introduction_to_The_Roberta_Williams_Anthology ). The account in the previous paragraphs is interpolated from the above sources, which are partially contradictory. All differ about who got the Apple II and why. Levy never mentions the TRS-80 or any adventure games besides Adventure , and has Roberta finishing that game after the time the Apple II was purchased, implying she never played any other adventure games before deciding to write Mystery House : the timeline would simply be too tight. I believe this is wrong, and either an intentional elision or a false interpolation by Levy. It is unlikely that the Williamses would later entirely hallucinate having brought home and played the whole series of Scott Adams games. The accounts also differ on whose idea it was to add pictures to the game. I’m inclined to believe it was Roberta, to whom the game idea and all the passion for it belonged.

[25] Williams, Not All Fairy Tales , 69-73.

[26] Levy, Hackers , 308-310; Williams, Not All Fairy Tales , 73; Ken Williams, “A Message From the President,” Sierra News Magazine (Summer 1990), 35.

[27] John Williams, “Sierra’s First Ten Years,” Sierra News Magazine (Summer 1990), 6; Williams, Not All Fairy Tales , 79; Jimmy Maher, “Mystery House, Part 2,” Digital Antiquarian (October 9, 2011) ( https://www.filfre.net/2011/10/mystery-house-part-2 ); “Game 57: Mystery House,” Data-Driven Gamer (April 22, 2019) ( https://datadrivengamer.blogspot.com/2019/04/game-57-mystery-house.html ).

[28] Jimmy Maher, “The Wizard and the Princess, Part 1,” Digital Antiquarian (October 20, 2011) ( https://www.filfre.net/2011/10/the-wizard-and-the-princess-part-1 ); “On-Line Systems Presents: Hi-Res Adventure,” Softline (September 1981), 16; Levy, Hackers , 310-311; “Sales Data,” King’s Quest Omnipedia ( https://kingsquest.fandom.com/wiki/Sales_data#King’s_Quest_Original ).

[29] In later years, Sierra On-Line would employ several more women as designers—Lori Cole (the Quest for Glory series), Christy Marx ( Conquests of Camelot and Conquests of the Longbow ), and Jane Jensen (the Gabriel Knight series), while Amy Briggs created Plundered Hearts at Infocom. It is hard to get any reliable numbers on the audience for adventure games: in 1989, Sierra estimated that 35-40% of the players of King’s Quest IV were women, which surely was well above average for a computer game. Patricia Cignarella, “Girls Just Want To Have Fun,” Sierra News Magazine (Autumn 1989), 25.

Ask HN: How do you handle release notes for multiple audiences?

Hacker News
news.ycombinator.com
2025-12-13 20:04:26
Comments...
Original Article

For those of you who ship often, when you release updates, do you typically write one set of release notes, or do you end up rewriting them for different audiences?

For example:
  • technical version for developers
  • simplified version for end users
  • something more high-level for stakeholders, etc…

In my current position I’ve seen a plethora of different ways teams, and even the company I currently work for, go about this.

What I’ve seen:
1. paste raw GitHub changelogs into customer emails (I’d strongly recommend against this if you’re currently doing it)
2. manually rewrite the same update multiple times for each audience
3. skip release notes entirely because it’s too much work

So I guess my question is: How do you or your company currently go about handling more than one set of release notes, and do you feel like more than one set is needed?

Would love to hear what’s working (or not working) for you, and if you found any tools that help mitigate this issue.

VPN location claims don't match real traffic exits

Hacker News
ipinfo.io
2025-12-13 19:46:19
Comments...
Original Article

In a large-scale analysis of 20 popular VPNs, IPinfo found that 17 of those VPNs exit traffic from different countries than they claim. Some claim 100+ countries, but many of them point to the same handful of physical data centers in the US or Europe.

That means the majority of VPN providers we analyzed don’t route your traffic via the countries they claim to, and they claim many more countries than they actually support.

Analyzing over 150,000 exit IPs across 137 possible exit countries, and comparing what providers claim to what IPinfo measures, shows that:

  • 17 in 20 providers had traffic exiting in a different country.
  • 38 countries were “virtual-only” in our dataset (claimed by at least one provider, but never observed as the actual traffic exit country for any provider we tested).
  • We were only able to verify all provider announced locations for 3 providers out of the 20.
  • Across ~150,000 VPN exit IPs tested, ProbeNet, our internet measurement platform, detected roughly 8,000 cases where widely-used IP datasets placed the server in the wrong country — sometimes thousands of kilometers off.

This report walks through what we saw across VPN and IP data providers, provides a closer look at two particularly interesting countries, explores why measurement-based IP data matters if you care where your traffic really goes, and shares how we ran the investigation.

Which VPNs Matched Reality (And Which Didn’t)

Here is the overlap between the number of listed countries each VPN provider claims to offer versus the countries with real VPN traffic that we measured — lower percentages indicate providers whose claimed lists best match our data:

Provider | Claimed Countries | % Virtual or Unmeasurable
IPVanish | 108 | 61
CyberGhost | 100 | 57
ExpressVPN | 105 | 57
NordVPN | 126 | 53
Private Internet Access | 91 | 52
ProtonVPN | 110 | 51
FastVPN | 112 | 49
X-VPN | 89 | 43
Surfshark | 100 | 41
BelkaVPN | 63 | 41
ZoogVPN | 76 | 34
VyprVPN | 63 | 27
FastestVPN | 47 | 26
TrustZone | 39 | 18
PrivateVPN | 62 | 13
TunnelBear | 47 | 9
VeePN | 84 | 6
IVPN | 41 | 0
Mullvad | 50 | 0
Windscribe | 70 | 0

It's important to note that we used the most common and widely supported technologies in this research, to keep the comparison between providers as fair as possible while still giving us significant data to analyze, so this does not represent the full coverage of each provider.

These are some of the most visible names in the market. They also tend to have very long country lists on their websites. Notably, three well-known providers had zero mismatches across all the countries we tested: Mullvad, IVPN, and Windscribe.

Country mismatches don’t automatically mean some providers offer “bad VPNs,” but they do mean that if you’re choosing a VPN because it claims “100+ countries,” you should know that a significant share of those flags may be labels, or virtual locations.

What “Virtual Locations” Really Mean

When a VPN lets you connect to, for example, “Bahamas” or “Somalia,” that doesn’t always mean traffic routes through there. In many cases, it’s somewhere entirely different, like Miami or London, but presented as if traffic is in the country you picked.

This setup is known as a virtual location:

  • The VPN app shows “Country X” (e.g. Bahamas).
  • The IP registry data also says “Country X” — because the provider self-declared it that way.
  • But the network measurements (latency and routing) show the traffic actually exits in “Country Y” — often thousands of kilometers away.

The problem? Without active network measurement, most IP datasets will rely on what the IP’s owner told the internet registry or published in WHOIS/geofeeds: a self-reported country tag. If that record is wrong or outdated, the mistake spreads everywhere. That’s where IPinfo’s ProbeNet comes in: by running live RTT tests from 1,200+ points of presence worldwide, we anchor each IP to its real-world location, not just its declared one.

Across the dataset, we found 97 countries where at least one VPN brand only ever appeared as virtual or unmeasurable in our data. In other words, for a noticeable slice of the world map, some “locations” in VPNs never show up as true exits in our measurements.

We also found 38 countries where every mention behaved this way: at least one VPN claimed them, but none ever produced a stable, measurable exit in that country in our sample.

You can think of these 38 as the “unmeasurable” countries in this study – places that exist in server lists, config files, and IP geofeeds, but never once appeared as the actual exit country in our measurements. They’re not randomly scattered – they cluster in specific parts of the map. By region, that includes:

This doesn’t prove there is zero VPN infrastructure in those countries globally. It does show that, across the providers and locations we measured, the dominant pattern is to serve those locations from elsewhere. Here are two of the most interesting examples of how this looks at the IP level.

Case Studies: Two Countries That Only Exist on the Map

To make this concrete, let’s look at two countries where every provider in our dataset turned out to be virtual: Bahamas and Somalia.

Bahamas: All-Inclusive, Hosted in the US

In our measurements, five providers offered locations labeled as “Bahamas”: NordVPN, ExpressVPN, Private Internet Access, FastVPN, and IPVanish.

For all of them, measured traffic was in the United States, usually with sub-millisecond RTT to US probes.

Provider | Claimed as | Measured exit country | RTT to nearest ProbeNet vantage point (evidence) | Example exit IP
NordVPN | 🇧🇸 Bahamas | 🇺🇸 United States | 0.27 ms from Miami, United States | 45.95.160.61
ExpressVPN | 🇧🇸 Bahamas | 🇺🇸 United States | 0.15 ms from Miami, United States | 64.64.117.18
Private Internet Access | 🇧🇸 Bahamas | 🇺🇸 United States | 0.42 ms from New York, United States | 95.181.238.101
FastVPN | 🇧🇸 Bahamas | 🇺🇸 United States | 0.42 ms from Miami, United States | 108.171.106.198
IPVanish | 🇧🇸 Bahamas | 🇺🇸 United States | 0.37 ms from Miami, United States | 108.171.106.207

Somalia: Mogadishu, via France and the UK

Somalia appears in our sample for only two providers: NordVPN and ProtonVPN.

Both label Mogadishu explicitly in their naming, but the RTTs measured below are exactly what you’d expect for traffic in Western Europe, and completely inconsistent with traffic in East Africa. Both providers go out of their way in the labels (e.g. “SO, Mogadishu”), but the actual traffic is in Nice and London, not Somalia.

Provider | Claimed as | Measured exit country | RTT to nearest probe (evidence) | Example exit IP
NordVPN | 🇸🇴 Somalia | 🇫🇷 France | 0.33 ms from Nice, France | 212.32.91.11
ProtonVPN | 🇸🇴 Somalia | 🇬🇧 United Kingdom | 0.37 ms from London, UK | 74.118.126.204

When Legacy IP Providers Agree With the Wrong VPN Locations

So far, we’ve talked about VPN claims versus our measurements. But other IP data providers don’t run active RTT tests. They rely on self-declared IP data sources, and often assume that if an IP is tagged as “Country X,” it must actually be there.

In these cases, the legacy IP datasets typically “follow” the VPN provider’s story: if the VPN markets the endpoint as Country X, the legacy IP dataset also places it in Country X.

To quantify that, we looked at 736 VPN exits where ProbeNet’s measured country disagreed with one or more widely used legacy IP datasets.

We then compared the country that IPinfo's ProbeNet measured (backed by RTT and routing) with the country reported by these other IP datasets, and computed the distance between them. The gaps are large:

How Far Off Were the Other IP Datasets?

Distance between legacy IP databases and IPinfo country | Share of disagreement cases
> 1,000 km | 83%
> 2,000 km | 63%
> 5,000 km | 28%
> 8,000 km | 12%

The median error between ProbeNet and the legacy datasets was roughly 3,100 km . On the ProbeNet side, we have strong latency evidence that our measured country is the right one:

  • The median minimum RTT to a probe in the measured country was 0.27 ms .
  • About 90% of these locations had a sub-millisecond RTT from at least one probe.

That’s what you expect when traffic is genuinely in that country, not thousands of kilometers away.
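
To see why sub-millisecond RTTs are such strong evidence, it helps to put a rough number on it: light in optical fiber covers roughly 200 km per millisecond, and the RTT traverses the path twice, so a measured RTT caps the probe-to-server distance (a back-of-the-envelope bound; routing detours and queuing only make the real distance smaller than the cap). For the 0.27 ms median above:

\[
d \;\le\; \frac{\mathrm{RTT}}{2} \times 200\ \mathrm{km/ms}
\;=\; \frac{0.27\ \mathrm{ms}}{2} \times 200\ \mathrm{km/ms}
\;\approx\; 27\ \mathrm{km}
\]

No server physically located thousands of kilometers from the probe could produce a number like that.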

An IP Example You Can Test Yourself

This behavior is much more tangible if you can see it on a single IP.

Here's one VPN exit IP where ProbeNet places the server in the United Kingdom, backed by sub-millisecond RTT from local probes, while other widely used legacy IP datasets place the same IP in Mauritius, 9,691 kilometers away.

🇬🇧 United Kingdom vs 🇲🇺 Mauritius (ProtonVPN)

If you want to check this yourself, you can plug it into a public measurement tool like https://ping.sx/ and run pings or traceroutes from different regions. Tools like this one provide a clear visual for where latency is lowest.

ProbeNet uses the same basic idea, but at a different scale: we maintain a network of 1,200+ points of presence (PoPs) around the world, so we can usually get even closer to the real physical location than public tools with smaller networks.

If you’d like to play with more real IPs (not necessarily VPNs) where ProbeNet and IPinfo get the country right and other datasets don’t, you can find a fuller set of examples on our IP geolocation accuracy page .

Why This Happens and How It Impacts Trust

It’s worth separating technical reasons from trust issues. There are technical reasons to use virtual or hubbed infrastructure:

  • Risk & regulation. Hosting in certain countries can expose both the provider and users to local surveillance or seizure.
  • Infrastructure quality. Some regions simply don’t have the same density of reliable data centers or high-capacity internet links, so running servers there is harder and riskier.
  • Performance & cost. Serving “Bahamas” from Miami or “Cambodia” from Singapore can be cheaper, faster, and easier to maintain.

From this perspective, a virtual location can be a reasonable compromise: you get a regional IP and content unblocking without the downsides of hosting in a fragile environment.

Where It Becomes a Trust Problem

Three things change the picture:

  • Lack of disclosure. Marking something clearly as “Virtual Bahamas (US-based)” is transparent. Listing “Bahamas” alongside “Germany” without any hint that one is virtual and the other is physical blurs the line between marketing and reality.
  • Scale of the mismatch. It’s one thing to have a few virtual locations in hard-to-host places. It’s another when dozens of countries exist only as labels across your entire footprint, or when more than half of your tested locations are actually somewhere else.
  • Downstream reliance. Journalists, activists, and NGOs may pick locations based on safety assumptions. Fraud systems, compliance workflows, and geo-restricted services may treat “Somalia” vs “France” as a meaningful difference. If both the VPN UI and the IP data say “Somalia” while the traffic is physically in France, everyone is making decisions on a false premise.

That last point leads directly into the IP data problem that we are focused on solving.

So How Much Should You Trust Your VPN?

If you’re a VPN user, here are some practical takeaways from this work:

  • Treat “100+ countries” as a marketing number, not a guarantee. In our sample, 97 countries existed only as claims, not reality, across 17 providers.
  • Check how your provider talks about locations. Do they clearly label “virtual” servers? Document where they’re actually hosted? Or do they quietly mix virtual and physical locations in one long list?
  • If you rely on IP data professionally, ask where it comes from. A static “99.x% accurate worldwide” claim doesn’t tell you how an IP data provider handles fast-moving, high-stakes environments like VPN infrastructure.

Ultimately, this isn’t an argument against VPNs, or even against virtual locations. It’s an argument for honesty and evidence. If a VPN provider wants you to trust that map of flags, they should be willing, and able, to show that it matches the real network underneath.

How IPinfo Approaches IP Data Differently

Most legacy IP data providers rely on regional internet registry (RIR) allocation data and heuristics around routing and address blocks. These providers will often accept self-declared data like customer feedback, corrections, and geofeeds, without a clear way to verify them.

IPinfo takes a measurement-first approach:

  1. Proprietary ProbeNet with 1,200+ points of presence
    We maintain an internet measurement platform of PoPs in locations around the world.
  2. Active measurements
    For each visible IP on the internet, including both IPv4 and IPv6 addresses, we measure RTT from multiple probes.
  3. Evidence-based geolocation
    We combine these measurements with IPinfo’s other signals to assign a country (and more granular location) that’s grounded in how the internet actually behaves.

This measurement-first approach is unique in the IP data space. Once we realized how much inaccuracy came from self-declared data, we started investing heavily in research and building ProbeNet to use active measurements at scale. Our goal is to make IP data as evidence-based as possible, verifying with observation on how the internet actually behaves.

Our Methodology for This Report

We approached this VPN investigation the way a skeptical but well-equipped user would: start from the VPNs’ own claims, then test them.

Step 1: Collecting What Providers Say

For each of the 20 VPN providers , we pulled together three kinds of data:

  • Marketing promises: The “servers in X countries” claims and country lists from their websites. When a country was clearly listed there, we treated it as the locations they actively promote.
  • Configurations and locations lists: Configurations from different protocols like OpenVPN or WireGuard were collected along with location information available on provider command-line tools, mobile applications, or APIs.
  • Unique provider–location entries: We ended up with over 6,000,000 data points and a list of provider + location combinations we could actually try to connect to with multiple IPs each.

Step 2: Observing Where the Traffic Really Goes

Next, we used IPinfo infrastructure and ProbeNet to dial into those locations and watch what actually happens:

  • We connected to each VPN “location” and captured the exit IP addresses.
  • For each exit IP address, we used IPinfo + ProbeNet’s active measurements to determine a measured country, plus:
    • The nearest ProbeNet vantage point (e.g., US, Brazil, France)
    • The round-trip time (RTT) from that probe (often under 1 ms), which is a strong hint about physical proximity

Now we had two views for each location:

  • Expected/Claimed country : What the VPN claims in its UI/configs/website
  • Measured country : Where IPinfo + ProbeNet actually see the exit IP

Step 3: Comparing Claims vs Reality

For each location where a country was clearly specified, we asked a very simple question: Does the expected country match the measured country?

If yes, we counted it as a match. If not, it became a mismatch: a location where the app says one country, but the traffic exits somewhere else.

Acknowledgements, Limitations, and Constraints

We deliberately used a very narrow definition of “mismatch.” For a location to be counted, two things had to be true: the provider had to clearly claim a specific country (on their website, in their app, or in configs), and we had direct active measurements from ProbeNet for the exit IPs behind that location.

We ignored any locations where the marketing was ambiguous, where we hadn’t measured the exit directly, or where we only had weaker hints like hostname strings, registry data, or third-party IP databases. Those signals can be useful and true, but we wanted our numbers to be as hard-to-argue-with as possible.

The result is that the mismatch rates we show here are conservative. With a looser methodology that also leaned on those additional hints, the numbers would almost certainly be higher, not lower.

Dick Van Dyke turns 100

Hacker News
www.theguardian.com
2025-12-13 18:58:09
Comments...
Original Article

All Hollywood stars grow old and die except perhaps one - Dick Van Dyke - who turns 100 today. The real world Peter Pan who used to trip over the ottoman on The Dick Van Dyke Show is still standing. The man who impersonated a wind-up toy in Chitty Chitty Bang Bang hasn’t wound down just yet. He has outlived mentors, co-stars, romantic partners and several studios. He’s even outlived the jokes about his performance in Mary Poppins. These days his mangled cockney accent is regarded with more fondness than contempt. It’s seen as one of the great charms of the 1964 classic, along with the carousel chase or the cartoon dancing penguins.

Accent on the charm … Dick Van Dyke with Julie Andrews in Mary Poppins. Photograph: Donaldson Collection/Getty Images

Charm is the magic ingredient of every popular entertainer and few have possessed it in such abundance as Van Dyke, the impoverished son of a travelling cookie salesman who dropped out of high school and educated himself at the movies. “His job in this life is to make a happier world,” his Broadway co-star Chita Rivera once said - and this may explain his stubborn refusal to quit, not while times are tough and he feels that audiences still need cheering up.

Naturally his workrate has now slowed, but in the past few years he has competed on the TV show The Masked Singer, starred in a Coldplay video and enthusiastically stumped for Bernie Sanders . Van Dyke simply couldn’t understand why America’s older citizens were resistant to Sanders’ democratic socialist domestic policies. He said, “I want to urge my generation to get out and vote for him, please.”

Too much energy … Dick Van Dyke in Chitty Chitty Bang Bang. Photograph: Moviestore/REX/Shutterstock

As he nudges into triple figures, he has become a piece of living history: a walking, talking chronicle of US showbusiness itself. Van Dyke began his career performing for the troops in the second world war and proceeded to rub shoulders with the likes of Phil Silvers and Walt Disney. He had one foot in music-hall slapstick and the other in screwball comedy, and possibly splayed fingers in his midwestern hometown of Danville, Illinois.

In bridging these worlds, he perfected an outward-facing public image that was one part Stan Laurel to two parts Jimmy Stewart: a pratfalling clown who was decent and honest and smarter than he first appeared. And while he was already nearing 40 when The Dick Van Dyke Show and Mary Poppins made him an international star, the actor remained irrepressibly boyish. In 1968’s Chitty Chitty Bang Bang , he played Caractacus Potts, the madcap inventor who dreams up a flying car, while Lionel Jeffries - six months younger - played Potts’s addled and eccentric dad.

Van Dyke, by and large, has steered clear of dark films. He famously turned down the lead role in The Omen and insists that he mostly played a version of himself. “Wholesome,” he says. “An all-round good boy.” That’s true so far as it goes, although it’s probably only half the story, because Van Dyke’s interpretation conveniently sidesteps a 25-year struggle with alcoholism that spanned his professional heyday. Possibly it also glosses over the air of dancing mischief – even wildness – that animates his most feted, family-friendly performances.

Sparring mutual respect … Mary Tyler Moore and Dick Van Dyke in The Dick Van Dyke Show (1961). Photograph: CBS Photo Archive/CBS/Getty Images

Or to put it more bluntly, Van Dyke may have been mainstream but he never once felt conservative, nor even cosy, exactly. He brought too much energy to the room. It was as though he’d just blown in from outside and wasn’t entirely housetrained. The Dick Van Dyke Show – an otherwise standard 60s family sitcom – is notable for the crackling sexual chemistry and sparring mutual respect which Van Dyke cooked up with his co-star, Mary Tyler Moore.

Caractacus Potts, for his part, is the ultimate rackety dad: loving and exciting and liable to forget every birthday and dentist appointment. And then there is Bert, the sweep from Mary Poppins who trips across London’s rooftops like an urbanised Puck of Pook’s Hill. The evidence suggests that Bert isn’t cockney at all. He’s a spooky nature spirit, antic and mercurial, who is gamely attempting to pass himself off as a local.

Campaigning for Bernie Sanders, 2020. Photograph: Étienne Laurent/EPA

Van Dyke is 100 and therefore no longer looks like Peter Pan. He looks, if anything, the platonic ideal of old age, with laughter lines and a thick white beard, the weathered embodiment of a life well lived. In his later years, he has grown used to people asking him for health advice, to the point where he even sat down and listed it all in a book ( 100 Rules for Living to 100 ).

The man is too self-aware to present himself as a paragon of good living. Instead he credits his longevity to a sprinkle of everyday magic – a combination of good genes, solid friendships and a positive mental outlook. “My life has been a magnificent indulgence,” he says. “I’ve been able to do what I love and share it with the world.”

It’s an arrangement that has sustained him for a full century on the planet. It’s fuelled a career so rewarding and fun that it barely felt like work at all. Van Dyke started out as showbusiness’s gawky gatecrasher, a controlled explosion of elastic limbs and rubber-faced double-takes, before maturing by degrees into Hollywood’s twinkling Father Time. He is ancient but evergreen, feted and cherished. And he’s altogether as lucky as lucky can be.

Former Apple, Google designer: "Are we stuck with the same Desktop UX forever?" [video]

Hacker News
www.youtube.com
2025-12-13 18:39:53
Comments...

Fast, Memory-Efficient Hash Table in Java: Borrowing the Best Ideas

Hacker News
bluuewhale.github.io
2025-12-13 17:41:32
Comments...
Original Article

One day, I ran into SwissTable—the kind of design that makes you squint, grin, and immediately regret every naive linear-probing table you’ve ever shipped.

This post is the story of how I tried to bring that same “why is this so fast?” feeling into Java. It’s part deep dive, part engineering diary, and part cautionary tale about performance work.

1) The SwissTable project, explained the way it feels when you first understand it

SwissTable is an open-addressing hash table design that came out of Google’s work and was famously presented as a new C++ hash table approach (and later shipped in Abseil).

At a high level, it still does the usual hash-table thing: compute hash(key) , pick a starting slot, and probe until you find your key or an empty slot.

The twist is that SwissTable separates metadata (tiny “control bytes”) from the actual key/value storage, and it uses those control bytes to avoid expensive key comparisons most of the time. Instead of immediately touching a bunch of keys (which are cold in cache and often pointer-heavy), it first scans a compact block of control bytes that is dense, cache-friendly, and easy to compare in bulk.

To make probing cheap, SwissTable effectively splits the hash into two parts: h1 and h2 . Think of h1 as the part that chooses where to start probing (which group to look at first), and h2 as a tiny fingerprint stored in the control bytes to quickly rule slots in or out. It’s not a full hash—just enough bits to filter candidates before we pay the cost of touching real keys.

So on lookup, you compute a hash, derive (h1, h2) , jump to the group from h1 , and compare h2 against all control bytes in that group before you even look at any keys. That means most misses (and many hits) avoid touching key memory entirely until the metadata says “there’s a plausible candidate here.”

Because probing stays cheap, SwissTable can tolerate higher load factors—up to about 87.5% (7/8) in implementations like Abseil’s flat_hash_map —without falling off a performance cliff, which directly improves memory efficiency.
The net effect is a design that is simultaneously faster (fewer cache misses, fewer key compares) and tighter (higher load factor, fewer side structures like overflow buckets).

2) Watching SwissTable become the “default vibe” in multiple languages (Go, Rust)

The first sign you’re looking at a generational design is when it stops being a cool library trick and starts showing up in standard libraries.

Starting in Rust 1.36.0, std::collections::HashMap switched to the SwissTable-based hashbrown implementation. It’s described as using quadratic probing and SIMD lookup , which is basically SwissTable territory in spirit and technique. That was my “okay, this isn’t niche” moment.

Then Go joined the party: Go 1.24 ships a new built-in map implementation based on the Swiss Table design, straight from the Go team’s own blog post . In their microbenchmarks, map operations are reported to be up to 60% faster than Go 1.23, and in full application benchmarks they saw about a 1.5% geometric-mean CPU time improvement . And if you want a very practical “this matters in real systems” story, Datadog wrote about Go 1.24’s SwissTable-based maps and how the new layout and growth strategy can translate into serious memory improvements at scale.

At that point, SwissTable stopped feeling like “a clever C++ trick” and started feeling like the modern baseline . I couldn’t shake the thought: Rust did it, Go shipped it… so why not Java? And with modern CPUs, a strong JIT, and the Vector API finally within reach, it felt less like a technical impossibility and more like an itch I had to scratch.

That’s how I fell into the rabbit hole.

3) SwissTable’s secret sauce meets the Java Vector API

A big part of SwissTable’s speed comes from doing comparisons wide : checking many control bytes in one go instead of looping byte-by-byte and branching constantly. That’s exactly the kind of workload SIMD is great at: load a small block, compare against a broadcasted value, get a bitmask of matches, and only then branch into “slow path” key comparisons. In other words, SwissTable is not just “open addressing done well”—it’s “open addressing shaped to fit modern CPUs.”

Historically, doing this portably in Java was awkward: you either trusted auto-vectorization, used Unsafe , wrote JNI, or accepted the scalar loop. But the Vector API has been incubating specifically to let Java express vector computations that reliably compile down to good SIMD instructions on supported CPUs.

In Java 25, the Vector API is still incubating and lives in jdk.incubator.vector. The important part for me wasn’t “is it final?”—it was “is it usable enough to express the SwissTable control-byte scan cleanly?” Because if I can write “compare 16 bytes, produce a mask, act on set bits” in plain Java, the rest of SwissTable becomes mostly careful data layout and resizing logic. And once you see the control-byte scan as the hot path, you start designing everything else to make that scan cheap and predictable.

So yes: the Vector API was the permission slip I needed to try something I’d normally dismiss as “too low-level for Java.”
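
To make that concrete, here is a minimal, self-contained sketch of the core operation the Vector API unlocks: load 16 control bytes, compare them against a broadcast fingerprint, and get back a bitmask of matching lanes. This is not code from SwissMap itself; the species choice, the EMPTY value, and the class name are all illustrative, and it needs --add-modules jdk.incubator.vector to compile and run.

import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;

// Minimal sketch of a SwissTable-style control-byte scan with the incubating Vector API.
// Run with: java --add-modules jdk.incubator.vector CtrlScanSketch.java
final class CtrlScanSketch {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_128; // 16 byte lanes

    // Returns a bitmask with bit i set when ctrl[base + i] == h2.
    static long matchMask(byte[] ctrl, int base, byte h2) {
        ByteVector group = ByteVector.fromArray(SPECIES, ctrl, base); // one wide load
        return group.eq(h2).toLong();                                 // lane-wise compare -> mask
    }

    public static void main(String[] args) {
        byte[] ctrl = new byte[16];
        java.util.Arrays.fill(ctrl, (byte) 0x80);                     // pretend 0x80 marks EMPTY
        ctrl[3] = 0x2A;
        ctrl[9] = 0x2A;
        // prints 1000001000: candidate slots 3 and 9, and no key was touched yet
        System.out.println(Long.toBinaryString(matchMask(ctrl, 0, (byte) 0x2A)));
    }
}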

4) Building the first SwissMap prototype

I began with the core SwissTable separation: a compact control array plus separate key/value storage. The control bytes are the main character—if those stay hot in cache and the scan stays branch-light, the table feels fast even before micro-optimizations.

I used the familiar h1/h2 split idea: h1 selects the initial group, while h2 is the small fingerprint stored in the control byte to filter candidates. Lookup became a two-stage pipeline: (1) vector-scan the control bytes for h2 matches, (2) for each match, compare the actual key to confirm. Insertion reused the same scan, but with an extra “find first empty slot” path once we know the key doesn’t already exist.

Where Java started pushing back was layout realism .

In C++ you can pack keys/values tightly; in Java, object references mean the “key array” is still an array of pointers, and touching keys can still be a cache-miss parade. So the design goal became: touch keys as late as possible , and when you must touch them, touch as few as possible—again, the SwissTable worldview.

Deletion required tombstones (a “deleted but not empty” marker) so probing doesn’t break, but tombstones also accumulate and can quietly degrade performance if you never clean them up.

Resizing was its own mini-project: doing a full rehash is expensive, but clever growth strategies (like Go’s use of table splitting/extendible hashing) show how far you can take this if you’re willing to complicate the design.

I also had to treat the Vector API as an optimization tool, not a magic wand. Vector code is sensitive to how you load bytes, how you handle tails, and whether the JIT can keep the loop structure stable. I ended up writing the control-byte scan as a very explicit “load → compare → mask → iterate matches” loop.

At this stage, the prototype already worked , but it wasn’t yet “SwissTable fast”—it was “promising, and now the real work begins.”

5) The pieces of SwissMap that actually mattered

Here’s what survived the usual round of “this feels clever but isn’t fast” refactors.

Control bytes & layout

With non-primitive keys, the real cost is rarely “a few extra byte reads” — it’s pointer chasing. Even one equals() can walk cold objects and pay cache-miss latency. So SwissMap treats the ctrl array as the first line of defense: scan a tight, cache-friendly byte array to narrow the search to a handful of plausible slots before touching any keys/values.

This matters even more in Java because “keys/values” usually means arrays of references. On 64-bit JVMs, compressed oops (often enabled up to ~32GB depending on alignment/JVM flags) packs references into 32 bits, making the reference arrays denser. When compressed oops is off, references widen to 64 bits and the same number of key touches can spill across more cache lines.

Either way, the ctrl array does most of the work: most misses die in metadata. Compressed oops just makes the unavoidable key touches cheaper when they do happen.
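
For readers who want to picture the layout, here is a bare-bones sketch of the flat, parallel-array structure described above. Field names and sentinel values are illustrative, not SwissMap's actual internals.

// One control byte per slot, plus parallel reference arrays for keys and values.
final class LayoutSketch {
    static final byte EMPTY   = (byte) 0x80; // illustrative sentinels: high bit set, so they
    static final byte DELETED = (byte) 0xFE; // can never collide with a 7-bit h2 fingerprint

    final byte[]   ctrl;   // h2 fingerprint for FULL slots, or EMPTY/DELETED
    final Object[] keys;   // slot i's key (a reference, possibly a compressed oop)
    final Object[] values; // slot i's value

    LayoutSketch(int capacity) {           // capacity assumed to be a power of two
        ctrl   = new byte[capacity];
        keys   = new Object[capacity];
        values = new Object[capacity];
        java.util.Arrays.fill(ctrl, EMPTY);
    }
}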

Load factor

In classic open addressing, pushing load factor up usually means the average probe chain gets longer fast — more branches, more random memory touches, and a steep cliff in miss cost. That’s why many general-purpose hash maps pick conservative defaults. Java’s HashMap , for example, defaults to a 0.75 load factor to keep miss costs from ballooning as the table fills.

SwissTable flips the cost model: probing is dominated by scanning the ctrl bytes first, which are dense, cache-friendly, and cheap to compare in bulk. That means “one more probe group” is often just one more ctrl-vector load + compare, not a bunch of key equals() calls. With SwissTable-style probing, the table can run denser without falling off a cliff. Abseil’s SwissTable-family maps are well known for targeting a ~7/8 (0.875) maximum load factor; even when the table is crowded, most probes are still “just metadata work.”

That trade-off is exactly what I wanted in Java too: higher load factor → fewer slots → smaller key/value arrays → fewer cache lines touched per operation, as long as the ctrl scan stays the fast path.
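
As a tiny illustration of what that buys (the numbers and names here are mine, not SwissMap's): with a power-of-two capacity, a 7/8 maximum load factor is one shift and one subtraction, and it lets a 1,024-slot table hold 896 live entries before growing, versus 768 at the classic 0.75 default.

// Sketch: turning a 7/8 maximum load factor into a grow threshold.
final class LoadFactorSketch {
    static int growThreshold(int capacity) {
        return capacity - (capacity >>> 3); // capacity * 7/8, no floating point
    }
    public static void main(String[] args) {
        System.out.println(growThreshold(1024)); // 896 entries before a resize
    }
}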

Sentinel padding

SIMD wants fixed-width loads: 16 or 32 control bytes at a time. The annoying part is the tail — the last group near the end of the ctrl array. In native code you might “over-read” a few bytes and rely on adjacent memory being harmless. In Java you don’t get that luxury: out-of-bounds is a hard stop.

Without padding, the probe loop picks up tail handling: extra bounds checks, masked loads, or end-of-array branches — exactly the kind of bookkeeping you don’t want in the hottest path. A small sentinel padding region at the end of the array lets every probe issue the same vector load, keeping the loop predictable and JIT-friendly.
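
A minimal sketch of the idea, with an illustrative group size and EMPTY marker: allocate one extra group's worth of control bytes that always read as EMPTY, so the last real group can be loaded with the same full-width vector load as every other group.

// Sketch: ctrl array with a sentinel tail so every 16-byte load stays in bounds.
final class PaddingSketch {
    static final int  GROUP_SIZE = 16;          // matches a 128-bit vector load
    static final byte EMPTY      = (byte) 0x80;

    static byte[] newCtrlArray(int capacity) {  // capacity = number of real slots
        byte[] ctrl = new byte[capacity + GROUP_SIZE]; // the tail never stores an entry
        java.util.Arrays.fill(ctrl, EMPTY);            // padding reads as EMPTY, so probes stop safely
        return ctrl;
    }
}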

H1/H2 split

Split the hash into h1 (which selects the starting group) and h2 (a small fingerprint stored per slot in the ctrl byte). h1 drives the probe sequence (usually via a power-of-two mask), while h2 is a cheap first-stage filter: SIMD-compare h2 against an entire group of control bytes and only touch keys for the matching lanes.

SwissMap uses 7 bits for h2 , leaving the remaining ctrl-byte values for special states like EMPTY and DELETED . That’s the neat trick: one byte answers both:

  • “Is this slot full/empty/deleted?”
  • “Is this slot even worth a key compare?”

Most lookups reject non-matches in the control plane. And if a probed group contains an EMPTY , that’s a definitive stop signal: the probe chain was never continued past that point, so the key can’t exist “later.”
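
Here is one way the split can look in code; this is a sketch, and SwissMap's exact bit layout may differ.

// Sketch of an h1/h2 split: 7 low bits become the fingerprint, the rest pick the group.
final class HashSplitSketch {
    static int h1(int h) {
        return h >>> 7;            // drives the probe sequence (masked by the table size)
    }
    static byte h2(int h) {
        return (byte) (h & 0x7F);  // 0..127, leaving the negative byte range for EMPTY/DELETED
    }
    public static void main(String[] args) {
        int h = "example-key".hashCode();
        System.out.println("h1=" + h1(h) + " h2=" + h2(h));
    }
}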

Reusing the loaded control vector

A ByteVector load isn’t free — it’s a real SIMD-width memory load of control bytes. On my test box, that load alone was ~6ns per probed group. In a hash table where a get() might be only a few dozen nanoseconds, that’s a meaningful tax.

So SwissMap tries hard to load the ctrl vector exactly once per group and reuse it:

  • use the same loaded vector for the h2 equality mask (candidate lanes)
  • and again for the EMPTY / DELETED masks (stop/continue decisions)

No extra passes, no “just load it again,” no duplicate work.

Tombstones

Deletion in open addressing is where correctness bites: if you mark a removed slot as EMPTY , you can break the probe chain. Keys inserted later in that chain would become “invisible” because lookups stop at the first empty.

Tombstones solve that by marking the slot as DELETED (“deleted but not empty”), so lookups keep probing past it.
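
A scalar sketch of what deletion looks like under this scheme; array names and sentinel values are illustrative, and the index is assumed to come from a lookup like the one shown later.

// Sketch: remove marks the slot DELETED instead of EMPTY so probe chains stay intact.
final class RemoveSketch {
    static final byte DELETED = (byte) 0xFE;

    static Object removeAt(byte[] ctrl, Object[] keys, Object[] values, int idx) {
        Object old = values[idx];
        ctrl[idx]   = DELETED; // later lookups keep probing past this slot
        keys[idx]   = null;    // clear references so the GC can reclaim them
        values[idx] = null;
        return old;
    }
}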

Reusing tombstones on put

On put , tombstones are not just a correctness hack — they’re also reusable space. The common pattern is:

  • during probing, remember the first DELETED slot you see
  • keep probing until you either find the key (update) or hit an EMPTY (definitive miss)
  • if the key wasn’t found, insert into the remembered tombstone rather than the later empty slot

That tends to keep probe chains from getting longer over time, reduces resize pressure, and prevents workloads with lots of remove/put cycles from slowly poisoning performance.
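
Here is a deliberately scalar sketch of that control flow. The real probe scans whole groups with the vectorized loop shown later, and resize and full-table guards are omitted; names and sentinel values are illustrative.

// Sketch: put() that reuses the first tombstone it saw once the miss is confirmed.
final class PutSketch {
    static final byte EMPTY = (byte) 0x80, DELETED = (byte) 0xFE;

    byte[] ctrl; Object[] keys; Object[] values; int mask; // mask = capacity - 1

    PutSketch(int capacity) {                              // capacity: power of two
        ctrl = new byte[capacity];
        java.util.Arrays.fill(ctrl, EMPTY);
        keys = new Object[capacity];
        values = new Object[capacity];
        mask = capacity - 1;
    }

    Object put(Object key, Object value, int hash) {
        byte h2 = (byte) (hash & 0x7F);
        int idx = (hash >>> 7) & mask;
        int firstTombstone = -1;
        for (;;) {
            byte c = ctrl[idx];
            if (c == h2 && java.util.Objects.equals(keys[idx], key)) {
                Object old = values[idx];              // key already present: update in place
                values[idx] = value;
                return old;
            }
            if (c == DELETED && firstTombstone < 0) {
                firstTombstone = idx;                  // remember the first reusable slot
            }
            if (c == EMPTY) {                          // definitive miss: key is absent
                int target = firstTombstone >= 0 ? firstTombstone : idx;
                ctrl[target]   = h2;
                keys[target]   = key;
                values[target] = value;
                return null;
            }
            idx = (idx + 1) & mask;                    // linear step; group scan omitted for clarity
        }
    }
}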

When tombstones force a same-capacity rehash

Tombstones preserve correctness, but they dilute the strongest early-exit signal: EMPTY . A table with lots of DELETED tends to probe farther on misses and inserts — more ctrl scans, more vector loads, and more chances to touch cold keys.

So SwissMap tracks tombstones and triggers a same-capacity rehash when they cross a threshold (as a fraction of capacity or relative to live entries). This rebuilds the ctrl array, turning DELETED back into EMPTY and restoring short probe chains — basically compaction without changing logical contents.

A resize rehash without redundant checks

Resizing forces a rehash because h1 depends on capacity. The obvious approach is “iterate old entries and call put into the new table,” but that pays for work you don’t need: duplicate checks, extra branching, and unnecessary key equality calls.

The faster path treats resize as a pure move :

  • allocate fresh ctrl/key/value arrays
  • reset counters
  • scan the old table once, and for each FULL slot:
    • recompute (h1, h2) for the new capacity
    • insert into the first available slot found by the same ctrl-byte probe loop
    • without checking “does this key already exist?” (it can’t: you’re moving unique keys)

This makes resizing a predictable linear pass over memory rather than a branchy series of full put() operations.
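
A scalar sketch of that "pure move" (again with illustrative names, a simple linear probe instead of the group scan, and no padding handling):

// Sketch: resize moves every FULL slot into fresh arrays without any equals() calls.
final class ResizeSketch {
    static final byte EMPTY = (byte) 0x80;

    byte[] ctrl; Object[] keys; Object[] values; int mask;

    ResizeSketch(int capacity) {                          // capacity: power of two
        ctrl = new byte[capacity];
        java.util.Arrays.fill(ctrl, EMPTY);
        keys = new Object[capacity];
        values = new Object[capacity];
        mask = capacity - 1;
    }

    void resize(int newCapacity) {                        // newCapacity: power of two
        byte[] oldCtrl = ctrl; Object[] oldKeys = keys; Object[] oldValues = values;

        ctrl   = new byte[newCapacity];
        java.util.Arrays.fill(ctrl, EMPTY);
        keys   = new Object[newCapacity];
        values = new Object[newCapacity];
        mask   = newCapacity - 1;

        for (int i = 0; i < oldCtrl.length; i++) {
            if (oldCtrl[i] < 0) continue;                 // skip EMPTY and DELETED (high bit set)
            int h = oldKeys[i].hashCode();                // real code would reuse its own hash mixer
            int idx = (h >>> 7) & mask;
            while (ctrl[idx] >= 0) idx = (idx + 1) & mask; // first free slot wins, no key checks
            ctrl[idx]   = (byte) (h & 0x7F);
            keys[idx]   = oldKeys[i];
            values[idx] = oldValues[i];
        }
    }
}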

Iteration is another place where “simple” becomes surprisingly expensive. You can scan linearly and yield FULL slots, but many designs want a stable-ish visit pattern without allocating a separate dense list. And some reinsertion/rehash interactions can even go accidentally quadratic (see the Rust iteration write-up ).

SwissMap avoids extra buffers by iterating with a modular stepping permutation: pick a start and an odd step (with power-of-two capacity, any odd step is coprime), then visit indices via repeated idx = (idx + step) & mask . This hits every slot exactly once, spreads accesses across the table, and keeps iteration as a tight loop over the same ctrl-byte state machine used elsewhere.
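
A sketch of that stepping pattern; the FULL test assumes the 7-bit fingerprint convention from the earlier sketches, so FULL control bytes are the non-negative ones.

// Sketch: visit every slot exactly once using an odd step over a power-of-two capacity.
final class IterationSketch {
    static void forEachFullSlot(byte[] ctrl, int capacity, int start, int step,
                                java.util.function.IntConsumer action) {
        int mask = capacity - 1;                // capacity assumed to be a power of two
        int idx  = start & mask;
        step |= 1;                              // any odd step is coprime with the capacity
        for (int visited = 0; visited < capacity; visited++) {
            if (ctrl[idx] >= 0) {               // FULL slots carry a non-negative h2 byte
                action.accept(idx);
            }
            idx = (idx + step) & mask;
        }
    }
}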

Here’s a trimmed version of the lookup to show how the probe loop hangs together around the control bytes:

protected int findIndex(Object key) {
    if (size == 0) return -1;
    int h = hash(key);
    int h1 = h1(h);
    byte h2 = h2(h);
    int nGroups = numGroups();
    int visitedGroups = 0;
    int mask = nGroups - 1;
    int g = h1 & mask; // optimized modulo operation (same as h1 % nGroups)
    for (;;) {
        int base = g * DEFAULT_GROUP_SIZE;
        ByteVector v = loadCtrlVector(base);
        long eqMask = v.eq(h2).toLong();
        while (eqMask != 0) {
            int bit = Long.numberOfTrailingZeros(eqMask);
            int idx = base + bit;
            if (Objects.equals(keys[idx], key)) { // found
                return idx;
            }
            eqMask &= eqMask - 1; // clear LSB
        }
        long emptyMask = v.eq(EMPTY).toLong(); // reuse loaded vector
        if (emptyMask != 0) { // any empty in a probed group is a definitive miss in SwissTable-style probing
            return -1;
        }
        if (++visitedGroups >= nGroups) { // guard against infinite probe when table is full of tombstones
            return -1;
        }
        g = (g + 1) & mask;
    }
}

6) Benchmarks

I didn’t want to cherry-pick numbers that only look good on synthetic cases.
So I used a simple, repeatable JMH setup that stresses high-load probing and pointer-heavy keys—the exact situation SwissTable-style designs are meant to handle.

All benchmarks were run on Windows 11 (x64) with Eclipse Temurin JDK 21.0.9, on an AMD Ryzen 5 5600 (6C/12T).

For context, I compared against HashMap , fastutil’s Object2ObjectOpenHashMap , and Eclipse Collections’ UnifiedMap .

The headline result

At high load factors SwissMap keeps competitive throughput against other open-addressing tables and stays close to JDK HashMap performance. In some benchmarks it also comes out ahead by a large margin.

[Benchmark charts: throughput and CPU time for put (hit/miss) and get (hit/miss)]

On memory, the flat layout (no buckets/overflow nodes) plus a 0.875 (7/8) max load factor translated to a noticeably smaller retained heap in small-payload scenarios—over 50% less than HashMap in this project’s measurements.

[Chart: memory footprint comparison]

Caveat

These numbers are pre-release; the Vector API is still incubating, and the table is tuned for high-load, reference-key workloads. Expect different results with primitive-specialized maps or low-load-factor configurations.

Quick heads-up

You might notice a SWAR-style SwissTable variant appearing out of nowhere in the benchmark section. That version is part of a follow-up round of tuning: same SwissTable control-byte workflow, but implemented with SWAR to reduce overhead and avoid leaning on the incubating Vector API.

If SWAR is new to you, think of it as “SIMD within a register”: instead of using vector lanes, we pack multiple control bytes into a single 64-bit word and do the same kind of byte-wise comparisons with plain scalar instructions. The end result is a similar fast-path idea, just expressed in a more portable (and JDK-version-friendly) way.
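
If you want a taste before that post, here is the textbook version of the trick (a generic sketch, not the HashSmith code): broadcast the fingerprint, XOR it against eight packed control bytes so matches become zero, then apply the classic zero-byte detector. Borrow propagation can flag a rare extra byte right after a true match, which is harmless here because every candidate is confirmed with a real key comparison anyway.

// Sketch: SWAR control-byte matching, eight bytes per 64-bit word.
final class SwarSketch {
    private static final long LSB = 0x0101010101010101L; // low bit of every byte
    private static final long MSB = 0x8080808080808080L; // high bit of every byte

    // High bit of byte i is set in the result when group byte i equals h2.
    static long matchByte(long group, byte h2) {
        long x = group ^ (LSB * (h2 & 0xFFL)); // matching bytes become 0x00
        return (x - LSB) & ~x & MSB;           // classic zero-byte detector
    }

    public static void main(String[] args) {
        byte[] ctrl = {0x11, 0x2A, 0x33, 0x2A, 0x55, 0x66, 0x77, 0x01};
        long group = java.nio.ByteBuffer.wrap(ctrl)
                .order(java.nio.ByteOrder.LITTLE_ENDIAN).getLong();
        // bytes 1 and 3 hold 0x2A, so the mask sets the high bits of those bytes
        System.out.println(Long.toHexString(matchByte(group, (byte) 0x2A))); // 80008000
    }
}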

I didn’t want this post to turn into three posts, so I’m saving the full “how/why” (SWAR included) for the next write-up — stay tuned.

P.S. If you want the code

This post is basically the narrative version of an experiment I’m building in public: HashSmith , a small collection of fast, memory-efficient hash tables for the JVM.

It includes SwissMap (SwissTable-inspired, SIMD-assisted probing via the incubating Vector API), plus a SwissSet variant to compare trade-offs side by side.

It’s explicitly experimental (not production-ready), but it comes with JMH benchmarks and docs so you can reproduce the numbers and poke at the implementation details.

If you want to run the benchmarks, sanity-check an edge case, or suggest a better probe/rehash strategy, I’d love issues/PRs.

P.P.S. A great talk to watch

If you want the “straight from the source” version, Matt Kulukundis and other Google engineers’ CppCon talk on SwissTable is genuinely excellent — clear, practical, and packed with the kind of details that make the design click: CppCon talk .

Analysis finds anytime electricity from solar available as battery costs plummet

Hacker News
pv-magazine-usa.com
2025-12-13 17:32:21
Comments...
Original Article

Ember’s report outlines how falling battery capital expenditures and improved performance metrics have lowered the levelized cost of storage, making dispatchable solar a competitive, anytime electricity option globally.

A report from energy think tank Ember details how cost reductions in battery storage technology are enabling dispatchable solar power to compete with conventional power sources.

Ember’s assessment draws on data from recent auctions across international markets, including Italy, Saudi Arabia, and India, supplemented by expert interviews conducted in October 2025.

This research indicates the industry is moving into a new environment where scaling of manufacturing capacity and competition have pushed costs down. The cost of core BESS equipment fell by 40% in 2024 compared with 2023, according to BloombergNEF’s global benchmark, reaching a record low of $165 per kWh.

Ember’s October 2025 data indicates that a further large fall is on track for 2025. Over the last 10 years, installed costs have fallen by 20% per year on average, while deployment has increased by around 80% per year.

According to the findings, the all-in capital expenditure for building a large, long-duration utility-scale battery energy storage system project in global markets outside of the U.S. and China is now approximately $125 per kWh. This figure reflects project pricing, comprising $75 per kWh for core equipment sourced from China, including battery enclosures, the power conversion system (PCS), and the energy management system (EMS), plus $50 per kWh for local installation, engineering, and grid connection activities.

These capital costs translate into a levelized cost of storage (LCOS) of $65 per MWh. This LCOS reflects the cost of shifting electricity to a different time period.

Image: Ember

Ember said this reduction in LCOS is driven by equipment costs and improved performance metrics, such as a 20-year design life for LFP technology, 90% efficiency, and lower project financing costs due to de-risked revenue models. Longer lifetimes, higher efficiency, and lower project risks reduce the LCOS by 35% even before accounting for the falling equipment prices, said the report.

The core implication of this analysis is the economic viability of dispatchable solar.

Storing 50% of a day’s solar output to meet night-time demand adds $33 per MWh to the generation cost. Using the 2024 global average solar price of $43 per MWh, the total cost for dispatchable solar is calculated at $76 per MWh.
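
The arithmetic behind that total is easy to reconstruct from the report's own figures (this is my reading of the numbers, not a formula quoted from Ember): shifting half of each generated MWh through storage at the $65 per MWh LCOS spreads roughly $33 across every MWh produced, which then stacks on the average solar price.

\[
0.5 \times \$65/\mathrm{MWh} \approx \$33/\mathrm{MWh},
\qquad
\$43/\mathrm{MWh} + \$33/\mathrm{MWh} = \$76/\mathrm{MWh}
\]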

This positions dispatchable solar as a cost-effective alternative to new gas power plants, particularly in regions reliant on imported LNG.

For the U.S., core equipment costs can reach $100 per kWh or higher in markets with higher tariffs or stricter standards. With the U.S. market facing various tariffs on Chinese imports and domestic content requirements via the Inflation Reduction Act (IRA), the $125 per kWh BESS project capex estimate is not directly applicable to the U.S. market.

The total equipment costs are 10% to 15% cheaper for four-hour projects, a key project duration in the U.S., as some components are sized to power rather than energy. However, even with cost variations, the U.S. is the second biggest BESS market globally, behind China, and saw record growth in Q1 2025 across all segments.

In 2024, Texas, California, Arizona, and Nevada all saw significant utility-scale battery growth, with the U.S. adding 10 GW of utility-scale batteries nationally, an 80% increase over the previous year. This growth is accelerating the integration of solar into the grid.

Ember’s conclusion is that solar has evolved beyond daytime electricity; coupled with storage, it becomes dispatchable, anytime power, positioned to meet a substantial portion of the world’s future energy needs.

This content is protected by copyright and may not be reused. If you want to cooperate with us and would like to reuse some of our content, please contact: editors@pv-magazine.com .


Using nvi as a Minimal and Fast Text Editor

Lobsters
git.sr.ht
2025-12-13 17:28:35
Comments...
Original Article
Using nvi as a Minimal and Fast Text Editor
===========================================

Overview:
---------
This document introduces the nvi text editor, a clean and efficient
implementation of the original Berkeley vi written by Keith Bostic. nvi is
designed for speed, low memory usage, and predictable behavior. It is
especially suited for system administration, SSH sessions, editing
configuration files, and handling very large text files such as logs or JSON
dumps on Unix-like systems including Slackware. Unlike modern editors, nvi
avoids complexity (plugins, scripting, syntax highlighting) and focuses on
reliability and efficiency.

1. Why Use nvi?
---------------
nvi provides the traditional vi experience but with several practical
advantages:

- extremely low memory usage
- fast access to distant lines
- excellent performance on multi-gigabyte files
- simple configuration and predictable commands
- small codebase aligned with the Unix philosophy

These characteristics make nvi ideal for minimal environments or users who
prefer a distraction-free workflow.

2. Internal Architecture
------------------------
nvi embeds its own modified, partial implementation of Berkeley DB 1.85. This
embedded database layer is used exclusively for storing and managing the text
being edited. It is not a full Berkeley DB engine.

Key components:

* Partial Berkeley DB: The embedded DB code includes only the subsystems
  required by vi: recno, btree, hash, and mpool. Features such as
  transactions, multi-user locking, logging, or general database APIs are not
  present.

* Recno (Record Number Interface): Each line of the file is stored as a
  numbered record. This provides fast random access to line N without
  scanning the file sequentially, even in files containing millions of lines.

* B-Tree: Used for indexing and efficient lookup. The implementation inside
  nvi is trimmed down specifically for the editor's needs.

* Mpool (memory pool): Manages cached database pages in memory. Only the
  necessary pages are loaded, allowing nvi to operate smoothly without
  loading the entire file into RAM.

* Not stored as .db files: Although based on Berkeley DB, the internal
  structures are kept purely in memory through mpool. nvi does not write
  these internal DB pages to disk as a persistent database.

* Recovery files: Crash recovery is handled separately. nvi stores recovery
  metadata in:

      /var/tmp/vi.recover/

  Files named recover. and vi. allow interrupted editing sessions to be
  restored with:

      nvi -r

  These recovery files are independent from the in-memory recno/mpool
  structures used for normal editing.

This architecture explains nvi's ability to handle extremely large files with
low memory usage.

3. Minimal Configuration (.exrc)
--------------------------------
nvi uses a simple configuration file named ".exrc". A minimal setup might be:

    set showmode
    set showmatch
    set ruler
    set shiftwidth=2
    set tabstop=2

No plugins, runtime files, or external tools are required.

4. Working Without Syntax Highlighting
--------------------------------------
nvi does not include syntax highlighting. This design choice promotes:

- reduced visual noise
- clearer focus on logic and structure
- consistent display across all terminals
- improved readability in minimal environments

For scripting, logs, configuration files, and remote administration,
highlighting is optional rather than necessary.

5. Undo, Marks, and Window Splits
---------------------------------
Undo and redo:

    u    undo last change
    .    repeat undo
    uu   redo

Marks offer precise navigation:

    mx   set mark x
    'x   jump to line of mark x
    `x   jump to exact cursor position

Window splits:

    :vs  vertical split
    :E   horizontal split

6. Recommended Usage with tmux
------------------------------
Since nvi does not implement tabs, tmux is the ideal companion:

- multiple panes
- persistent sessions
- fullscreen focus
- fast switching between tasks

This combination keeps the editor itself minimal while still supporting
multi-file workflows.

7. nvi on Slackware
-------------------
nvi was added to Slackware -current on:

    Mon Jan 13 00:11:55 UTC 2020

This was the first appearance of nvi in the distribution, and it later became
part of the Slackware 15.0 stable release (February 2022). Slackware 14.2 and
earlier shipped with Elvis as /usr/bin/vi.

On the same day nvi was added, Elvis was rebuilt to remove its /usr/bin/vi and
/usr/bin/ex symlinks. From that point onward, nvi provides those symlinks
**only if** they are not already supplied by another editor. This preserves
Slackware's long-standing policy that users should be able to choose their
preferred vi implementation.

Reasons for adoption:

- nvi provides UTF-8 support (Elvis lacked this capability)
- cleaner and more predictable behavior in modern terminals
- smaller and more maintainable codebase
- strong alignment with Slackware's minimal and traditional design

Slackware applies a number of maintenance patches to nvi, addressing:

- wide-character and UTF-8 behavior
- build system portability
- memory safety in recno/mpool
- recovery stability
- accumulated bug fixes from BSD and Debian

These patches modernize nvi without altering its classic vi behavior. Users
may adjust the default vi/ex implementation using:

    pkgtool -> Setup -> vi-ex

8. Essential Commands
---------------------
A concise nvi quick reference is available at:

    https://4c6e.xyz/nvi.html

9. When nvi Is a Good Choice
----------------------------
nvi is ideal for:

- servers and SSH environments
- editing configuration files in /etc
- minimal or low-resource systems
- large log files or JSON datasets
- users who prefer classic vi behavior
- workflows requiring speed and simplicity

10. Conclusion
--------------
nvi follows the original Unix philosophy: simple, fast, and reliable. Its
embedded Berkeley DB layer enables excellent performance with large files,
while its minimal design keeps editing predictable and distraction-free. For
Slackware and other Unix users seeking a lightweight, efficient editor, nvi
remains an outstanding choice.

Appendix: Minimal Comparison (nvi vs vim vs elvis)
--------------------------------------------------
nvi:
- fastest and lightest
- best for huge files (logs/JSON)
- classic vi behavior
- ideal for servers and SSH
- no syntax, no plugins, no extras

vim:
- most features and plugins
- best for programming and IDE-like workflow
- syntax highlighting, scripting, LSP
- heavier and slower with large files

elvis:
- small, portable vi clone
- unique display modes
- good for rescue/embedded systems
- limited UTF-8 support

Upstream and historical references
----------------------------------
nvi originates from the Berkeley vi developed by Keith Bostic. Historical
documentation and background can be found on the Berkeley vi home page
maintained by the original author:

    https://sites.google.com/a/bostic.com/keithbostic/the-berkeley-vi-editor-home-page

------------------------------------------------------------------
Last Modified: 2025-12-13 11:37:32 UTC

Editors should have an opt-in for less assistance (2024)

Lobsters
www.da.vidbuchanan.co.uk
2025-12-13 17:19:16
Comments...
Original Article

By David Buchanan, 2nd January 2024

This is a rant. You have been warned!

I'd like text editors to be worse. Specifically, I'd like their default behaviour to be as close as possible to the median text input box you'd find in any piece of software, like the humble HTML <textarea>. More realistically, I'd like a configuration preset that lets me opt in to the same, without having to hunt for a thousand individual setting tweaks. This opt-in should apply as globally as possible, perhaps as an environment variable.

My rationale is simple: I hate context switching. I want my input sequences to always work, no matter what software I'm using. Trying to apply deeply ingrained muscle memory in the wrong context and having it not work can be extremely frustrating.

As you can maybe guess from that, modal text editors like vim don't work for me. Consider them to be outside the scope of this rant—I only care about text editors (and more broadly, text inputs) that are superficially "normal."

Let's take a concrete example: typing a quoted string. My go-to input sequence is to type "" [left], i.e., typing two quotes at once and then moving the cursor back into the middle of the two, ready to type the string itself: "|". That's three keystrokes, but in my brain it's a single action, and it works just about anywhere I might want to use it.

That is, everywhere except editors that try to be too smart. There are two common behaviours, both annoying in their own way:

  • Pressing " once produces "|", and finishing the input sequence results in "|"".

  • Pressing " once produces "|", pressing it again produces ""|, and pressing left results in "|".

The first behaviour is obviously undesirable because it produced an unexpected result.

The second one produced the correct result, but I hate it even more! I hate it because it added a whole extra layer of "too smart" behaviour to try to counteract the first. That is, pressing " with the cursor already in front of a " character will not insert an additional ", but merely move the cursor to the other side of it. This works out alright in this particular scenario, but it's extremely confusing in any other situation where you happen to want to type a " in front of an existing one. This might sound far-fetched, but it's the default behaviour of Firefox's dev console, and it's something I stumble over regularly.

I'm sure there's a setting I could tweak for this, but my problem is not Firefox; my problem is that these annoying behaviours are everywhere , and not just in relation to quoted strings.

When there's a problem "everywhere," it might seem like the best approach is to suck it up and get used to it. But I can't, because the problem isn't a specific behaviour; the problem is the divergence of behaviours. I can't possibly get used to all of them!

Defining the Problem

I'm going to name this annoyance "auto-input". Anything that inputs characters I didn't type for myself (or otherwise opt in to) is on my naughty list. This does not include input suggestions. For example, when typing a symbol name, an editor might offer a drop-down list of suggested completions that I can opt in to using the tab key. That's great because if I ignore the suggestion and continue typing obliviously, everything still works.

Text editors are encouraged to be smart, but that smartness shouldn't degrade the basics.

Exceptions

The one exception to my auto-input hatred is auto-indentation. When I press enter, I expect the new line to be pre-filled with the same leading whitespace as the line above. Anything that tries to be smarter than that, especially those that try to be language-aware, will likely get on my nerves too. And, if the indentation is spaces, pressing backspace should delete exactly one of them at a time. This is a matter of personal preference, but so is everything else in this article.

Solutions

I'm too lazy to do any serious work towards a solution here, beyond writing this rant, but I would like to float the idea of a NO_AUTOINPUT environment variable. If the variable is present on first-run of an application, it should set the appropriate settings defaults to minimise auto-input behaviours. After that, the settings can be tweaked per user preference.

For the sake of homogeneity, auto-indentation should be disabled with NO_AUTOINPUT too. I'll accept the collateral damage of having to re-enable one setting, in the cases where I need it.
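
Nothing implements NO_AUTOINPUT today; purely as a sketch of what honouring such a variable could look like on first run, here is a small Python illustration. The setting names are invented for this example and do not correspond to any real editor.

import os

# Hypothetical sketch only: NO_AUTOINPUT is the proposal from this post,
# and the setting names below are made up for illustration.
def first_run_defaults() -> dict:
    defaults = {
        "auto_close_quotes": True,
        "auto_close_brackets": True,
        "smart_indent": True,
        "copy_indent_from_previous_line": True,
    }
    if os.environ.get("NO_AUTOINPUT"):
        # Opt out of every auto-input behaviour, auto-indentation included,
        # as argued above; individual settings can be re-enabled afterwards.
        defaults = {name: False for name in defaults}
    return defaults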

I Tried Gleam for Advent of Code, and I Get the Hype

Lobsters
blog.tymscar.com
2025-12-13 17:01:32
Comments...
Original Article

I do Advent of Code every year.

For the last seven years, including this one, I have managed to get all the stars. I do not say that to brag. I say it because it explains why I keep coming back.

My Advent of Code stars

It is one of the few tech traditions I never get bored of, even after doing it for a long time. I like the time pressure. I like the community vibe. I like that every December I can pick one language and go all in.

This year, I picked Gleam.

A much shorter year #

Advent of Code is usually 25 days. This year Eric decided to do 12 days instead.

So instead of 50 parts, it was 24.

That sounds like a relaxed year. It was not, but not in a bad way.

The easier days were harder than the easy days in past years, but they were also really engaging and fun to work through. The hard days were hard, especially the last three, but they were still the good kind of hard. They were problems I actually wanted to wrestle with.

It also changes the pacing in a funny way. In a normal year, by day 10 you have a pretty comfy toolbox. This year it felt like the puzzles were already demanding that toolbox while I was still building it.

That turned out to be a perfect setup for learning a new language.

Why Gleam felt like a good AoC language #

Gleam is easy to like quickly.

The syntax is clean. The compiler is helpful, and the error messages are super duper good. Rust good.

Most importantly, the language strongly nudges you into a style that fits Advent of Code really well. Parse some text. Transform it a few times. Fold. Repeat.

Also, pipes. Pipes everywhere. I love pipes.

One thing I did not expect was how good the editor experience would be. The LSP worked much better than I expected. It basically worked perfectly the whole time. I used the Gleam extension for IntelliJ and it was great.

https://plugins.jetbrains.com/plugin/25254-gleam-language

I also just like FP.

FP is not always easier, but it is often easier. When it clicks, you stop writing instructions and you start describing the solution.

The first Gleam superpower: echo #

The first thing I fell in love with was echo .

It is basically a print statement that does not make you earn it. You can echo any value. You do not have to format anything. You do not have to build a string. You can just drop it into a pipeline and keep going.

This is the kind of thing I mean:

list.range(0, 5)
|> echo
|> list.map(int.to_string)
|> echo

You can quickly inspect values at multiple points without breaking the flow.

I did miss string interpolation, especially early on. echo made up for a lot of that.

It mostly hit when I needed to generate text, not when I needed to inspect values. The day where I generated an LP file for glpsol is the best example. It is not hard code, but it is a lot of string building. Without interpolation it turns into a bit of a mess of <> s.

This is a small excerpt from my LP generator:

"Minimize\n"
<> "  total: "
<> buttons
|> string.join(" + ")
<> "\n\nSubject To\n"

It works. It is just the kind of code where you really feel missing interpolation.

Options everywhere, and why that matters for grid puzzles #

A lot of AoC is grids.

Grids are where you normally either crash into out of bounds bugs, or you litter your code with bounds checks you do not care about.

In my day 4 solution I used a dict as a grid. The key ergonomic part is that dict.get gives you an option-like result, which makes neighbour checking safe by default.

This is the neighbour function from my solution:

fn get_neighbours(grid: Grid(Object), pos: Position) -> List(Object) {
  [
    #(pos.0 - 1, pos.1 - 1),
    #(pos.0 - 1, pos.1),
    #(pos.0 - 1, pos.1 + 1),
    #(pos.0, pos.1 - 1),
    #(pos.0, pos.1 + 1),
    #(pos.0 + 1, pos.1 - 1),
    #(pos.0 + 1, pos.1),
    #(pos.0 + 1, pos.1 + 1),
  ]
  |> list.filter_map(fn(neighbour_pos) { grid |> dict.get(neighbour_pos) })
}

That last line is the whole point.

No bounds checks. No sentinel values. Out of bounds just disappears.

I expected to write parsers and helpers, and I did. What I did not expect was how often Gleam already had the exact list function I needed.

list.transpose saved a whole day #

Day 6 part 1 was basically a transpose problem in disguise.

I read the input, chunked it into rows, transposed it, and suddenly the rest of the puzzle became obvious.

input
|> list.transpose
|> list.map(fn(line) { line |> calculate_instruction })
|> bigi.sum

In a lot of languages you end up writing your own transpose yet again. In Gleam it is already there.

list.combination_pairs is a cheat code #

Another example is list.combination_pairs .

In day 8 I needed all pairs of 3D points. In an imperative language you would probably write nested loops and then question your off by one logic.

In Gleam it is a one liner:

boxes
|> list.combination_pairs

Sometimes FP is not about being clever. It is about having the right function name.

fold_until is my favorite thing I found #

If I had to pick one feature that made me want to keep writing Gleam after AoC, it is fold_until .

Early exit without hacks is fantastic in puzzles.

In day 8 part 2 I kept merging sets until the first set in the list contained all boxes. When that happens, I stop.

The core shape looks like this:

|> list.fold_until(initial, fn(acc, pair) {
  case done_yet {
    True -> Stop(new_acc)
    False -> Continue(new_acc)
  }
})

It is small, explicit, and it reads like intent.

I also used fold_until in day 10 part 1 to find the smallest combination size that works.

Where Gleam fought me a bit #

Even though I enjoyed Gleam a lot, I did hit a few recurring friction points.

None of these are deal breakers. They are just the kind of things you notice when you do 24 parts in a row.

File IO is not in the standard library #

This one surprised me on day 1.

For AoC you read a file every day. In this repo I used simplifile everywhere because you need something. It is fine, I just did not expect basic file IO to be outside the standard library.

Regex is a dependency too #

Day 2 part 2 pushed me into regex and I had to add gleam_regexp .

This is the style I used, building a regex from a substring:

let assert Ok(re) = regexp.from_string("^(" <> substring <> ")+$")
regexp.check(re, val)

Again, totally fine. It just surprised me.

List pattern matching limitations #

You can do [first, ..rest] and you can do [first, second] .

But you cannot do [first, ..middle, last] .

It is not the end of the world, but it would have made some parsing cleaner.

Comparisons are explicit #

In Gleam a lot of comparisons are not booleans. You get an order value.

This is great for sorting. It is also very explicit. It can be a bit verbose when you just want an <= check.

In day 5 I ended up writing patterns like this:

case cmp_start, cmp_end {
  order.Lt, _ -> False
  _, order.Gt -> False
  _, _ -> True
}

Big integers, and targeting JavaScript #

I used bigi a few times this year.

On the Erlang VM, integers are arbitrary precision, so you usually do not care about overflow. That is one of the nicest things about the BEAM.

If you want your Gleam code to also target JavaScript, you do care. JavaScript has limits, and suddenly using bigi becomes necessary for some puzzles.

I wish that was just part of Int , with a single consistent story across targets.

The most satisfying part: XOR as bitmasks #

Day 10 part 1 was my favorite part of the whole event.

The moment I saw the toggling behavior, it clicked as XOR. Represent the lights as a number. Represent each button as a bitmask. Find the smallest combination of bitmasks that XOR to the target.

This is the fold from my solution:

combination
|> list.fold(0, fn(acc, comb) {
  int.bitwise_exclusive_or(acc, comb)
})

It felt clean, it felt fast, and it felt like the representation did most of the work.

The least satisfying part: shelling out to glpsol #

Day 10 part 2 was the opposite feeling.

I knew brute force was out. It was clearly a system of linear equations.

In previous years I would reach for Z3, but there are no Z3 bindings for Gleam. I tried to stay in Gleam, and I ended up generating an LP file and shelling out to glpsol using shellout .

It worked, and honestly the LP format is beautiful.

Here is the call:

let _ =
  shellout.command(
    "glpsol",
    ["--lp", "temp.lp", "-w", "temp_sol.txt"],
    ".",
    [],
  )

It is a hack, but it is a pragmatic hack, and that is also part of Advent of Code.

Memoization keys that actually model the problem #

Day 11 part 2 is where I was happy I was writing Gleam.

The important detail was that the memo key is not just the node. It is the node plus your state.

In my case the key was:

#(neighbour, new_seen_dac, new_seen_fft)

Once I got the memo threading right, it ran instantly.

The finale, and the troll heuristic #

The last day was the only puzzle I did not fully enjoy.

Not because it was bad. It just felt like it relied on assumptions about the input, and I am one of those people that does not love doing that.

I overthought it for a bit, then I learned it was more of a troll problem. The “do the areas of the pieces, when fully interlocked, fit on the board” heuristic was enough.

In my solution it is literally this:

heuristic_area <= max_area

Sometimes you build a beautiful mental model and then the right answer is a single inequality.

Closing thoughts #

I am very happy I picked Gleam this year.

It has sharp edges, mostly around where the standard library draws the line and a few language constraints that show up in puzzle code. But it also has real strengths.

Pipelines feel good. Options and Results make unsafe problems feel safe. The list toolbox is better than I expected. fold_until is incredible. Once you stop trying to write loops and you let it be functional, the solutions start to feel clearer.

I cannot wait to try Gleam in a real project. I have been thinking about using it to write a webserver, and I am genuinely excited to give it a go.

And of course, I cannot wait for next year’s Advent of Code.

If you want to look at the source for all 12 days, it is here:

https://github.com/tymscar/Advent-Of-Code/tree/master/2025/gleam/aoc/src

EasyPost (YC S13) Is Hiring

Hacker News
www.easypost.com
2025-12-13 17:01:25
Comments...
Original Article

High-performance shipping made easy

Our team of problem solvers brings a modern approach to shipping logistics. We collaborate across departments, ask challenging questions, explore new solutions, and take accountability for our wins and mistakes.

Dear Job Seekers,

We want to ensure your safety and protect you from potential scams. Recently, there have been fraudulent recruitment initiatives online that impersonate our company. These scams aim to deceive unsuspecting applicants by offering nonexistent positions and requesting personal information or upfront fees.

Remember that our company does not endorse any job postings outside our official channels. If you encounter a suspicious offer, report it through the job platform on which you found it or report email as spam.

If you need to check on the validity of an email from EasyPost, feel free to reach out directly to recruiting@easypost.com

For more information on this scam, please see this FTC Consumer Alert .

The future of you

As industry experts, we’re working not only to help our customers make sense of the industry, but to define where it’s headed. We are looking for candidates who are approachable, dynamic, inventive, intelligent, and reliable to join our team in unpacking the future of shipping.

The future of shipping

How can modern, flexible technology improve the customer experience of shipping? What if every business was able to offer same-day shipping? How much waste would be removed from the environment if all our shipments were consolidated into one delivery per week? At EasyPost, we’re figuring out the answer to these questions and more.

Life at EasyPost

Adaptive

Embrace new challenges to grow your skill set.

Simple

Create efficient solutions that are easy to execute.

Inclusive

Share new ideas and work collaboratively across teams.

Team and technology


We’re a fun group of passionate entrepreneurs who built our own revolutionary software designed to make shipping simple. EasyPost started as an Engineering first company and we are proud to have a pragmatic approach to software development. Our team has a wealth of diverse experience and different backgrounds ranging from startups to large technology companies. Be part of a leading technology company:

  • CI/CD inspired workflows – we deploy dozens of times a day
  • Small services over monoliths – we’ve deployed hundreds of services
  • Strong engineering tooling and developer support
  • Transparency and participation around architecture and technology decisions
  • Culture of blamelessness and improving today from yesterday’s shortcomings

Benefits and perks

  • Medical, dental, vision plans
  • Flexible time-off
  • Stock option opportunities
  • Cross-functional learning
  • Monthly virtual events

Start your adventure at EasyPost

Customer Success

Product Management

Sales

Engineering

YouTube channels spreading fake, anti-Labour videos viewed 1.2bn times in 2025

Guardian
www.theguardian.com
2025-12-13 17:00:37
Exclusive: More than 150 anonymous channels using cheap AI tools to spread false stories about Keir Starmer, study finds YouTube channels spreading fake, anti-Labour videos have amassed more than a billion views this year, as opportunists attempt to use AI-generated content to profit from political ...
Original Article

YouTube channels spreading fake, anti-Labour videos have amassed more than a billion views this year, as opportunists attempt to use AI-generated content to profit from political division in the UK.

More than 150 channels have been detected in the last year that promote anti-Labour narratives, as well as outright fake and inflammatory accusations about Keir Starmer .

A study seen by the Guardian has found the channels have accumulated 5.3m subscribers and have created more than 56,000 videos, with a total of almost 1.2bn views in 2025. The network of anonymous channels includes alarmist rhetoric, AI scripts and British narrators to attract hits.

Starmer is personally targeted. The prime minister was either named in the video title or description 15,600 times.

Reset Tech, the non-profit group that produced the research, said the channels were part of a global trend to produce synthetic propaganda on the platform. It pointed to the proliferation of cheap AI tools that could be deployed to make a quick profit from divisive topics.

One channel called Britain News-night talked about Starmer and Reeves facing arrest. Another, TheUKPoliticalBrief, touted videos on the “explosive truth” about immigrant crime and marches on Westminster.

The UK NewsCore channel focused on how Nigel Farage was ousting Starmer, and claimed the prime minister was “sacked live” and thrown out of parliament.

Other videos featured bizarre, fabricated stories about a row between the royal family and the government. One channel, Gold Up!, said the dispute had left Starmer “melting down on live TV”.

Some of the videos and channels were removed by YouTube’s checks. However, all 150 were taken down when the platform was approached by the Guardian. Reset Tech said some channels had created tens or hundreds of similar videos without being deplatformed.

The research found similar channels operating in German, French, Spanish and Polish, targeting other politicians or political issues. In total, it mapped 420 problematic channels operating in Europe. Reset Tech said Russian-speaking creators operate some of the channels.

It is believed channels aimed at the UK were being driven by opportunistic creators trying to monetise political division over issues like immigration, rather than overseas political actors. However, it said their presence still posed a risk to public trust.

The content has caused concern inside Labour . “The rise of fake news online is a serious threat to our democracy,” a spokesperson said. “The public will be rightly alarmed that democratically elected leaders and institutions are being undermined by bad faith foreign state actors and those seeking to profit from misinformation.

“We’ve already seen attempts from overseas to influence fair elections and manipulate public opinion both here and abroad.

“The government is stepping up its efforts to work with online platforms to tackle this scourge on free and fair democracy. But it’s important that tech bosses take this threat seriously and live up to their obligations to remove this type of content wherever it’s found.”

Dylan Sparks, UK director of Reset Tech, called for YouTube to take swifter action. “Malicious actors are permitted by YouTube to spread synthetic ‘news’ that disrupts political debate in the UK, while also earning revenue from it,” he said. “This AI-generated, low cost content spreads across the platform undetected, revealing clear weaknesses in YouTube’s monetisation and content moderation systems.

“This specific network focuses on the prime minister and Labour government, but the same loopholes could be exploited by any hostile actor to push an agenda. Because social media platforms profit from engagement, their business model creates an in-built tension between enforcing their own policies and reducing the spread of malicious content that drives revenue.

“The rapid spread of AI has also introduced new risks to the online environment, and platforms need to move faster and invest more to address them.”

A YouTube spokesperson said: “Spam and deceptive practices that try to take advantage of the YouTube community are not allowed on the platform, which is why the channels flagged by the Guardian have all been removed.

“We enforce our policies consistently, regardless of political viewpoint expressed, or how the content is generated. Our teams work around the clock to monitor for harmful content, taking swift action as needed.”

YouTube is now working with Reset Tech over its findings. The platform said its systems prominently feature authoritative news content on the YouTube homepage, in search results, and through recommendations. It has removed more than 2.1m channels for violating its community guidelines.

Ministers have already formed an online advertising taskforce to see what action can be taken to address the advertising-based monetisation of harmful and misleading content.

Aging Out of Fucks: The Neuroscience of Why You Suddenly Can't Pretend Anymore

Hacker News
www.blog.lifebranches.com
2025-12-13 16:51:17
Comments...
Original Article

You’re in a meeting. Someone says something objectively wrong. And instead of doing your usual dance—the soft correction, the diplomatic phrasing, the careful preservation of everyone’s feelings—you just... say it.

“That’s not accurate.”

No cushioning. No apology. No emotional labor to make your truth more palatable.

And everyone looks at you like you’ve grown a second head.

Welcome to what I call the Great Unfuckening—that point in midlife when your capacity to pretend, perform, and please others starts shorting out like an electrical system that’s finally had enough.

You might think you’re becoming difficult. Impatient. One of those “bitter older women” you were warned about.

But here’s what’s actually happening: your brain is restructuring itself. And thank god for that.

Let’s start with the science, because this isn’t about you becoming a worse person. It’s about your brain finally doing some overdue maintenance.

For decades, your prefrontal cortex—the part of your brain responsible for executive function, social behavior, and impulse control—has been working overtime. It’s been monitoring social cues, calculating risks, suppressing authentic responses, and managing everyone else’s emotional experience.

This is exhausting work. And it turns out, it’s unsustainable.

Research in neuroscience shows that as we age, the brain undergoes a process called synaptic pruning. Neural pathways that aren’t essential get trimmed away. Your brain is essentially Marie Kondo-ing itself, keeping what serves you and discarding what doesn’t.

And all those neural pathways dedicated to hypervigilant people-pleasing? They’re often first on the chopping block.

Dr. Louann Brizendine, neuropsychiatrist and author of “The Female Brain,” explains that women’s brains are particularly wired for social harmony and caregiving in the first half of life—driven partly by estrogen and oxytocin. But as estrogen levels shift in perimenopause and beyond, this intense drive to please and nurture others begins to diminish.

What replaces it isn’t bitterness. It’s clarity.

Think about what you’ve been doing since you were old enough to understand social dynamics:

Reading the room. Adjusting your tone. Softening your language. Making yourself smaller to make others comfortable. Laughing at jokes that weren’t funny. Agreeing with opinions you didn’t share. Explaining things carefully so no one feels threatened by your knowledge.

You’ve been running complex social calculations every single day for decades.

There's a concept in psychology called "decision fatigue": the deteriorating quality of decisions made after a long session of decision-making. But what we don't talk about enough is emotional labor fatigue.

After thousands of interactions where you’ve monitored and managed your authentic responses to maintain social harmony, something in your system starts breaking down. Not because you’re broken, but because the system was never meant to run this way indefinitely.

Your brain isn’t malfunctioning. It’s finally refusing to malfunction anymore.

Men experience aging changes too, obviously. But women tend to report this shift more dramatically, and there’s a reason for that.

From childhood, girls are socialized for social harmony in ways boys simply aren’t. Research shows that girls as young as 4 already demonstrate more awareness of others’ emotions and adjust their behavior accordingly more than boys do.

By the time you reach midlife, you’ve had 40+ years of this conditioning. That’s four decades of:

“Don’t be bossy” (translation: don’t lead)

“Don’t be pushy” (translation: don’t assert boundaries)

“Don’t be difficult” (translation: don’t have needs)

“Don’t be emotional” (translation: don’t be human)

You’ve been performing an elaborate social choreography so long it became automatic. You stopped noticing you were doing it.

Until suddenly, you can’t anymore. Or more accurately—you won’t.

Several neurological and hormonal shifts converge in midlife that contribute to this phenomenon:

Hormonal recalibration. As estrogen declines, so does its moderating effect on emotional responses and social bonding behaviors. You’re not becoming “hormonal” in the dismissive sense. You’re becoming less chemically compelled to prioritize others’ comfort over your own truth.

Prefrontal cortex changes. The same executive function region that helped you suppress inappropriate responses for decades starts operating differently. Some research suggests it becomes less reactive to social judgment and approval. You’re literally less neurologically invested in what others think.

Accumulated stress response. Decades of chronic low-level stress from constant social monitoring takes a biological toll. Your stress response system—the HPA axis—can become dysregulated. What looks like “not having a filter” might actually be a stress response system that’s finally saying “enough.”

Cognitive prioritization shifts. Your brain starts prioritizing differently. Energy becomes more precious. Time becomes more finite. The cost-benefit analysis of pretending shifts dramatically.

Here’s the part that makes this transition so uncomfortable: other people don’t like it.

When you stop performing emotional labor, systems that relied on that labor start breaking down. And instead of examining why the system needed your performance to function, people blame you for withdrawing it.

You’re suddenly:

  • “Not a team player”

  • “Going through something”

  • “Difficult to work with”

  • “Changed” (said with concern that really means disapproval)

The same directness that would be called “no-nonsense” in a man gets called “abrasive” in a woman over 40.

This backlash is proof of concept. It confirms that your people-pleasing wasn’t optional. It was required labor that kept everything running smoothly. And when you stop providing it for free, people notice.

The discomfort you’re causing? That’s not your problem to fix. That’s information about a system that was always exploiting you.

But here’s what complicates this: the liberation feels dangerous.

You’ve been rewarded your entire life for being accommodating. Easy. Pleasant. Not too much. The positive feedback loop of being liked is powerful, and you’re now breaking that loop.

You might find yourself afraid that:

  • You’re becoming “that woman”—the bitter, difficult one everyone avoids

  • You’ll lose relationships (and you might—more on this in a moment)

  • You’re being selfish or narcissistic

  • You’re overreacting or being “too sensitive” (ironic, since you’re actually being less sensitive to others’ reactions)

These fears are valid. But they’re also old programming.

The woman you’re afraid of becoming? She’s not real. She’s a cautionary tale designed to keep you compliant.

Let’s be explicit about what’s actually happening when you “lose your filter”:

You’re gaining authenticity. The real you—the one who’s been submerged under layers of performance—is finally surfacing. This might feel harsh because authentic humans have edges. They have opinions. They have boundaries. These aren’t character flaws.

You’re gaining time. All the energy you spent managing everyone else’s experience? That’s now available for literally anything else. The return on investment is staggering.

You’re gaining clarity. When you stop cushioning every truth, reality becomes clearer. Problems that were obscured by diplomatic language become visible and therefore solvable.

You’re gaining real relationships. Some relationships will end when you stop people-pleasing. These were transactional relationships sustained by your performance. What remains are connections based on who you actually are.

This is hard to talk about, but necessary: some relationships won’t survive your refusal to keep pretending.

Friendships built on shared complaining but not actual intimacy. Work relationships that relied on you doing emotional labor others weren’t doing. Family dynamics where you played mediator, peacemaker, or emotional manager.

When you stop playing these roles, one of two things happens:

The relationship evolves into something more authentic, or it dissolves because it was never based on authentic connection in the first place.

Both outcomes are information.

Losing relationships because you stopped performing isn’t actually loss. It’s clarity about what was never really there.

If you’re in the thick of this shift, here’s what helps:

Name what’s happening. “I’m not becoming difficult—I’m becoming authentic. My brain is reorganizing around honesty instead of performance.” Language matters. The story you tell yourself about this change shapes your experience of it.

Expect resistance. When you stop over-functioning in relationships and systems, others will push back. This isn’t evidence you’re doing something wrong. It’s evidence you were doing too much before.

Practice the pause. You don’t have to swing from people-pleasing to brutal honesty overnight. Notice when you’re about to soften/cushion/apologize unnecessarily. Pause. Choose consciously whether to add the cushioning or not.

Find your people. Other women going through this same shift. They exist. They’re tired of pretending too. These relationships will feel different—less performative, more substantial.

Grieve if you need to. There’s loss here too. Loss of approval, loss of being liked by everyone, loss of your identity as “the nice one.” This grief is legitimate even as the change is ultimately positive.

Here's what no one tells you about aging out of fucks: it's practice for being fully alive.

Every small death of ego, every shedding of others’ opinions, every moment you choose truth over approval, you’re rehearsing the ultimate letting go.

You’re learning to exist as yourself regardless of external validation. This is spiritual work masquerading as social rudeness.

The woman who can say “that’s not accurate” without apologizing is the same woman who can eventually face her own mortality without flinching. She’s practiced not needing everyone’s approval. She’s learned that her worth isn’t contingent on being pleasant.

You’re becoming free.

The “you” that’s emerging isn’t a worse version. It’s the version that was always there but buried under decades of social conditioning to maintain harmony at any cost.

Your brain is finally doing triage. Deciding what actually matters. Cutting away the pretense that never served you.

The filter you’re losing wasn’t protecting you. It was protecting everyone else from your truth.

And your truth? It’s not the problem.

The system that required you to hide it was always the problem.

So when someone says you’ve changed, when they say you’re not the person you used to be, when they imply something’s wrong with you now?

They’re right. You have changed.

You’ve changed into someone who’s no longer available for performance.

And that’s not difficult.

That’s development.

What’s the thing you used to bite your tongue about that you can’t anymore? Drop it in the comments. I have a feeling we’re all going through versions of the same awakening.

Leave a comment

“I’m building a space for women who are done performing. If this resonated with you, stick around. There’s more where this came from—and we’re just getting started.”

The Infinite Loop of One LLM Talking to Another

Daring Fireball
www.instagram.com
2025-12-13 16:26:23
This is very funny, but also a good indication of just how far away these things are from actual intelligence. First, a reasonable human being would never get caught in a loop like this. Second, only humans can not only recognize what’s going on here, but also see the humor in it.  ★  ...
Original Article


Show HN: Kinkora – A creative playground for experimenting with video models

Hacker News
kinkora.fun
2025-12-13 16:24:34
Comments...

Indexed Reverse Polish Notation, an Alternative to AST

Lobsters
burakemir.ch
2025-12-13 16:08:29
Comments...
Original Article

Indexed Reverse Polish Notation, an Alternative to AST

2025-12-12

"Why study compiler construction? Because knowing how a programming language is specified and implemented makes you a better programmer." I still remember these words, pronounced as matter-of-fact introduction to an undergrad course of compilers.

Compiler engineers have come up with many useful programming techniques and representations.

Today, I want to write about one such technique, an alternative to Abstract Syntax Trees (ASTs). Inspired by the parse tree representation in the Carbon compiler, this post explains a way to represent parsed source code using a variation of Reverse Polish Notation (RPN), in a contiguous array.

We call this Indexed RPN. Ordering program parts in a linear sequence very naturally leads to machine interpretation, which is well known for calculators but maybe a little less well known when there are scoped definitions and control flow structures.

This is by no means a new way of doing things, but with modern machines having plenty of memory, there may have been less pressure to reach for techniques that are memory-friendly.

1. From Arithmetic to Indices

Let’s start with an arithmetic expression example: We want to represent (3 + 4) * 5.

In a standard AST, this is a tree of pointers. In a standard stack-machine RPN, this looks like 3 4 + 5 *. Now we want something slightly different, because we want to operate on the tree structure: if we deal with this expression in a compiler that translates and optimizes, we want to be able to refer to specific sub-expressions later.

The "Administrative Normal Form" Perspective

Before we look at the memory layout, let's imagine breaking up this expression by naming each subexpression. If we had local definitions in our language, this would give us Administrative Normal Form (ANF). In ANF, we give a name to every intermediate result:

// Source: (3 + 4) * 5

val t0 = 3 
val t1 = 4
val t2 = add(t0, t1)
val t3 = 5
val t4 = mul(t2, t3)

The expression now takes a lot more to write, but for a compiler it is much more structured. The arguments of the operations are always names, which makes the data flow and also the order in which arguments get evaluated fully explicit. Here, t2 depends entirely on t0 and t1 , and t0 is evaluated before t1 .

The In-Memory Representation

We don't want local definitions just yet; the above is just to motivate flattening the tree structure. If we store the instructions in a contiguous array (e.g., a vector or Vec), the index of the node becomes its name.

The internal names are merely indices into the sequence of nodes.

Index (ID) Node Kind Operands / Data
0 IntLiteral 3
1 IntLiteral 4
2 BinaryOp Add(lhs: 0, rhs: 1)
3 IntLiteral 5
4 BinaryOp Mul(lhs: 2, rhs: 3)

This is similar to Reverse Polish Notation (RPN), but there is a difference. In standard RPN, there is an implicit stack from which Add consumes items blindly. In Indexed RPN, Add explicitly refers to indices 0 and 1. This provides a stable reference to every sub-expression, allowing us to traverse the code and locate nodes without necessarily having to build up a stack.
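
To make the table concrete, here is a minimal, self-contained Python sketch. The Node class and its field names are illustrative (not from Carbon or any particular compiler); it builds the flat array for (3 + 4) * 5 and evaluates it purely by index.

from dataclasses import dataclass

@dataclass
class Node:
    kind: str          # "IntLiteral", "Add", or "Mul"
    value: int = 0     # literal payload
    lhs: int = -1      # operand indices into the same array
    rhs: int = -1

# Indexed RPN for (3 + 4) * 5: operands appear before the operation,
# and every operation names its operands by index.
nodes = [
    Node("IntLiteral", value=3),     # 0
    Node("IntLiteral", value=4),     # 1
    Node("Add", lhs=0, rhs=1),       # 2
    Node("IntLiteral", value=5),     # 3
    Node("Mul", lhs=2, rhs=3),       # 4
]

def evaluate(nodes):
    # One value slot per node; no operand stack is needed.
    values = [0] * len(nodes)
    for i, n in enumerate(nodes):
        if n.kind == "IntLiteral":
            values[i] = n.value
        elif n.kind == "Add":
            values[i] = values[n.lhs] + values[n.rhs]
        elif n.kind == "Mul":
            values[i] = values[n.lhs] * values[n.rhs]
    return values[-1]

print(evaluate(nodes))  # 35

Index 2 is a stable name for the sub-expression 3 + 4, which is exactly what the table above expresses.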

2. Dealing with "Let" and Scope

Let us make the language more realistic by adding local let-declarations and scoping.

// Source
let a = 10;
let b = a + 5;

Here we face a problem: Source variables (a, b) are different from our internal indices (0, 1, 2...). We need a node that bridges this gap — an "Introducer".

The Bind Node

We introduce a Bind node. This node represents the action of bringing a name into existence in the current scope. Depending on the language you are working with, a binding may have semantic significance. For example, if references to the binding are objects of the language like Rust or C++ references.

Index Node Kind Operands Meaning
0 IntLiteral 10 The raw value 10.
1 Bind name: "a", val: 0 Introducer : "a" exists, bound to Index 0.
2 NameRef ref: 1 A reference back to the introducer node.
3 IntLiteral 5 The raw value 5.
4 BinaryOp Add(2, 3) Adds the NameRef and the Literal.
5 Bind name: "b", val: 4 Introducer : "b" exists, bound to Index 4.

The Stack Returns (For Compilation)

In order to deal with this data, we traverse it but we will also want to build up a stack. You might ask: If we flattened the tree, why do we need a stack?

While the storage is flat, the compilation process requires a stack to handle scopes. Because let declarations can be nested, we cannot simply scan linearly and remember everything forever. We need to handle when names go out of scope (shadowing).

Let's add BlockStart and BlockEnd nodes that indicate nested blocks.

// Source Code
let x = 10;       // Outer 'x'
{
    let x = 20;   // Inner 'x' (shadows outer)
    print(x);     // Should print 20
}
print(x);         // Should print 10

The resolve_names Algorithm

I am too lazy for full code examples, just the idea.

We use a SymbolTableStack during the resolution pass. We iterate through the array once. We maintain a stack of scopes, where each scope maps a string name to an integer index.


function resolve_names(nodes):
    # A stack of scopes. Each scope is a Map: String -> Index
    scope_stack = [ new Map() ]

    for i, node in enumerate(nodes):
        
        match node.kind:
            case BlockStart:
                # Push a new, empty scope onto the stack
                scope_stack.push( new Map() )

            case BlockEnd:
                # Pop the top scope. Inner variables are forgotten.
                scope_stack.pop()

            case Bind(name, value_index):
                # Register the variable in the CURRENT (top) scope.
                current_scope = scope_stack.top()
                current_scope.set(name, i)

            case NameRef(name):
                # Look for the name, starting from the top scope down.
                target_index = find_in_stack(scope_stack, name)
                
                # PATCH THE NODE:
                # The node no longer holds "x". It holds the index (e.g., 4).
                node.resolved_index = target_index

After this pass, the stack is discarded. The IR is now "wired." Every variable usage points directly to the instruction that created it.
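
Purely as an illustration, this is roughly what the wired-up array looks like for the shadowing example above (Print nodes elided; the dict encoding is made up for this sketch).

# Illustrative only: the node layout after resolve_names for the shadowing
# example. Each NameRef now carries the index of the Bind node it refers to,
# not the source-level name "x".
resolved = [
    {"kind": "IntLiteral", "value": 10},                    # 0
    {"kind": "Bind", "name": "x", "val": 0},                # 1  outer x
    {"kind": "BlockStart"},                                 # 2
    {"kind": "IntLiteral", "value": 20},                    # 3
    {"kind": "Bind", "name": "x", "val": 3},                # 4  inner x, shadows outer
    {"kind": "NameRef", "name": "x", "resolved_index": 4},  # 5  use inside the block
    {"kind": "BlockEnd"},                                   # 6  inner scope popped
    {"kind": "NameRef", "name": "x", "resolved_index": 1},  # 7  use after the block
]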

When representing source as AST, we would use an algebraic data type. One could use mutable data structures there, or build up a symbol table.

3. Breaking the Line: Control Flow

So far, execution has been linear: Index 0, then 1, then 2. But branching constructs like if, else, and while break this line.

In a tree-based AST, an If node has child pointers to "Then" and "Else" blocks. In our flat array, we may prefer to keep these blocks in the same contiguous vector, instead of having them float in separate memory. So we introduce Jump nodes.

The Linear Layout

Consider this source:

if (a) { print(1); } else { print(2); }
print(3);

Here is the Indexed RPN layout. Note the use of BrFalse (Branch if False) and Jmp (Unconditional Jump).

Index Node Kind Data Explanation
0 NameRef "a" Load variable a .
1 BrFalse target: 5 If a is false, jump to Index 5 (Else).
2 Int 1 Start of "Then" block.
3 Print 2
4 Jmp target: 7 Jump over the "Else" block.
5 Int 2 Start of "Else" block (Target of node 1).
6 Print 5
7 Int 3 Merge Point. Execution continues here.
8 Print 7

Building It: Backpatching

When we emit the BrFalse instruction at index 1, we haven't written the Else block yet, so we don't know the target index.

It is quite straightforward to deal with that:

  1. Emit BrFalse with a placeholder target. Save the index.
  2. Emit the "Then" block.
  3. Emit Jmp with a placeholder target. Save the index.
  4. Mark the current index as the start of "Else". Backpatch (update) the BrFalse at index 1.
  5. Emit the "Else" block.
  6. Mark the current index as the end. Backpatch the Jmp at index 4.

This effectively flattens the logic of the program into a shape that mirrors how hardware executes instructions: predictable, linear memory access with explicit jumps.
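
A minimal Python sketch of that emission order follows; the emit helper and the dict-based node encoding are made up for illustration, not part of any real compiler API.

# Hypothetical emitter for: if (a) { print(1); } else { print(2); } print(3);
nodes = []

def emit(kind, **data):
    nodes.append({"kind": kind, **data})
    return len(nodes) - 1                      # the index is the node's name

emit("NameRef", name="a")                      # 0
br = emit("BrFalse", target=None)              # 1: placeholder target
one = emit("Int", value=1)                     # 2
emit("Print", arg=one)                         # 3
jmp = emit("Jmp", target=None)                 # 4: placeholder target
nodes[br]["target"] = len(nodes)               # backpatch: "Else" starts here (5)
two = emit("Int", value=2)                     # 5
emit("Print", arg=two)                         # 6
nodes[jmp]["target"] = len(nodes)              # backpatch: merge point (7)
three = emit("Int", value=3)                   # 7
emit("Print", arg=three)                       # 8

The indices and patched targets come out exactly as in the table above.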

4. Towards Interpretation and Code Gen

We have successfully flattened our source code. We have resolved variable names into absolute indices and lowered high-level control flow into jumps. Now comes the reward.

The Interpreter: The "Big Switch"

Because our code is a flat array, we can come up with a Virtual Machine (VM) that looks exactly like a hardware CPU: it has an Instruction Pointer (ip) and a big loop. As a reminder — I will never get tired of repeating this — the difference between a virtual machine and an abstract machine is that a virtual machine has instructions, whereas an abstract machine has transitions.

A translation to a low-level format and a virtual machine plays the role of an interpreter, which provides an implementation of our language. We can also use it to specify the operational semantics , roughly: however you implement this language, it should produce the same result as the reference interpreter. For "real" languages, often specification comes as an afterthought but there are plenty of situations where one really would like to know how a piece of source code is supposed to behave. For example, to find out if the "real" implementation is correct. Somehow, educated people who really should know better can parrot statements like "undefined behavior is all about compiler optimizations" and completely ignore that "undefined behavior" is first and foremost a gap in the specification.

Back to our interpreter: we can do something really simple. Since we used ANF (where every node index represents a runtime value), we don't even need a runtime stack for intermediate calculations. We can simply map the nodes array to a parallel values array. A real implementation would not do this, but if we use an interpreter solely to specify behavior, this is sufficient, and we can defer optimizations. Note that since we have already resolved names to indices, instructions like Bind or BlockStart are effectively metadata. The interpreter can simply skip them.


function run_vm(nodes):
    # Holds the runtime result of every node.
    values = new Array(size=len(nodes))
    ip = 0
    
    while ip < len(nodes):
        node = nodes[ip]
        
        match node.kind:
            case IntLiteral:
                values[ip] = node.raw_value
                ip += 1
                
            case Add:
                # Direct access by index! No stack popping needed.
                lhs_val = values[node.lhs_index]
                rhs_val = values[node.rhs_index]
                values[ip] = lhs_val + rhs_val
                ip += 1

            case BrFalse:
                if values[node.cond_index] == False:
                    ip = node.target_index # JUMP
                else:
                    ip += 1 
            
            case Jmp:
                ip = node.target_index # Unconditional JUMP

            case _:
                 # Skip metadata nodes (Bind, BlockStart, etc.)
                 ip += 1

The Code Generator

We could also do a source-to-source translation and generate C code. The Indexed RPN shines again, because the complexity of the source language is reduced quite a bit. Since instructions are topologically sorted and dependencies are explicit, generating C can be as simple as a single for loop where every node becomes a temporary variable t{i}.

This is maybe not a great way to specify what a language means, but a clear implementation advantage of translating to an existing language is that one can build on top of a whole existing compiler, with its optimizations and native code generation backends. How exactly the semantics and runtime aspects of the source language and the target language are connected is of course a design choice and can be wildly different.



function generate_c_code(nodes):
    output = StringBuilder()
    output.append("int main() {\n")
    
    for i, node in enumerate(nodes):
        # 1. Create a label for every instruction so Jumps can find it
        #    e.g., "L_0:", "L_1:", etc.
        output.append(f"L_{i}: ;\n")
        
        # 2. Create a variable name for this node's result
        #    e.g., "t_0", "t_1"
        var_name = f"t_{i}"
        
        match node.kind:
            case IntLiteral:
                # int t_0 = 10;
                output.append(f"    int {var_name} = {node.value};\n")
                
            case Add:
                # int t_2 = t_0 + t_1;
                lhs = f"t_{node.lhs_index}"
                rhs = f"t_{node.rhs_index}"
                output.append(f"    int {var_name} = {lhs} + {rhs};\n")
            
            case Print:
                # printf("%d\n", t_5);
                arg = f"t_{node.arg_index}"
                output.append(f"    printf(\"%d\\n\", {arg});\n")
            
            case BrFalse:
                # if (!t_1) goto L_5;
                cond = f"t_{node.cond_index}"
                target = f"L_{node.target_index}"
                output.append(f"    if (!{cond}) goto {target};\n")
            
            case Jmp:
                # goto L_7;
                target = f"L_{node.target_index}"
                output.append(f"    goto {target};\n")

    output.append("    return 0;\n}")
    return output.toString()

Conclusion

People will always build more languages, especially domain-specific ones. A realistic work-in-progress language that uses indexed RPN is Carbon.

By moving from a tree to an Indexed RPN, we replace heap allocations with a single contiguous vector. What was recursive tree-walking of an AST can in many cases become index lookups. So there should be a lot less memory traffic, and when programs get large, memory traffic can have a significant impact on performance.

If you are like me and build toy programming language implementations for fun, consider trying this out and see how it works for you!

LG TV's new software update installed MS Copilot, which cannot be deleted

Hacker News
old.reddit.com
2025-12-13 15:41:24
Comments...
Original Article


Ask HN: How can I get better at using AI for programming?

Hacker News
news.ycombinator.com
2025-12-13 15:37:16
Comments...
Original Article
Ask HN: How can I get better at using AI for programming?
22 points by lemonlime227 1 hour ago | 12 comments

I've been working on a personal project recently, rewriting an old jQuery + Django project into SvelteKit. The main work is translating the UI templates into idiomatic SvelteKit while maintaining the original styling. This includes things like using semantic HTML instead of div-spamming, not wrapping divs in divs in divs, and replacing bootstrap with minimal tailwind. It also includes some logic refactors, to maintain the original functionality but rewritten to avoid years of code debt. Things like replacing templates using boolean flags for multiple views with composable Svelte components.

I've had a fairly steady process for doing this: look at each route defined in Django, build out my `+page.server.ts`, and then split each major section of the page into a Svelte component with a matching Storybook story. It takes a lot of time to do this, since I have to ensure I'm not just copying the template but rather recreating it in a more idiomatic style.

This kind of work seems like a great use case for AI-assisted programming, but I've failed to use it effectively. At most, I can only get Claude Code to recreate some slightly less spaghetti code in Svelte. Simple prompting just isn't able to get AI's code quality within 90% of what I'd write by hand. Ideally, AI could get its code to something I could review manually in 15-20 minutes, which would massively speed up the time spent on this project (right now it takes me 1-2 hours to properly translate a route).

Do you guys have tips or suggestions on how to improve my efficiency and code quality with AI?




A Giant Ball Will Help This Man Survive a Year on an Iceberg

Hacker News
www.outsideonline.com
2025-12-13 15:25:33
Comments...
Original Article


While rowing across the Pacific in 2008, wind pushing him and waves battering him, Italian explorer Alex Bellini felt an unsettling lack of control.

Playing in his mind was the story of another Italian explorer, Umberto Nobile, who crashed his zeppelin north of Svalbard after a 1928 polar expedition. Seven men died. The survivors, including Nobile, spent a month wandering the free-floating pack ice, at one point shooting and eating a polar bear, until their rescue. How people react to unpredictable situations fascinates Bellini, a dedicated student of psychology. In the Arctic Ocean, unpredictable situations are a way of life.

“All adventure is based on hypothesis, which can be very different to reality,” says Bellini. “An adventurer must adapt himself to the environment he faces.”

Bellini’s newest adventure highlights and relishes that lack of control. Sometime next winter, he plans to travel to Greenland’s west coast, pick an iceberg, and live on it for a year as it melts out in the Atlantic.

Sometime next winter, Alex Bellini plans to travel to Greenland’s west coast, pick an iceberg, and live on it for a year as it melts out in the Atlantic. (Courtesy of Alex Bellini)

This is a precarious idea. Bellini will be completely isolated, and his adopted dwelling is liable to roll or fall apart at any moment, thrusting him into the icy sea or crushing him under hundreds of tons of ice.

His task: experience the uncontrollable nature of an iceberg at sea without getting himself killed. The solution: an indestructible survival capsule built by an aeronautics company that specializes in tsunami-proof escape pods.

“This adventure is about waiting for something to happen,” says Bellini. “But I knew since the beginning I needed to minimize the risk. An iceberg can flip over, and those events can be catastrophic.” Icebergs tend to get top-heavy as they melt from their submerged bottoms, so flips can be immediate and unpredictable. And, of course, so is the weather.

Bellini spent two years searching for an appropriate survival capsule, but most were too heavy to plant on a berg. Then, in October, he contacted aeronautical engineer Julian Sharpe, founder of Survival Capsule, a company that makes lightweight, indestructible floating capsules, or “personal safety systems.”

They can hold from two to ten people, depending on the model, and are made from aircraft-grade aluminum in what’s called a continuous monocoque structure, an interlocking frame of aluminum spars that evenly distribute force, underneath a brightly painted and highly visible aluminum shell. The inner frame can be stationary or mounted on roller balls so it rotates, allowing the passengers to remain upright at all times.

Inside are a number of race car–style seats with four-point seatbelts, arranged facing either outward from the center or inward around the circumference, depending on the number of chairs. Storage compartments, including food and water tanks, sit beneath the seats. Two watertight hatches open inward to avoid outside obstructions. Being watertight, it’s a highly buoyant vessel, displacing water like a boat does.

“I fell in love with the capsule,” says Bellini. “I’m in good hands.” He selected a three-meter, ten-person version, for which he’ll design his own interior.

Sharpe got the idea for his capsules after the 2004 Indonesian tsunami. He believes fewer people would have died had some sort of escape pod existed. With his three-man team, which includes a former NOAA director and a Boeing engineer, he brought the idea to fruition in 2011. Companies in Japan that operate in the line of fire for tsunamis expressed the most interest. But Sharpe hopes the products will be universal—in schools, retirement homes, and private residences, anywhere there is severe weather. The first testing prototypes of the capsules, which range from $12,000 to $20,000, depending on size, were shipped to Tokyo in 2013. Four are in Japan; two are in the United States. His two-person capsule is now for sale; the others will follow later this year.

“Right now there’s only horizontal and vertical evacuations,” Sharpe said. “We want to offer a third option: riding it out.”

The company intends to rely on an increasing market for survival equipment as sea level and the threat of major storms rise. Sharpe designed the capsules to be tethered to the ground using 20 to 50 meters of steel cable and to withstand a tsunami or storm surge. Each will have a water tank and a sophisticated GPS beacon system in case the tether snaps. Survival Capsule advises storing seven to ten days of food in each capsule.

The product appeals to Bellini because it’s strong enough to survive a storm at sea or getting crushed between two icebergs. It will rest on top of the ice using either its own weight or a specially designed stand that will detach if the berg rolls. The circular shape is crucial for avoiding a crushing blow. The capsule will just roll off any incoming mass, and the water will provide an equal and opposite reaction to any force exerted on the capsule. “A multicurved surface is almost uncrushable,” Sharpe said. “If you imagine shooting an arrow at a wooden ball, unless you hit dead center, it’ll ricochet.”


The basic model ensures survival, but there’s more to life on an iceberg than just surviving. You can add windows, extra space, and other modular additions, even surround sound and color options. “You can trick your crib out all you want,” Sharpe said. And that’s exactly what Bellini plans to do. He doesn’t have a layout yet, but he has hired Italian designer Pietro Santoro to customize his ten-person pod. He will remove the other nine seats for extra room.

Other than modifications to keep him safe and healthy, the capsule is basic, Bellini said. It will carry 300 to 400 kilograms of food, a wind generator, solar panels, and an EPIRB beacon so rescuers can find him. He’ll have Wi-Fi to update his team and the public. The layout will consist of a work table, electronic panels, and a bed. “A foldable bed,” Bellini added. “I want to have room to work out.”

Bellini will spend almost all of his time in the capsule with the hatch closed, which will pose major challenges. He’ll have to stay active without venturing out onto a slippery, unstable iceberg. If it flips, he’ll have no time to react. He’s working with a company to develop nanosensors able to detect movement in the iceberg so he has advance warning of a flip. “Any step away from [the iceberg] will be in unknown territory,” he said. “You want to stretch your body. But then you risk your life.” He fears a lack of activity will dull his ability to stay safe. “I cannot permit myself to get crazy,” he said. “I need to keep my body fit, not for my body, but for my safety.” He is working on a routine of calisthenics that can be done in the capsule, and he might install a stationary bike, most likely a Ciclotte .

Lack of sunlight is another challenge of spending a year in an aluminum sphere. It will be winter in the Arctic, with maybe five hours of light each day. Bellini and Sharpe are working on a lighting system that will simulate natural light, allowing Bellini to get vitamins and maintain his circadian rhythm.

Bellini’s model is in development, and he expects it to be ready in about a year. He plans to write during his mission and will bring plenty of nonfiction books, especially psychology.

The capsule won’t ease his isolation, maybe his greatest challenge, but Bellini remains undaunted: “It’s the key to the inner part of myself.” The first step is relinquishing control.

Z8086: Rebuilding the 8086 from Original Microcode

Hacker News
nand2mario.github.io
2025-12-13 15:04:29
Comments...
Original Article

After 486Tang, I wanted to go back to where x86 started. The result is z8086: an 8086/8088 core that runs the original Intel microcode. Instead of hand‑coding hundreds of instructions, the core loads the recovered 512x21 ROM and recreates the micro‑architecture the ROM expects.

z8086 is compact and FPGA‑friendly: it runs on a single clock domain, avoids vendor-specific primitives, and offers a simple external bus interface. Version 0.1 is about 2000 lines of SystemVerilog, and on a Gowin GW5A device, it uses around 2500 LUTs with a maximum clock speed of 60 MHz. The core passes all ISA test vectors, boots small programs, and can directly control peripherals like an SPI display. While it doesn’t boot DOS yet, it’s getting close.


Why another x86?

The 8086 is where the x86 story began. If you want to understand why x86 feels like x86 — segmented addressing, ModR/M, the prefetch queue, the oddball string instructions — this is the chip to study.

Also, reverse-engineering of the 8086 has reached a surprising level of maturity. We now have Ken Shirriff’s massive 8086 blog series and Andrew Jenner’s disassembled microcode . Combined with the original 8086 patent , these resources make it possible to rebuild a faithful core instead of a functional approximation.

My goals were simple:

  • Faithful where it counts. Accurately replicate the microarchitectural behavior of the original 8086 wherever it matters most.
  • Designed to be explorable and educational. The code is thoroughly commented to make it clear and easy to understand, aiming to be a good teaching resource.
  • FPGA-friendly and practical. z8086 is built to be an effective, useful CPU IP core for real FPGA projects.

Re‑creating the 8086

Here’s the high‑level view:

z8086 block diagram

(You can cross-reference function blocks against the die shot .)

At a bird’s‑eye level the pipeline is:

Prefetch queue → Loader (FC/SC) → Microcode sequencer → EU/BIU datapath

This is like the original chip’s split. The BIU (bus interface unit) runs ahead, fetching bytes into a 6‑byte queue whenever the bus is idle. The EU (execution unit) consumes bytes from that queue, decodes them, and drives the microcode engine. When the EU needs memory, it issues a Type‑6 micro‑op; the BIU yields the bus and prefetch pauses. That overlap is why the 8086 feels “pipelined” despite being a late‑70s design.

Microcode is the glue here. Each 21‑bit micro‑instruction encodes a move (5‑bit source → 5‑bit destination on an internal bus) plus an action (ALU op, short/long jump, bookkeeping, or a bus cycle). The sequencer advances through {AR, CR} addresses until the microcode asserts “run next instruction.”
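To make that encoding concrete, here is a minimal Python sketch of splitting a 21-bit word into its move and action fields. The bit positions are purely illustrative and not the real ROM layout; only the 5-bit source, 5-bit destination, and action structure described above should be read out of it.

# Illustrative only: assume the source selector sits in bits 0-4, the
# destination selector in bits 5-9, and the remaining 11 bits carry the
# action (ALU op, short/long jump, bookkeeping, or bus cycle). The actual
# 8086 ROM packs its fields differently.
def decode_microinstruction(word: int):
    source = word & 0x1F           # 5-bit source on the internal bus
    dest   = (word >> 5) & 0x1F    # 5-bit destination on the internal bus
    action = (word >> 10) & 0x7FF  # remaining 11 bits: the action field
    return source, dest, action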

Some key pieces:

  • Microcode engine. The sequencer keeps {AR, CR} (plus SR for calls), fetches 21‑bit words from ucode.hex , and executes them as a tight move→action loop. ROME marks active execution. When microcode wants a queue byte ( LOC_Q ) but the queue is empty, or when an EU bus cycle is in flight, a stall signal freezes CR so the ROM sees exactly the timing it expects.

  • Translation + group decode. The original 8086 uses ROMs to (1) classify opcodes into ~15 “group” signals (“has ModR/M,” “prefix,” “uses w‑bit,” “grp3/4/5,” etc.), and (2) map {opcode, ModR/M} to microcode entry points for effective‑address and control‑flow routines. z8086 implements these as combinational replicas ( group_decode() and translate() ), derived from the dumped ROM truth tables. This is what lets the recovered microcode drop straight in without being rewritten.

  • Bus + unaligned access. Externally you get rd/wr/io/word/ready with aligned cycles, so FPGA memory is easy to hook up. Internally the EU still issues Type‑6 bus micro‑ops with the right segment defaults and overrides. If a word access lands on an odd address, the bus FSM automatically splits it into two byte cycles ( BUS_UNALIGNED ), so software sees real 8086 semantics while the outside world stays aligned.

  • ALU + flags. The ALU is implemented as a classic 16×1‑bit slice, controlled by signals modeled after Intel’s original logic. The initial ALU design used Verilog primitives, but this updated bit‑slice version is both smaller and faster, closely replicating the behavior of the original chip’s ALU.

One concrete example: for a ModR/M instruction like ADD AX, [BX+SI+4] , the loader’s FC grabs the opcode, SC grabs the ModR/M byte, translate() jumps into the right effective‑address micro‑routine, the EU reads the operand through a Type‑6 bus cycle into OPR , the ALU updates SIGMA and flags, and a final Type‑6 writeback happens only if the instruction targets memory.
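To spell out the address arithmetic in that example, here is a small Python sketch using standard 8086 segmentation; it is a worked illustration rather than code lifted from z8086, and the register values in the comment are made up.

# ADD AX, [BX+SI+4]: the effective address is BX + SI + disp, wrapped to 16
# bits, and the Type-6 bus cycle uses physical address (DS << 4) + EA,
# truncated to 20 bits (DS is the default segment unless an override prefix
# was decoded).
def physical_address(ds: int, bx: int, si: int, disp: int = 4) -> int:
    ea = (bx + si + disp) & 0xFFFF
    return ((ds << 4) + ea) & 0xFFFFF

# Example (made-up values): ds=0x1000, bx=0x0100, si=0x0020
# -> EA = 0x0124, physical address = 0x10124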


Interesting discoveries

Microcode is super efficient

The 8086 shipped with ~29K transistors and still delivered a very rich CISC ISA: segmented addressing, ModR/M base+index+disp modes, and weirdly specialized instructions like DAA and XLAT . The trick was microcode. A small internal datapath plus ROM sequencing let Intel implement a huge instruction surface area without exploding logic.

The contrast with other CPUs is striking. The 6502 (~4.5K transistors) and Z80 (~8.5K) are elegant, mostly hardwired, and highly minimalist designs. In comparison, the 8086 features a much wider datapath, significantly more instructions and features, yet manages to do so with less than four times the transistor count of the Z80. The 68000 (~68K transistors) takes a different approach, using far more silicon for its fully hardwired CISC design. Remarkably, the 8086 achieves a similar feature set with less than half the transistor count of the 68000. This efficiency carries over to z8086: the core fits into just 2,500 LUT4s — dramatically smaller than ao486, which is about ten times larger.

The patent’s FC/SC formulas are wrong (or at least incomplete)

Interestingly, the patent’s explanation of FC and SC signal generation turns out to be inconsistent. The formulas it provides are:

FC = [(00) + (10)(NXT + RNI)]·MT
SC = [(01) + (11)](2BR·MT)

Here, “MT” refers to “a signal generated by Q control circuitry indicating that the queue is empty…”. In reality, however, the correct logic is “not MT” rather than MT, contrary to the documentation. Testing and implementation confirm that this change results in the expected loader behavior.
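As a minimal illustration of the fix, the Python sketch below renders the two equations with the queue-empty signal inverted. Reading the two-bit terms as the loader state, 2BR as a second-byte-required signal, and the exact grouping of the SC terms are my assumptions from the quoted formulas, not something taken from the z8086 source.

# Sketch of the corrected loader equations. The only change relative to the
# patent text is gating on (not mt) instead of mt.
def loader_fc_sc(state: int, nxt: bool, rni: bool, br2: bool, mt: bool):
    not_mt = not mt  # mt: the "queue is empty" signal from the Q control circuitry
    fc = ((state == 0b00) or (state == 0b10 and (nxt or rni))) and not_mt
    sc = ((state == 0b01) or (state == 0b11)) and br2 and not_mt
    return fc, sc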

The “8086 interrupt bug”

The original 1978 8086 had an interrupt-related bug: If an interrupt occurs immediately after a MOV SS,xxx or POP SS instruction, the CPU may push data to an incorrect stack address, corrupting memory. The problem arises because both the Stack Segment (SS) and Stack Pointer (SP) must be updated to ensure correct stack operations. If an interrupt arrives between these updates, the CPU could save flags/IP/CS to the wrong location. Intel later resolved this by automatically disabling interrupts for one instruction following operations like POP SS .

z8086 faithfully reproduces this edge case using a delay_interrupt register. This register is set whenever one of three events occurs: when SC decodes a prefix ( g_prefix ), a stack segment load ( POP SS ), or a segment register move ( MOV sr, r/m , detected by g_seg_reg_bits ). This mechanism disables interrupt handling for exactly one instruction, matching the original 8086’s behavior.
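A rough Python sketch of that one-instruction window follows. The signal names echo the ones mentioned above; when the methods get called is hypothetical scaffolding around the described behavior.

# One-instruction interrupt delay: decoding a prefix, POP SS, or MOV sr, r/m
# sets the flag, and the next instruction boundary consumes it instead of
# recognizing a pending interrupt.
class InterruptDelay:
    def __init__(self):
        self.delay_interrupt = False

    def instruction_decoded(self, g_prefix: bool, pop_ss: bool, mov_sr: bool):
        if g_prefix or pop_ss or mov_sr:
            self.delay_interrupt = True

    def check_at_boundary(self, intr_pending: bool, if_flag: bool) -> bool:
        # Called once between instructions; returns True if an interrupt
        # should be taken now.
        if self.delay_interrupt:
            self.delay_interrupt = False  # the window lasts exactly one instruction
            return False
        return intr_pending and if_flag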

The prefetch queue bus is 8-bit

The prefetch queue is a 6-byte buffer that continuously feeds the execution engine. Its output, called the “Q Bus,” is an 8-bit bus delivering the next instruction byte. Notably, while the 8086 is architecturally a 16-bit CPU, it fetches instruction bytes one at a time—consuming at most a single byte per cycle. This design ultimately limits performance, a bottleneck that later Intel CPUs overcome; for instance, the 386 features a 32-bit wide Q bus.
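As a toy illustration of that byte-at-a-time behavior, here is a small Python model of a 6-byte queue fed by the BIU and drained one byte per cycle over the Q bus. The real hardware implementation differs, so treat this purely as a behavioral sketch.

from collections import deque

# The BIU pushes fetched instruction bytes whenever there is room; the EU
# pops at most one byte per cycle. An empty queue is the condition that
# stalls the loader.
class PrefetchQueue:
    def __init__(self, depth: int = 6):
        self.buf = deque(maxlen=depth)

    def biu_push(self, byte: int) -> bool:
        if len(self.buf) < self.buf.maxlen:
            self.buf.append(byte & 0xFF)
            return True
        return False  # queue full: prefetch waits until the EU consumes a byte

    def eu_pop(self):
        # Returns the next instruction byte, or None when the queue is empty.
        return self.buf.popleft() if self.buf else None

    def flush(self):
        # Taken jumps discard the prefetched bytes.
        self.buf.clear()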

Working on ao486 for 486Tang underscored just how crucial the prefetch queue is to overall performance and Fmax. The intricate x86 instruction set makes optimizing the queue challenging. Balancing width, depth, and flexibility in its design truly tests the designer’s skill.


Reflections and next steps

Overall, this project has been incredibly fun — like piecing together a giant puzzle. It involves gathering information from many sources, making educated guesses about the original design, and testing those theories until everything clicks into place.

Getting code to work is the definitive proof of truly understanding a system. The fact that z8086 functions as intended demonstrates that the community now possesses deep, practical insight into the original x86 chip.

Intel packed an impressive array of features into the 8086. Some attribute this to it being designed by a software developer . While many of these features have become less relevant over time — and some of the 8086’s success was undoubtedly lucky, such as being chosen for the IBM PC — the developer-friendly design played a big role in kickstarting the x86 ecosystem.

This release is an early preview and comes with several limitations: it is not yet cycle accurate, the interrupt circuitry is still under-tested, the original 8086 bus cycles are not fully replicated, and it has not yet been used to run large programs.

Here are some directions I plan to work on:

  • More extensive testing on FPGA boards
  • Booting DOS
  • Compiling to WebAssembly for interactive 8086 visualization in the browser?

z8086 should work on most FPGAs, with sample projects provided for DE10-Nano, Xilinx Artix7 and Tang Console 60K. If low-level CPU archaeology interests you – or you’d like to try a real-microcode 8086 as a soft CPU in your own project – check out the project on GitHub: 👉 z8086 on GitHub .

Feedback, issues, and PRs are always welcome. Thanks for reading!

Gavin Newsom pushes back on Trump AI executive order preempting state laws

Guardian
www.theguardian.com
2025-12-13 15:00:35
California governor says order pushes ‘grift and corruption’ instead of innovation just hours after president’s dictum The ink was barely dry on Donald Trump’s artificial intelligence executive order when Gavin Newsom came out swinging. Just hours after the order went public Thursday evening, the Ca...
Original Article

The ink was barely dry on Donald Trump’s artificial intelligence executive order when Gavin Newsom came out swinging. Just hours after the order went public Thursday evening, the California governor issued a statement saying the presidential dictum, which seeks to block states from regulating AI of their own accord, advances “grift and corruption” instead of innovation.

“President Trump and David Sacks aren’t making policy – they’re running a con,” Newsom said, referencing Trump’s AI adviser and crypto “czar” . “Every day, they push the limits to see how far they can take it.”

Trump’s executive order is a major victory for tech companies that have campaigned against legislative barriers to developing and deploying their AI products. It also sets up a clash between state governments and the White House over the future of AI regulation. The immediate backlash from groups including child safety organizations, unions and state officials has highlighted the deeply contentious nature of the order and diverse range of interests it affects.

Several officials and organizations have already questioned the legality of the executive order, stating that Trump does not have the power to undermine state legislation on AI and denouncing the decree as the result of tech industry lobbying. California , home to some of the world’s most prominent AI companies and one of the most active states legislating AI, has been a locus for pushback against the order.

“This executive order is deeply misguided, wildly corrupt, and will actually hinder innovation and weaken public trust in the long run,” California Democratic representative Sara Jacobs said in a statement. “We will explore all avenues – from the courts to Congress – to reverse this decision.”

After a draft version of Trump’s order leaked in November, state attorney general Rob Bonta said that his office would “take steps to examine the legality or potential illegality of such an executive order”, teeing up a precedent-setting duel between California and the White House.

Legislative loggerheads

In September, Newsom signed a landmark AI law that would compel developers of large, powerful AI models known as “frontier models” to provide transparency reports and promptly report safety incidents or face fines up to $1m. The governor touted the Transparency in Frontier Artificial Intelligence act as an example for how to regulate AI companies nationwide.

“Our state’s status as a global leader in technology allows us a unique opportunity to provide a blueprint for well-balanced AI policies beyond our borders,” Newsom said in an address to the California state senate. “Especially in the absence of a comprehensive federal AI policy framework and national AI safety standards.”

The September bill and more California legislation could be in Trump’s crosshairs. Thursday’s executive order calls for an AI litigation taskforce that would review state laws that do not “enhance the United States’ global AI dominance” and then pursue legal action or potentially withhold federal broadband funding. The taskforce will also consult with the administration’s AI and crypto “czar” to determine which laws to target.

Although Trump has framed the executive order as a means of streamlining legislation and removing onerously patchwork regulation, critics have alleged that the government has never provided any comprehensive federal framework for regulating AI to replace state laws. The order follows attempts to include similar AI moratoriums in bills earlier this year , which failed due to bipartisan backlash. Instead, opponents view the order as a gift to major tech companies that have cozied up to the administration over the course of the year.

“President Trump’s unlawful executive order is nothing more than a brazen effort to upend AI safety and give tech billionaires unchecked power over working people’s jobs, rights and freedoms,” AFL-CIO president, Liz Shuler, said in a statement.

Nationwide backlash

Within hours of Trump signing the order, opposition loudened among lawmakers, labor leaders, children’s advocacy groups and civil liberties organizations that decried the policy. Other California Democratic leaders said the executive order was an assault on state rights and the administration should instead focus on federal agencies and academic research to boost innovation.

“No place in America knows the promise of artificial intelligence technologies better than California,” said Alex Padilla, a senator for California. “But with today’s executive order, the Trump administration is attacking state leadership and basic safeguards in one fell swoop.”

Similarly, Adam Schiff, another California senator, emphasized: “Trump is seeking to preempt state laws that are establishing meaningful safeguards around AI and replace them with … nothing.”

Lawmakers from Colorado to Virginia to New York also took issue with the order. Don Beyer, a Virginia congressmember, called it a “terrible idea” and said that it would “create a lawless Wild West environment for AI companies”. Likewise, Alex Bores, a New York state assemblymember, called the order a “massive windfall” for AI companies, adding that “a handful of AI oligarchs bribed Donald Trump into selling out America’s future”.

Even Steve Bannon, Trump loyalist and former adviser, criticized the policy. In a text message to Axios , Bannon said Sacks had “completely misled the President on preemption”. Mike Kubzansky, the CEO of Omidyar Network, a philanthropic tech investment firm that funds AI companies, similarly said “the solution is not to preempt state and local laws” and that ignoring AI’s impact on the country “through a blanket moratorium is an abdication of what elected officials owe their constituents”.

Blowback against the order has also included child protection organizations that have long expressed concerns over the effects of AI on children. The debate over child safety has intensified this year in the wake of multiple lawsuits against AI companies over children who died by suicide after interacting with popular chatbots.

“The AI industry’s relentless race for engagement already has a body count, and, in issuing this order, the administration has made clear it is content to let it grow,” said James Steyer, the CEO of child advocacy group Common Sense Media. “Americans deserve better than tech industry handouts at the expense of their wellbeing.”

A group of bereaved parents and child advocacy organizations have also spoken out. They have been working to pass legislation to better protect children from harmful social media and AI chatbots and released a national public service announcement on Thursday opposing the AI preemption policy. Separately, Sarah Gardner, the CEO of Heat Initiative, one of the groups in the coalition, called the order “unacceptable”.

“Parents will not roll over and allow our children to remain lab rats in big tech’s deadly AI experiment that puts profits over the safety of our kids,” Gardner said. “We need strong protections at the federal and state level, not amnesty for big tech billionaires.”

Skills vs Dynamic MCP Loadouts

Lobsters
lucumr.pocoo.org
2025-12-13 14:55:24
Comments...
Original Article

written on December 13, 2025

I’ve been moving all my MCPs to skills, including the remaining one I still used: the Sentry MCP 1 . Previously I had already moved entirely away from Playwright to a Playwright skill.

In the last month or so there have been discussions about using dynamic tool loadouts to defer loading of tool definitions until later. Anthropic has also been toying around with the idea of wiring together MCP calls via code, something I have experimented with .

I want to share my updated findings on all of this, and why the deferred tool loading that Anthropic came up with does not fix my lack of love for MCP. Maybe they are useful for someone else.

What is a Tool?

Through reinforcement learning or otherwise, the agent is encouraged to emit tool calls through special tokens when it encounters a situation where a declared tool would be appropriate. For all intents and purposes, tool definitions can only appear between special tool definition tokens in a system prompt. Historically this means that you cannot emit tool definitions later in the conversation state. So your only real option is for a tool to be loaded when the conversation starts.

In agentic uses, you can of course compress your conversation state or change the tool definitions in the system message at any point. But the consequence is that you will lose the reasoning traces and also the cache. In the case of Anthropic, for instance, this will make your conversation significantly more expensive. You would basically start from scratch and pay full token rates plus cache write cost, compared to cache read.

One recent innovation from Anthropic is deferred tool loading. You still declare tools ahead of time in the system message, but they are not injected into the conversation when the initial system message is emitted. Instead they appear at a later point. The tool definitions however still have to be static for the entire conversation, as far as I know. So the tools that could exist are defined when the conversation starts. The way Anthropic discovers the tools is purely by regex search.

Contrasting with Skills

This is all quite relevant because even though MCP with deferred loading feels like it should perform better, it actually requires quite a bit of engineering on the LLM API side. The skill system gets away without any of that and, at least from my experience, still outperforms it.

Skills are really just short summaries of which skills exist and in which file the agent can learn more about them. These are proactively loaded into the context. So the agent understands in the system context (or maybe somewhere later in the context) what capabilities it has and gets a link to the manual for how to use them.

Crucially, skills do not actually load a tool definition into the context. The tools remain the same: bash and the other tools the agent already has. All it learns from the skill are tips and tricks for how to use these tools more effectively.

Because the main thing it learns is how to use other command line tools and similar utilities, the fundamentals of how to chain and coordinate them together do not actually change. The reinforcement learning that made the Claude family of models very good tool callers just helps with these newly discovered tools.

MCP as Skills?

So that obviously raises the question: if skills work so well, can I move the MCP outside of the context entirely and invoke it through the CLI in a similar way as Anthropic proposes? The answer is yes, you can, but it doesn’t work well. One option here is Peter Steinberger’s mcporter . In short, it reads the .mcp.json files and exposes the MCPs behind it as callable tools:

npx mcporter call 'linear.create_comment(issueId: "ENG-123", body: "Looks good!")'

And yes, it looks very much like a command line tool that the LLM can invoke. The problem however is that the LLM does not have any idea about what tools are available, and now you need to teach it that. So you might think: why not make some skills that teach the LLM about the MCPs? Here the issue for me comes from the fact that MCP servers have no desire to maintain API stability. They are increasingly starting to trim down tool definitions to the bare minimum to preserve tokens. This makes sense, but for the skill pattern it’s not what you want. For instance, the Sentry MCP server at one point switched the query syntax entirely to natural language. A great improvement for the agent, but my suggestions for how to use it became a hindrance and I did not discover the issue straight away.

This is in fact quite similar to Anthropic’s deferred tool loading: there is no information about the tool in the context at all. You need to create a summary. The eager loading of MCP tools we have done in the past now has ended up with an awkward compromise: the description is both too long to eagerly load it, and too short to really tell the agent how to use it. So at least from my experience, you end up maintaining these manual skill summaries for MCP tools exposed via mcporter or similar.

Path Of Least Resistance

This leads me to my current conclusion: I tend to go with what is easiest, which is to ask the agent to write its own tools as a skill. Not only does it not take all that long, but the biggest benefit is that the tool is largely under my control. Whenever it breaks or needs some other functionality, I ask the agent to adjust it. The Sentry MCP is a great example. I think it’s probably one of the better designed MCPs out there, but I don’t use it anymore. In part because when I load it into the context right away I lose around 8k tokens out of the box, and I could not get it to work via mcporter. On the other hand, I have Claude maintain a skill for me. And yes, that skill is probably quite buggy and needs to be updated, but because the agent maintains it, it works out better.

It’s quite likely that all of this will change, but at the moment manually maintained skills and agents writing their own tools have become my preferred way. I suspect that dynamic tool loading with MCP will become a thing, but it will probably require quite some protocol changes to bring in skill-like summaries and built-in manuals for the tools. I also suspect that MCP would greatly benefit from protocol stability. The fact that MCP servers keep changing their tool descriptions at will does not work well with materialized calls and external tool descriptions in READMEs and skill files.


The Brand-New Pentagon Press Corps Is Gaga for Hegseth

Intercept
theintercept.com
2025-12-13 14:23:34
The Department of War has cracked the code on making the perfect press corps by welcoming in only its biggest cheerleaders. The post The Brand-New Pentagon Press Corps Is Gaga for Hegseth appeared first on The Intercept....
Original Article
Pentagon press secretary Kingsley Wilson conducts a press briefing at the Pentagon, Washington, D.C., on Dec. 2, 2025. Photo: U.S. Navy Petty Officer 1st Class Eric Brann/Office of the Secretary of War

The welcome was so warm it could’ve been the first day of school for a new class of kindergarteners, and with the so-called reporters’ level of skepticism for the administration, they might as well have been.

“I would also like to take a moment today to welcome all of you here to the Pentagon briefing room as official new members of the Pentagon press corps. We’re glad to have you,” Pentagon press secretary Kingsley Wilson said in her December 2 briefing. “This is the beginning of a new era.”

Wilson also said that “legacy media chose to self-deport from this building,” a cute way of noting that dozens of news organizations — among them the New York Times, the Washington Post, the major broadcast news outlets, and even Fox News and Newsmax — gave up their press passes rather than sign on to the administration’s blatantly anti-First Amendment set of rules for reporting on Pete Hegseth’s Department of War. Among those rules was a provision allowing journalists to be expelled for reporting on anything, whether classified or unclassified, not approved for official release .

To test-drive the absurdity of this new “press corps,” Wilson granted the second question of the “new era” to disgraced former congressman Matt Gaetz, once Donald Trump’s pick for attorney general and now a host on the feverishly pro-Trump One America News Network. Gaetz, who was wearing a rather dated performance fleece jacket embroidered with “ Representative Matt Gaetz ,” asked two questions about regime change in Venezuela, a policy the administration is actively fomenting as it carries out strikes on boats it claims are carrying “narcoterrorists” smuggling drugs in the Caribbean Sea and Pacific Ocean.

The substance of the questions mattered less than the opening they provided for Wilson to parrot the administration’s line on these strikes: “Every single person who we have hit thus far who is in a drug boat carrying narcotics to the United States is a narcoterrorist. Our intelligence has confirmed that.” Somewhat puzzlingly, Wilson also said the Department of War is “a planning organization” with “a contingency plan for everything.”

There was no further follow-up from the member of the “press” whom the House Ethics Committee found engaged in sexual activity with a 17-year-old girl in 2017. (Gaetz has denied wrongdoing.)

Since the briefing took place just days after the killing of a member of the National Guard blocks from the White House, multiple members of the Pentagon’s new Fourth Estate asked weighty questions in the wake of the tragedy, including whether the service member would receive a medal for distinguished service or a military burial at Arlington National Cemetery. (Both are TBD.)

It wasn’t all softball questions, but every assembled member served their purpose by running interference for the administration in general and Hegseth in particular. One interlocutor, following up on a question about selling weapons to Qatar despite its ties to the Muslim Brotherhood from the indefatigable Laura Loomer, asked without a hint of irony whether the U.S. would be “reassessing our relationship with Israel” over Israeli media reports that the country’s government “funded Hamas.”

Without missing a beat, the War Department flak replied that that would be a “better question for the State Department” and moved right along.

Another member of the press corps asked whether any actual drugs have been recovered from these alleged drug-smuggling boats that the U.S. military has been drone striking — twice, in one case — a question well worth asking, and one that’s almost certainly being posed by the deposed mainstream journalists now reporting on the Pentagon from outside its walls. Wilson, standing in for the U.S. government, responded by essentially asking that we trust her, trust the intelligence, and trust that Hegseth’s War Department is telling the truth. The matter was, once again, closed.

Along with Loomer, a noted Trump sycophant and conspiracy theorist, I spotted “Pizzagate” promoter Jack Posobiec, who asked about Democratic Sen. Mark Kelly, and Project Veritas founder James O’Keefe in the assembled crowd. In a video of the briefing, an open laptop in one member of the “new” media’s lap was emblazoned with stickers that read “feminine, not feminist” and “homemaking is hot.” A statement from the department trumpeting news of the new corps features an interviewer in front of a backdrop emblazoned with logos for “LindellTV,” the media venture by MyPillow founder Mike Lindell — who is now running for governor of Minnesota. (LindellTV’s IMDB page describes the programming as: “Aging man with many internet connectivity issues, screaming into his cell phone, has discussions with a tired looking news anchor,” although it’s not clear whether that’s the official network tagline.)

The Pentagon press corps has always been a gilded cage — a perch for big-name reporters who want a plush-sounding posting without too much hassle. The most essential, critical reporting never comes from briefings, where reporters sit with their mouths open like baby birds looking up for a news morsel from their press secretary mother. But like with so many things under Trump, by giving up on any semblance of respecting norms, he’s revealed how neutered the institution was to begin with. Critical reporting on the War Department has, and will, continue, even without reporters in the physical building. It’s worth asking if they should ever go back.

Quoting Obie Fernandez

Simon Willison
simonwillison.net
2025-12-13 14:01:31
If the part of programming you enjoy most is the physical act of writing code, then agents will feel beside the point. You’re already where you want to be, even just with some Copilot or Cursor-style intelligent code auto completion, which makes you faster while still leaving you fully in the driver...
Original Article

If the part of programming you enjoy most is the physical act of writing code, then agents will feel beside the point. You’re already where you want to be, even just with some Copilot or Cursor-style intelligent code auto completion, which makes you faster while still leaving you fully in the driver’s seat about the code that gets written.

But if the part you care about is the decision-making around the code, agents feel like they clear space. They take care of the mechanical expression and leave you with judgment, tradeoffs, and intent. Because truly, for someone at my experience level, that is my core value offering anyway. When I spend time actually typing code these days with my own fingers, it feels like a waste of my time.

Obie Fernandez , What happens when the coding becomes the least interesting part of the work

Earth-Like Planets Are More Common Than We Thought, Study Says

404 Media
www.404media.co
2025-12-13 14:00:24
Normally, it’s bad news to be next to an exploding star. But ancient supernovae may have aided the formation of our home world—and perhaps Earthlike planets elsewhere....
Original Article

Welcome back to the Abstract! These are the studies this week that got hosed with star spray, mounted a failed invasion, declined to comment, and achieved previously unknown levels of adorability.

First, a study about how the solar system wasn’t destroyed 4.5 billion years ago (phew!). Then: a human touch on an ancient boat, the duality of posters and lurkers, and an important update on toadlets.

As always, for more of my work, check out my book First Contact: The Story of Our Obsession with Aliens or subscribe to my personal newsletter the BeX Files .

Sink into a warm cosmic-ray bath

Sawada, Ryo et al. “Cosmic-ray bath in a past supernova gives birth to Earth-like planets.” Science Advances.

Earth was cosmically conceived in part by a massive shockwave from a nearby supernova, which seeded our home world and neighboring rocky planets with telltale radioactive signatures, according to a new study.

The solar system’s rocky planets contain short-lived radionuclides (SLRs), which are ancient elements that were likely barfed out from exploding stars. For this reason, scientists have long suspected that stars must’ve detonated next to the gassy disk that gave rise to the solar system. The heat generated from these radioactive elements helped the building blocks of the rocky planets—Mercury, Venus, Earth, and Mars—melt together so they could become whole worlds, which means we owe our existence to these ancient supernovas.

Now, a team has developed a new model to explain how the primordial pyrotechnics didn’t just blow up the nascent solar system. The results suggest that rocky Earth-like worlds may be common in the universe, with potential implications for the search for extraterrestrial life.

“A key question in astronomy is how ubiquitous Earth-like rocky planets are,” said researchers led by Ryo Sawada of the University of Tokyo. “The formation of terrestrial planets in our Solar System was strongly influenced by the radioactive decay heat of SLRs, particularly aluminum-26, likely delivered from nearby supernovae.”

“However, the supernova injection scenario faces an unresolved problem in that existing supernova models could not reproduce both the relative and absolute abundances of SLRs without disrupting the protosolar disk,” an event that “would likely prevent the Solar System formation altogether,” the team added.

In other words, it’s hard to explain how the solar system got its high abundance of SLRs without killing it in the cradle. Sawada and his colleagues propose a solution that involves at least one star exploding about three light years from the disk, sparking a shockwave that created a cosmic-ray “bath.”

Schematic picture of the system assumed in this study. Image: Sawada et al., Sci. Adv. 11, eadx7892

In this “immersion mechanism,” energetic cosmic rays trapped in the bath triggered SLR-producing reactions directly within the disk. This contrasts with the hypothesis that the SLRs were largely injected and then mixed up in the disk through some unknown process. This new solution can account both for the high abundance of certain SLRs, like aluminum-26, and the fact that the solar system was not destroyed, as evidenced by its apparent continued existence.

“Our results suggest that Earth-like, water-poor rocky planets may be more prevalent in the Galaxy than previously thought,” the team said, noting that many disks are rocked by similar supernova shockwaves. “This challenges previous interpretations that classified the Solar System as an outlier with a particularly high [aluminum-26] abundance.”

In addition to offering a new hypothesis for an old astronomical problem, the study gets bonus points for its extremely poetic title: “Cosmic-ray bath in a past supernova gives birth to Earth-like planets.” If you say this enchanted phrase three times, somewhere an Earth-like world will be born.

In other news…

The biometrics of a Baltic boatsman

Fauvelle, Mikael et al. “New investigations of the Hjortspring boat: Dating and analysis of the cordage and caulking materials used in a pre-Roman iron age plank boat.” PLOS One

Stars aren’t the only things leaving their dirty fingerprints in unexpected places this week. Archeologists working on the mysterious Hjortspring boat, a 2,400-year-old Scandinavian vessel, discovered a tantalizing partial human fingerprint in its caulking, providing “a direct link to the ancient seafarers who used this boat,” according to the study.

Photo of caulking fragment showing fingerprint on the left and high-resolution x-ray tomography scan of fingerprint region on the right. Image: Photography by Erik Johansson, 3D model by Sahel Ganji

The ridges of the fingerprint “fall within average distributions for both adult male and females as well as for juvenile adults, making it difficult to say much about the individual who produced the print,” said researchers led by Mikael Fauvelle of Lund University. “The most likely interpretation, however, is that it was made during repairs by one of the crew members on the boat itself, providing a direct link to the seafarers of the ancient vessel.”

Regardless of this person’s identity, their voyage didn’t end well. Researchers think the crew of the Hjortspring boat probably sailed from the eastern Baltic Sea to attack the Danish island of Als, where they were defeated. “The victors [deposited] the weapons of their vanquished foes together with one of their boats into the bog,” where they remained for millennia until they were rediscovered in the 1880s, the team said.

It’s a timeless reminder for would-be invaders: Don’t get caulky.

Long-time lurker, first-time poster

Oswald, Lisa et al. “Disentangling participation in online political discussions with a collective field experiment.” Science Advances.

At last, scientists have investigated the most elusive online demographic: the humble lurker. A team recruited 520 Redditors in the U.S. to participate in small subreddits focused on a variety of political topics during the summer of 2024. The aim was to probe why some people became prolific “power-users” that post with voluminous confidence, while others remained wallflowers.

“Online political discussions are often dominated by a small group of active users, while most remain silent,” said researchers led by Lisa Oswald of the Max Planck Institute for Human Development. “This visibility gap can distort perceptions of public opinion and fuel polarization.”

The team found that “lurking (posting nothing) was most common among users who perceived discussions as toxic, disrespectful, or unconstructive.” Lurkers were offered small payments to post in the experiment, which succeeded in motivating some to contribute to discussions. As a result, the study concluded that “future interventions may be able to make online political discussions more representative by offering more positive social rewards for lurkers to post.”

At last, an opportunity to unionize the lurkers of the world. Solidarity (in silence) forever.

It’s the great pumpkin toadlet, Charlie Brown

Bornschein, Marcos R. et al. “A new species of Brachycephalus (Anura: Brachycephalidae) from Serra do Quiriri, northeastern Santa Catarina state, southern Brazil, with a review of the diagnosis among species of the B. pernix group and proposed conservation measures.” PLOS One.

We will close, as we have before , with an impossibly cute toadlet. Scientists have discovered this new species of “pumpkin toadlet” in the “cloud forests” of Brazil, a sentence so twee that it’s practically its own fairy tale. The tiny toad Brachycephalus lulai, pictured below on a pencil tip, belongs to a family of “flea toads” that are among the smallest vertebrates on Earth.

Basically it is very smol: Brachycephalus lulai is a tiny pumpkin toadlet measuring less than 14 mm in length. Photo: Luiz Fernando Ribeiro, CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

“Our team sought to better document the individual variation of all Brachycephalus species in southern Brazil, looking for them in the field over the past seven years,” said researchers led by Marcos R. Bornschein of São Paulo State University. “As a result of this work, we discovered and herein described a population collected on the eastern slope of Serra do Quiriri as a new species.”

The team also reported that the toads are actively colonizing newly formed cloud forests, which are high-altitude woods shrouded in mist. The researchers propose making these unique habitats into refuges for the adorable anurans.

Thanks for reading! See you next week.

The state of the kernel Rust experiment

Lobsters
lwn.net
2025-12-13 13:29:17
Comments...
Original Article


The ability to write kernel code in Rust was explicitly added as an experiment — if things did not go well, Rust would be removed again. At the 2025 Maintainers Summit, a session was held to evaluate the state of that experiment, and to decide whether the time had come to declare the result to be a success. The (arguably unsurprising) conclusion was that the experiment is indeed a success, but there were some interesting points made along the way.

Headlines

Miguel Ojeda, who led the session, started with some headlines. The Nova driver for NVIDIA GPUs is coming, with pieces already merged into the mainline, and the Android binder driver was merged for 6.18. Even bigger news, he said, is that Android 16 systems running the 6.12 kernel are shipping with the Rust-written ashmem module. So there are millions of real devices running kernels with Rust code now.

Meanwhile, the Debian project has, at last, enabled Rust in its kernel builds; that will show up in the upcoming "forky" release. The amount of Rust code in the kernel is " exploding ", having grown by a factor of five over the last year. There has been an increase in the amount of cooperation between kernel developers and Rust language developers, giving the kernel project significant influence over the development of the language itself. The Rust community, he said, is committed to helping the kernel project.

The rust_codegen_gcc effort, which grafts the GCC code generator onto the rustc compiler, is progressing. Meanwhile the fully GCC-based gccrs project is making good progress. Gccrs is now able to compile the kernel's Rust code (though, evidently, compiling it to correct runnable code is still being worked on). The gccrs developers see building the kernel as one of their top priorities; Ojeda said to expect some interesting news from that project next year.

With regard to Rust language versions, the current plan is to ensure that the kernel can always be built with the version of Rust that ships in the Debian stable release. The kernel's minimum version would be increased 6-12 months after the corresponding Debian release. The kernel currently specifies a minimum of Rust 1.78, while the current version is (as of the session) 1.92. Debian is shipping 1.85, so Ojeda suggested that the kernel move to that version, which would enable the removal of a number of workarounds.

Jiri Kosina asked how often the minimum language version would be increased; Ojeda repeated that it would happen after every Debian stable release, though that could eventually change to every other Debian release. It is mostly a matter of what developers need, he said. Linus Torvalds said that he would be happy to increase the minimum version relatively aggressively as long as it doesn't result in developers being shut out. Distributors are updating Rust more aggressively than they have traditionally updated GCC, so requiring a newer version should be less of a problem.

Arnd Bergmann said that the kernel could have made GCC 8 the minimum supported version a year earlier than it did, except that SUSE's SLES was running behind. Kosina answered that SUSE is getting better and shipping newer versions of the compiler now. Dave Airlie worried that problems could appear once the enterprise distributors start enabling Rust; they could lock in an ancient version for a long time. Thomas Gleixner noted, though, that even Debian is now shipping GCC 14; the situation in general has gotten better.

Still experimental?

Given all this news, Ojeda asked, is it time to reconsider the "experimental" tag? He has been trying to be conservative about asking for that change, but said that, with Android shipping Rust code, the time has come. Airlie suggested making the announcement on April 1 and saying that the experiment had failed. More seriously, he said, removing the "experimental" tag would help people argue for more resources to be directed toward Rust in their companies.

Bergmann agreed with declaring the experiment over, worrying only that Rust still " doesn't work on architectures that nobody uses ". So he thought that Rust code needed to be limited to the well-supported architectures for now. Ojeda said that there is currently good support for x86, Arm, Loongarch, RISC-V, and user-mode Linux, so the main architectures are in good shape. Bergmann asked about PowerPC support; Ojeda answered that the PowerPC developers were among the first to send a pull request adding Rust support for their architecture.

Bergmann persisted, asking about s390 support; Ojeda said that he has looked into it and concluded that it should work, but he doesn't know the current status. Airlie said that IBM would have to solve that problem, and that it will happen. Greg Kroah-Hartman pointed out the Rust upstream supports that architecture. Bergmann asked if problems with big-endian systems were expected; Kroah-Hartman said that some drivers were simply unlikely to run properly on those systems.

With regard to adding core-kernel dependencies on Rust code, Airlie said that it shouldn't happen for another year or two. Kroah-Hartman said that he had worried about interactions between the core kernel and Rust drivers, but had seen far fewer than he had expected. Drivers in Rust, he said, are indeed proving to be far safer than those written in C. Torvalds said that some people are starting to push for CVE numbers to be assigned to Rust code, proving that it is definitely not experimental; Kroah-Hartman said that no such CVE has yet been issued.

The DRM (graphics) subsystem has been an early adopter of the Rust language. It was still perhaps surprising, though, when Airlie (the DRM maintainer) said that the subsystem is only " about a year away " from disallowing new drivers written in C and requiring the use of Rust.

Ojeda returned to his initial question: can the "experimental" status be ended? Torvalds said that, after nearly five years, the time had come. Kroah-Hartman cited the increased support from compiler developers as a strong reason to declare victory. Steve Rostedt asked whether function tracing works; Alice Ryhl was quick to say that it does indeed work, though “symbol demangling would be nice”.

Ojeda concluded that the ability to work in Rust has succeeded in bringing in new developers and new maintainers, which had been one of the original goals of the project. It is also inspiring people to do documentation work. There are a lot of people wanting to review Rust code, he said; he is putting together a list of more experienced developers who can help bring the new folks up to speed.

The session ended with Dan Williams saying that he could not imagine a better person than Ojeda to have led a project like this and offered his congratulations; the room responded with strong applause.




Trying manual memory management in Go

Lobsters
www.youtube.com
2025-12-13 13:21:50
Comments...

AI is bringing old nuclear plants out of retirement

Hacker News
www.wbur.org
2025-12-13 13:08:36
Comments...
Original Article

The Palisades Nuclear Generating Station is nestled between sand dunes on the eastern shore of Lake Michigan. It shut down for financial reasons in 2022. Three years later, it’s on the cusp of reopening, with hundreds of workers streaming through its security barriers every day.

Palisades is on track to restart in early 2026. When it does, it will be the first nuclear plant in the United States to generate electricity again after being decommissioned . Nick Culp of Holtec, the company that owns the plant, said its revival is a response to a surge in demand for electricity.

“We have seen [Michigan]’s baseload generation go offline at a rapid rate as they’ve moved away from fossil generation,” Culp said. “How do you backfill that when you see demand on the horizon like [artificial intelligence], like data storage, like keeping the lights on at home, and new manufacturing?”

Palisades Nuclear Generating Station is on track to restart in 2026. (Chris Bentley/Here & Now)

Nuclear is part of the answer to that question, Culp said, and the government agrees. Michigan gave $300 million to the restart — part of its goal to have 100% carbon-free electricity by 2040 — and the federal government gave the project a loan of more than $1.5 billion.

That money is part of the Trump administration’s investment in what it’s calling a “nuclear energy renaissance.” In May, the White House released a plan to quadruple American nuclear power by 2050, following a similar pledge from the Biden administration.

Meeting that goal would require dozens of new reactors. But whether they’re traditional power plants or new designs , nuclear reactors are expensive and slow to build. Facing a crunch between climate goals and rising electricity demand, Michigan, Pennsylvania, and Iowa are reopening plants that closed just a few years ago.

Powering up

When the Palisades plant in Michigan closed in 2022, Jim Byrd said he left his office of more than two decades “with a heavy heart.”

He was working at a nuclear plant in Mississippi last year when he heard about the plan to reboot Palisades. Then he got the call he had been waiting for, asking him to come back.

“Palisades is my home. These people are my family,” Byrd said. Since his return, he’s been training new employees in an exact replica of the reactor control room, right down to its 1960s pink-and-green color scheme.

While the plant was in decent shape, recommissioning still required repairing equipment and overcoming mountains of paperwork.

“We are creating a roadmap on how to do this, and the whole industry is watching,” said Byrd. “I had existing licensed operators that had a license from the Nuclear Regulatory Commission when we shut down, so we had to work on getting those back.”

All that work is worth it, he said, to get the plant back up and running.

“What we're doing here is exciting,” said Byrd. “Having a reliable power source that keeps your electricity costs low, everybody should be rooting for that.”

Paul Rhodes (left) is an operations shift manager at the Palisades Nuclear Generating Station, and Jim Byrd (right) is the assistant operations manager. (Chris Bentley/Here & Now)

The restart also attracted employees from elsewhere in the industry. The plant’s new chief nuclear officer, Rich Burroni, came from New York’s Indian Point Energy Center, which closed in 2021.

“The trend five years ago was a lot of work on decommissioning,” he said, “and now that’s all changed.”

More change may be coming for Palisades. The Department of Energy said this month it will give Holtec up to $400 million in federal funding to build small modular reactors in Michigan. That technology could help speed up the deployment of new nuclear power in the future, according to many in the industry, but so far has not been commercially viable.

For now, restarting a plant costs less than a third of what it would take to build a new one, said Culp of Holtec.

“When you factor in how long it takes to construct a new nuclear power plant, especially here in the United States, and the amount of money that goes into it,” he said, “it’s a pretty good value proposition.”

‘Taken for granted’

Many of Palisades’ employees live within 10 miles of the plant, which means they could be exposed to a radioactive plume in an emergency.

That zone also includes the town of Covert, Michigan. Township supervisor Daywi Cook’s father helped build the plant in the 1960s.

Covert, Michigan, township supervisor Daywi Cook’s father helped build the plant in the 1960s. (Chris Bentley/Here & Now)

“I grew up with the sirens being tested. I think it was every last Saturday of the month,” Cook said. “It was just a normal thing.”

Having friends and family members who worked at the plant helped demystify nuclear power, she said, and she came to see the plant as part of the community.

At one point, taxes from the plant made up 40% of the township’s revenue. Now, as Covert’s township supervisor, Cook said she’s glad the plant is reopening.

“Having that stability and having that employment available for folks who live here is something that I think was taken for granted for a very long time,” she said. “I think what's important is that we educate ourselves as residents near the plant and that Holtec continues to be a good neighbor in being transparent with the community.”

Zach Morris, head of the economic development group Market One, said Palisades is an important piece of the local economy.

“Southwest Michigan is a beautiful area. It's just a wonderful community of small towns. I call it Americana,” Morris said. “Americana needs electricity. So the good news is we have a really reliable source of power that is clean. It pays its employees well. So we're excited about being able to keep that online.”

Not everyone is on board with the plant’s reopening. Environmental groups have sued to stop it, and protesters have raised concerns about the long-term storage of spent fuel next to the Great Lakes.

Three Mile Island

While nuclear power does have a record of safety, many Americans remember the 1979 disaster at central Pennsylvania’s Three Mile Island. One of the two reactors on the island had a partial meltdown and released radioactive gases into the environment. There were no deaths, and the Nuclear Regulatory Commission said the accident “had no detectable health effects on plant workers or the public.”

That left the plant with only one working reactor, which produced power until 2019, when it shut down for financial reasons. Today, that reactor, like Palisades in western Michigan, is in the process of coming back online.

“When you walk through the plant now, all the equipment is still there, but it's deathly quiet. You don't hear the hum of the motors, the steam going through the lines,” said Craig Smith, who is in charge of bringing back the plant at Three Mile Island, renamed the Crane Clean Energy Center. “It's an eerie kind of feeling when you walk through the plant.”

The nuclear plant at Three Mile Island was renamed the Crane Clean Energy Center. (Chris Bentley/Here & Now)

That eerie feeling may soon be gone. A red LCD clock in Smith’s office counts down the hours until the plant’s reopening in late 2027, which is backed by a billion-dollar loan from the Trump Administration.

The recommissioned reactor on Three Mile Island will pump 835 megawatts into the regional grid, but all that electricity is spoken for by Microsoft, which agreed to buy an equivalent amount of power from the grid for the next 20 years to feed its data centers.

“The dynamics of the energy economy have changed significantly, mainly because of artificial intelligence,” Smith said.

Nuclear is well-suited to the moment, in his view, because of its consistency.

“Hottest days of the year, coldest days of the year, freezing weather, the plant continues to operate,” Smith said. “As far as a reliable power source, you can’t beat it.”

Smith was in high school in nearby Hershey in 1979 and remembers the evacuation after the disaster at Three Mile Island. That failed to dissuade him from going into a career in nuclear power, and he said today, the industry is safer because of regulations put in place after the partial meltdown.

“People at the plant here take that personally,” he said. “The standards of the industry are greatly improved, and we've made significant improvements to the design of the plants and how we operate them.”

‘No viable solution’

Gene Stilp has a different take. He’s one of many people in the area who say the official story of the 1979 disaster failed to account for long-term health problems they believe are related to the accident.

Stilp has been fighting nuclear power on Three Mile Island since before the plant opened, and said the recommissioning is an unnecessary risk to public safety.

“We’re sticking up for the people who live here rather than the shareholders of Microsoft and Constellation,” said Stilp, who often appears in public wearing a blazer with “NO TMI RESTART” sewn on the back.

“What they’re proposing for evacuation does not work, and so that’s my line in the sand,” he said, pointing out the 10-mile Emergency Planning Zone includes a major hospital complex and several schools. “The population increases in Central Pennsylvania, the realization that there are so many people at risk here, the best you can do is take away that risk.”

Another longtime opponent of the power plant, Eric Epstein of Three Mile Island Alert, said the country is making mistakes in its rush to power data centers. He said the economics might have changed for nuclear power, but the risks have not.

“There was no public discussion about whether or not we’re going to restart Three Mile Island,” said Epstein. “You had this psychic tear in the fabric of the community that can't be papered over. You can put all the green paint you want on nuclear power, but there has been no viable solution to isolate nuclear waste.”

Constellation said the spent fuel on site has been safely stored on the island for decades, in fortified containers required by the government to withstand natural disasters, and that all the waste created in 40 years fits in an area about the size of two tennis courts.

Dauphin County Commissioner Justin Douglas said he’s listening to local concerns about the plant’s reopening.

“I personally am very interested in transparency and accountability for this in the sense of ensuring that it's as safe as it possibly can be, that we're tracking the cost and ensuring that the taxpayers aren't carrying any of the burden, that we have a good plan for the waste management and that ultimately the community impact is positive,” said Douglas. “We plan for the worst, and we hope for the best.”

‘A slam dunk’

Meeting the country’s rising demand for electricity will take a lot more than reviving a few recently decommissioned plants.

“It is a brilliant idea. It's sort of a slam dunk. The downside is that there are not many reactors out there that are realistically able to restart,” said Jacopo Buongiorno, professor of nuclear science and engineering at Massachusetts Institute of Technology. “You’re looking at a little bit less than three gigawatts of electricity, out of 50 that apparently are required for data centers and AI.”

There are also technical tweaks called uprates that can squeeze more power out of existing plants, which could help blunt the immediate electricity crunch.

“You probably have potential for another five to eight gigawatts across the whole fleet. So you add that up to the two or three that we get from the restarts, you're looking at 10 [gigawatts],” Buongiorno said, or only about a fifth of the total AI power demand expected by 2030.

“If that demand continues in the 2030s, then you can make the investment now to build new reactors,” he said, “and then nuclear can actually capture a lot more than 20%.”

Please stop using middleware to protect your routes (2024)

Lobsters
pilcrowonpaper.com
2025-12-13 12:55:25
Comments...
Original Article

When talking about auth, there seems to be a certain group that’s adamant about using middleware to handle authorization. Middleware here refers to functions that run before every request.

function isProtected(path: string) {
	return path !== "/login" && path !== "/signup";
}

app.middleware((req, res, next) => {
	if (!isProtected(req.path)) {
		return next();
	}
	const user = validateRequest(req);
	if (user) {
		return next();
	}
	res.writeHeader(401);
	res.write("Unauthorized");
});

app.get("/", (_, res) => {
	res.writeHeader(200);
	res.write("Secret message");
});

I do not like this approach at all.

I’m just confused at this point since you’re just re-implementing routing logic within middleware, an API provided by your routing library. And what do you do when you need to protect routes based on user roles?

const adminOnlyRoutes = ["/admin/*"];

app.middleware((req, res, next) => {
	if (!isProtected(req.path)) {
		return next();
	}
	const user = validateRequest(req);
	if (user) {
		let requiresAdminRole = false;
		for (const route of adminOnlyRoutes) {
			// stop at the first matching admin-only route
			if (matchRoute(route, req.path)) {
				requiresAdminRole = true;
				break;
			}
		}
		if (requiresAdminRole && !user.admin) {
			res.writeHeader(401);
			return;
		}
		return next();
	}
	res.writeHeader(401);
});

While route-level middleware (middleware that only applies to certain routes) may help in this simple example, routes in real-world applications aren’t often organized by their required permissions. What happens if you have multiple roles? What if you need to implement different rate-limiting on each route based on user roles? How about API access token permissions and scopes?

Abstractions aren’t the problem here. The issue is that middleware is the wrong abstraction. It’s just the most obvious solution that seems to make sense in a smaller scale.

But, we first have to answer: Do we need to abstract in the first place?

This goes beyond this rant, but I feel that, at least in the JavaScript ecosystem, people seem to go too far on abstractions and “simplicity.” It isn’t surprising given how loosey-goosey powerful JS can be. Auth, which includes both authentication and authorization, seems to be particularly vulnerable to this since people are overly scared of it. But auth is not an independent system from your application. It’s an integral part of it that affects and is affected by everything else. This makes it extra-hard to abstract without introducing unwanted complexity, since any abstraction that’s useful requires some level of flexibility.

Getting back to the middleware discussion, why not just add the auth check on each route?

app.get("/", (req, res) => {
	// ...
	if (!user.admin) {
		res.writeHeader(401);
		return;
	}
	// ...
});

“B, b… but DRY! Abstractions!”

If you’re too lazy to write some basic if checks, maybe that’s a you problem. But on a serious note, if you need to abstract, use wrapper functions. This is a much better approach than middleware since you don’t have to worry about routing. I also like that all the logic is defined in a single location instead of scattered across your project.

app.get(
	"/",
	protectedRoute((req, res, user) => {
		// ...
	})
);
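
For reference, such a wrapper doesn’t need to be much more than the if check moved into a higher-order function. The sketch below is one possible shape, reusing the validateRequest helper from the earlier snippets; the Request, Response and User types are placeholders, not part of any particular framework.

function protectedRoute(
	handler: (req: Request, res: Response, user: User) => void
) {
	// Wraps a route handler: authenticate first, then call the real handler.
	return (req: Request, res: Response) => {
		const user = validateRequest(req);
		if (!user) {
			res.writeHeader(401);
			res.write("Unauthorized");
			return;
		}
		handler(req, res, user);
	};
}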

If you deal with multiple permission levels (e.g. roles, scopes), you can just create a helper function for checking them. Again, abstractions themselves aren’t bad. You just need to implement them at the right level.

app.get("/", (req, res) => {
	// ...
	if (!hasPermission(user.role, ["moderator", "admin"])) {
		res.writeHeader(403);
		return;
	}
});
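
A hasPermission helper like that can stay tiny. Here is a minimal sketch, assuming roles are plain strings and the signature mirrors the call above:

function hasPermission(role: string, allowed: string[]): boolean {
	// True if the user's role is one of the roles allowed for this route.
	return allowed.includes(role);
}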

This doesn’t mean middleware is useless. It works for global-level stuff like CSRF protection and providing data to each route. Actually, authenticating requests and passing the user object to each route is a great use of middleware (while still letting each route handle its own authorization). But even then, you should probably replace it once you need to deal with exceptions and multiple patterns.

One common response I get to this opinion is that using middleware prevents developers from accidentally forgetting to add an auth check. That’s why you test your code. You should be testing your auth logic regardless of your implementation. Given that, adding auth checks to each route is less bug-prone and easier to debug than forcing an abstraction with middleware.

We built another object storage

Hacker News
fractalbits.com
2025-12-13 12:29:19
Comments...
Original Article

A Crowded Market, But An Unsolved Problem

Object storage is the backbone of modern data infrastructure. AWS S3, Google Cloud Storage, MinIO, Ceph, newer players like Tigris Data—the market is saturated. So why build another one?

Because the fundamental assumptions behind these systems are shifting. High performance is no longer optional—but having high performance available isn’t the same as being able to afford using it.

Beyond “Cold Storage”: Why Performance Matters Now

Traditional object storage had a clear priority order: cost first, performance later. This worked fine for archiving backups and storing large, rarely accessed files.

But today, object storage is increasingly the primary data layer for AI, analytics, and cloud-native applications. Latency directly translates to compute costs—stalled GPUs waiting on I/O are expensive GPUs doing nothing.

High-performance object storage exists now. S3 Express One Zone, for example, delivers single-digit millisecond latency. But there’s a catch: the per-request pricing makes it prohibitively expensive to actually use at high IOPS. As one analysis put it, it’s “the right technology, at the right time with the wrong price” [1]. You have the performance on paper, but you can’t afford to run your workload at full speed. That’s the high-performance trap.

The New Challenge: AI and Analytical Workloads

Modern workloads, especially in AI, impose demands that strain traditional designs:

Small Objects at Scale : AI training datasets often consist of millions of small files (images, text snippets, feature vectors). A study of typical AI training workloads found over 60% of objects are 512KB or smaller [2]. This shifts the bottleneck from bandwidth to metadata performance.

Latency Sensitivity : Training loops and inference pipelines are bottlenecked by I/O. When fetching thousands of small objects per batch, per-object latency compounds quickly, stalling expensive GPUs.

The Need for Directories : S3’s flat namespace is a mismatch for many workflows. Data scientists expect atomic renames and efficient directory listings—operations that are either slow or missing in classic object stores.

“Why Not Just Use a Filesystem?”

A reasonable question: if you want directories and atomic rename, why not just use a filesystem like AWS EFS? Object stores and filesystems are different concepts—why blur the line?

The answer is that the line is already blurring, driven by real workload demands. AWS themselves recognized this when they introduced S3 Express One Zone with explicit “directory bucket” semantics and atomic rename support (currently single-object) [3]. Google Cloud has made similar moves toward hierarchical namespace support [4]. The industry is converging on this because the clean separation between “object storage for scale” and “filesystem for semantics” doesn’t match how modern applications actually work.

We’re not trying to build a POSIX filesystem. But the subset of filesystem semantics that matter for data workflows—efficient directory listings, atomic rename for safe data handoffs—these belong in object storage. The alternative is forcing every application to build fragile workarounds on top of a flat namespace.

Where Current Solutions Hit a Wall

Existing systems struggle with these patterns in predictable ways:

The High-Performance Trap : High-performance tiers like S3 Express One Zone solve the latency problem, but the per-request cost means you can’t actually use that performance at scale. At 10K PUT/s, you’re looking at ~$29K/month in request fees alone. The performance is there; the economics aren’t.

The Small Object Tax : With cloud object storage, you pay per request. Storing billions of 4KB objects means your API request costs can exceed your storage costs. The more objects you have, the worse it gets.

Missing Directory Semantics : The lack of atomic rename forces complex workarounds in applications, limiting what you can build directly on object storage. Most systems with rename support rely on inode-like structures that struggle with scalability and performance—adding to the per-IOPS cost burden.

Introducing FractalBits

We built FractalBits to break out of the high-performance trap: delivering performance you can actually afford to use at scale. In our benchmarks, we achieved nearly 1M GET/s on 4KB objects with a cluster totaling 64 cores across all data and metadata nodes.

Our focus:

  1. High IOPS at a cost that makes sense—so you can actually run your workload at full speed.
  2. Native directory semantics, including atomic rename.
  3. Strong consistency—no eventual consistency surprises.

The Cost Difference

Here’s what the gap looks like for a small-object intensive workload (4KB objects, 10K IOPS):

Metric                        S3 Express One Zone    FractalBits      Reduction
Monthly Cost for 10K PUT/s    ~$29,290               ~$166            ~150×
Monthly Cost for 10K GET/s    ~$778                  ~$42             ~15×
Storage (1 TB Per Month)      ~$110                  $0 (included)

S3 costs based on public pricing ($0.00113/1K PUTs, $0.00003/1K GETs, $0.11/GB/Month). FractalBits estimated using 1-year reserved instance pricing for required compute (e.g., i8g.2xlarge for data, m7g.4xlarge for metadata). Your savings will vary based on workload, but the magnitude is indicative.
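
As a rough sanity check on the S3 PUT figure above, here is our own back-of-the-envelope arithmetic, a sketch that assumes the quoted price and a 30-day month:

const putsPerSecond = 10_000;
const putsPerMonth = putsPerSecond * 60 * 60 * 24 * 30;         // ~25.9 billion requests
const pricePer1kPuts = 0.00113;                                 // quoted $/1K PUTs
const monthlyPutCost = (putsPerMonth / 1_000) * pricePer1kPuts; // ~$29,290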

At our core is a metadata engine built on an on-disk radix tree, optimized for path-like keys.

Most object stores use LSM-trees (good for writes, variable read latency) or B+ trees (predictable reads, write amplification). We chose a radix tree because it naturally mirrors a filesystem hierarchy:

Prefix Sharing : Common path segments (e.g., /datasets/cifar10/ ) are stored once, saving memory and speeding up traversal.

Efficient Directory Operations : Listing a directory becomes a subtree scan. Atomic rename is essentially updating a pointer at the branch point, not copying data.

Crash Consistency : We use physiological logging to ensure metadata integrity and fast recovery.

Unlike most systems that use inode-based (or inode-like) structures to support directory features, we use a full-path approach for better scalability and performance.
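
To make the prefix-sharing and subtree ideas above concrete, here is a deliberately simplified in-memory sketch (in TypeScript) of a path-segment trie. It only illustrates the general technique; FractalBits’s actual engine is an on-disk radix tree written in Zig, and none of the names below come from its code.

// Illustrative only: an in-memory path-segment trie, not FractalBits's on-disk radix tree.
type TrieNode = {
	children: Map<string, TrieNode>; // one entry per path segment; shared prefixes are stored once
	value?: string;                  // object metadata if an object ends at this node
};

const root: TrieNode = { children: new Map() };

function segments(path: string): string[] {
	return path.split("/").filter((s) => s.length > 0);
}

function put(path: string, value: string): void {
	let node = root;
	for (const seg of segments(path)) {
		let child = node.children.get(seg);
		if (!child) {
			child = { children: new Map() };
			node.children.set(seg, child);
		}
		node = child;
	}
	node.value = value;
}

// Listing a "directory" is a scan of a single subtree.
function list(prefix: string): string[] {
	let node = root;
	for (const seg of segments(prefix)) {
		const child = node.children.get(seg);
		if (!child) return [];
		node = child;
	}
	const results: string[] = [];
	const walk = (n: TrieNode, path: string) => {
		if (n.value !== undefined) results.push(path);
		for (const [seg, child] of n.children) walk(child, path + "/" + seg);
	};
	walk(node, "/" + segments(prefix).join("/"));
	return results;
}

// Renaming a directory moves one branch pointer; nothing underneath is copied.
// Assumes both paths are non-empty and that `from` exists.
function renameDir(from: string, to: string): void {
	const fromSegs = segments(from);
	const last = fromSegs.pop() as string;
	let parent = root;
	for (const seg of fromSegs) parent = parent.children.get(seg) as TrieNode;
	const subtree = parent.children.get(last) as TrieNode;
	parent.children.delete(last);

	const toSegs = segments(to);
	const newLast = toSegs.pop() as string;
	let target = root;
	for (const seg of toSegs) {
		let child = target.children.get(seg);
		if (!child) {
			child = { children: new Map() };
			target.children.set(seg, child);
		}
		target = child;
	}
	target.children.set(newLast, subtree);
}

With this shape, objects under /datasets/cifar10/ share those path segments once, listing the directory walks a single subtree, and renaming the directory re-points one branch without touching the objects beneath it.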

By the way, we implemented the core engine in Zig for control and predictable performance.
Why Zig?

  • comptime metaprogramming generates optimized code paths for different node types at compile time
  • Manual memory management means no GC pauses and predictable latency
  • Direct SIMD access for parallel key comparisons within tree nodes
  • io_uring in the std library, so we can easily try more recent io_uring kernel features (registered buffers, NVMe IOPOLL, etc.).

The Gateway: Rust-Based S3-Compatible API server

Our S3-compatible API server, built in Rust, manages the data path:

Safety & Concurrency : Rust’s ownership model gives us thread safety without a garbage collector—important for high-concurrency request handling.

Async I/O : Built on Tokio for handling thousands of concurrent connections.

Production-Ready Frameworks : We support both axum and actix-web, defaulting to actix-web. Its thread-per-core architecture aligns with our design for maximum performance.

The Model: Bring Your Own Cloud (BYOC)

FractalBits deploys as a managed software layer within your own cloud account (currently AWS only).

For you:

  • Cost transparency—you pay the cloud provider’s raw costs for VMs and disks, no egress fees to us
  • Data sovereignty—your data never leaves your cloud tenant
  • Low latency—deploy in the same region/VPC as your compute

For us: We leverage the cloud’s proven infrastructure instead of building it from scratch, letting us focus on the storage engine itself.

Looking Ahead

The object storage market has high-performance options, but the economics often make that performance unusable at scale. And systems that do offer directory semantics often struggle with performance or scalability. Getting both at a reasonable cost is still rare. We think there’s room for a different approach.

FractalBits is our answer. We’re early in this journey and learning from users who are pushing these limits.


Hitting the performance or cost wall with your current object storage? We’d be interested to hear about your use case.

GitHub


References:

[1]. S3 Express One Zone, Not Quite What I Hoped For, https://jack-vanlightly.com/blog/2023/11/29/s3-express-one-zone-not-quite-what-i-hoped-for

[2]. Mantle: Efficient Hierarchical Metadata Management for Cloud Object Storage Services. SOSP 2025.

[3]. Amazon S3 Express One Zone now supports renaming objects within a directory bucket, https://aws.amazon.com/about-aws/whats-new/2025/06/amazon-s3-express-one-zone-renaming-objects-directory-bucket/

[4]. Google Cloud Storage hierarchical namespace, https://cloud.google.com/blog/products/storage-data-transfer/new-gcs-hierarchical-namespace-for-ai-and-data-lake-workloads

Concrete syntax matters, actually

Lobsters
www.youtube.com
2025-12-13 12:25:31
Slides: https://slim.computer/concrete-syntax/ Comments...

Germany's train service is one of Europe's worst. How did it get so bad?

Hacker News
www.npr.org
2025-12-13 12:08:29
Comments...
Original Article
Germany's new Intercity Express train is seen in Berlin prior to its official presentation by railway operator Deutsche Bahn, on Oct. 17. Tobias Schwarz/AFP via Getty Images

EN ROUTE TO BERLIN — As the 12:06 p.m. Intercity Express train to Berlin leaves the Swiss city of Bern and crosses the border into Germany, passengers reluctantly bid farewell to punctuality — a guarantee in the Alpine republic where trains run like clockwork.

Fifty-seven-year-old Elisabeth Eisel regularly takes this seven-hour train journey. "Trains in Switzerland are always on time, unless they're arriving from Germany," she says. "Harsh but true, sadly. It didn't used to be the case."

Chronic underinvestment in Germany has derailed yet another myth about Teutonic efficiency. The German railway Deutsche Bahn's long-distance "high-speed" trains are now among the least punctual in Europe. In October, the national rail operator broke its own poor record, with only about half of all long-distance trains arriving without delay.

Waning reliability is but one of many problems for state-owned Deutsche Bahn, which is operating at a loss and regularly subjects its passengers to poor or no Wi-Fi access, seat reservation mix-ups, missing train cars and "technical problems" — a catch-all reason commonly cited by conductors over the train intercom.

German Transport Minister Patrick Schnieder (second from left) and Evelyn Palla (third from left), CEO of Deutsche Bahn, get off the train at the premiere of the new Intercity Express train at Berlin Ostbahnhof, Oct. 17. Christoph Soeder/picture alliance via Getty Images

After decades of neglect, the government has announced a 100-billion-euro investment in rail infrastructure. But Lukas Iffländer, vice chair of the railway passenger lobby group Pro Bahn, says it will take more than money to get German trains back on track.

"We are now paying the price for years and years of neglect, basically since 1998," Iffländer says. It's not just crumbling tracks and sticky signals that need attention, he explains, but the network operator's overly bureaucratic infrastructure.

"Every process at Deutsche Bahn is really complicated," Iffländer says. "It takes forever and that frustrates the people that actually want to do something."

Iffländer says Deutsche Bahn is top heavy: While there are not enough train engineers and signal operators, there are too many managers sitting at desks.

German news weekly Der Spiegel recently reported that upper management has allegedly approved canceling long-distance trains to bump up punctuality ratings because canceled trains are not recorded in the statistics.

Deutsche Bahn declined NPR's requests for an interview, but in a written statement it denied embellishing its data. It said that the Spiegel report is "based on chat messages between dispatchers," not "actual data used for collecting statistics."

On a different train — the 11:18 a.m. from Munich to Berlin — passengers are packed like sardines at double capacity because another fully booked Intercity Express was canceled at the very last minute.

The mood is surprisingly jolly, despite the fact that half of the passengers have been standing for more than four hours now — with no hope of getting through the crowded carriages to use the restroom.

Catherine Launay, 51, is lucky enough to have a seat. She's from France and says she's surprised passengers are not kicking up more of a fuss.

"If this had been a French train, there'd have been more of an uproar!" Launay quips. "In fact, French passengers would have revolted by now."

In an effort to prevent aggressive passenger behavior toward train staff, Deutsche Bahn has launched a mockumentary series for TikTok, Instagram and YouTube about a train crew struggling to cope under increasingly preposterous conditions.


The fictional train staff's dance routine to a techno beat, while singing " zenk yoo for träveling wiz Deutsche Bahn ," has gone down surprisingly well with passengers, even if they can't actually watch it on board because the Wi-Fi can't cope with streaming.

And as our train rattles along the track, it's difficult to differentiate between Deutsche Bahn parody and reality. The train conductor wishes passengers a pleasant journey "as far as it's possible," adding "we should just about make it to Berlin." The train car chortles.

But Deutsche Bahn is no laughing matter for Federal Transport Minister Patrick Schnieder, who recently warned that "many equate the malfunctioning of railways with the malfunctioning of our state."

Many are putting their hopes in the railway company's new CEO, Evelyn Palla, based on her track record at Austrian Federal Railways.

Palla announced plans this week to make Deutsche Bahn more trim and efficient by eliminating executive positions, but she warned that there's so much to fix, it will take time.

As we finally pull into Berlin's main train station, passengers are resigned to the fact that — whether it's signal failure, humor failure or state failure — Germany's trains appear to have gone off the rails.

YouTube's CEO limits his kids' social media use – other tech bosses do the same

Hacker News
www.cnbc.com
2025-12-13 12:03:51
Comments...
Original Article

Neal Mohan, the CEO of YouTube, speaks during a panel for the Summit for Democracy on March 30, 2023 in Washington, DC.

Anna Moneymaker | Getty Images

YouTube's CEO Neal Mohan is the latest in a line of tech bosses who have admitted to limiting their children's social media use, as the harms of being online for young people have become more evident.

Mohan, who took the helm of YouTube's leadership in 2023, was just named Time's 2025 CEO of the Year. He said in an interview with the magazine that his children's use of media platforms is controlled and restricted.

"We do limit their time on YouTube and other platforms and other forms of media. On weekdays we tend to be more strict, on weekends we tend to be less so. We're not perfect by any stretch," Mohan said in one TikTok video posted by Time Magazine on Thursday.

He stressed "everything in moderation" is what works best for him and his wife, and that extends to other online services and platforms. Mohan has three children: two sons and one daughter.

Experts have continued to sound the alarm on how excessive smartphones and social media use has harmed children and teenagers. Jonathan Haidt, NYU professor and author of "The Anxious Generation," has advocated for children to not have smartphones before the age of 14 and no access to social media before the age of 16.

"Let them have a flip phone, but remember, a smartphone isn't really a phone. They could make phone calls on it, but it's a multi-purpose device by which the world can get to your children," Haidt said in an interview with CNBC's Tania Bryer earlier this year.

This week, Australia became the first country to formally bar users under the age of 16 from accessing major social media platforms. Ahead of the legislation's passage last year, a YouGov survey found that 77% of Australians backed the under-16 social media ban. Still, the rollout has faced some resistance since becoming law.

Mohan said in a more extensive interview with Time on Wednesday that he feels a "paramount responsibility" to young people and to giving parents greater control over how their kids use the platform. YouTube Kids was launched in 2015 as a child-friendly version of the Google-owned platform.

He said his goal is "to make it easy for all parents" to manage their children's YouTube use "in a way that is suitable to their household," especially as every parent has a different approach.

Bill Gates, Mark Cuban

Several tech bosses have taken a similar approach. YouTube's former CEO, Susan Wojcicki, also barred her children from browsing videos on the app unless they were using YouTube Kids. She also limited the amount of time they spent on the platform.

"I allow my younger kids to use YouTube Kids, but I limit the amount of time that they're on it," Wojcicki told CNBC in 2019. "I think too much of anything is not a good thing."

Bill Gates, Microsoft's co-founder, is amongst the tech titans who are against allowing young people too much screen time. With three children, now adults, Gates openly talked about not giving them cell phones until they were in their teens.

"We don't have cell phones at the table when we are having a meal, we didn't give our kids cell phones until they were 14 and they complained other kids got them earlier," Gates said years ago.

Meanwhile, billionaire Mark Cuban would even resort to installing Cisco routers and using management software to monitor which apps his children were on and shut off their phone activity.

Bookmark for CAD/2d/3D Useful links

Hacker News
github.com
2025-12-13 11:16:45
Comments...
Original Article


The Typeframe PX-88 Portable Computing System

Lobsters
www.typeframe.net
2025-12-13 10:23:56
Comments...
Original Article

The Typeframe PX-88 Portable Computing System

A stacked deck.

It's true. The odds are finally in your favor.
The Typeframe PX-88 is an integrated system that has been perfectly arranged to guarantee a superior outcome for the operator. Leave it to Typeframe to integrate these critical elements into one commanding machine.

The PX-88 delivers all the power and specialized features expected from a professional system - but built around a dedicated, uncompromising user experience. Is it a cyberdeck or a writerdeck? It's whatever you need it to be. The reliable Raspberry Pi 4 B core handles demanding web-based editors and complex tasks with robust performance. The compact size belies the strength within.

A mechanical keyboard provides a superior, tactile input experience - a professional tool unmatched by common consumer electronics. Furthermore, the system is designed for simple construction with minimal required soldering, and maintenance is streamlined - all internal components are easily reached via sliding access panels.

If you have been looking for a portable, professional computer where input quality meets core performance, look at the PX-88.

Typeframe. Built for your best work, built by you.


Rich Headers: leveraging this mysterious artifact of the PE format

Lobsters
www.virusbulletin.com
2025-12-13 10:12:03
Comments...

Computer Animator and Amiga fanatic Dick Van Dyke turns 100

Hacker News
news.ycombinator.com
2025-12-13 08:15:48
Comments...
Original Article

For Mary Poppins, Disney used the sodium vapour process: a light-splitting prism diverted the monochromatic sodium light into a separate channel to produce the matte.

https://en.wikipedia.org/wiki/Sodium_vapor_process

It's charming. I'm sure digital post offers many advantages. Van Dyke might be one of the few who have done both.

What's the point of lightweight code with modern computers?

Lobsters
liam-on-linux.dreamwidth.org
2025-12-13 08:06:41
Comments...
Original Article

I think there are many.

Some examples:

* The fastest code is the code you don't run.

Smaller = faster, and we all want faster. Moore's law is over, Dennard scaling isn't affordable any more, smaller feature sizes are getting absurdly difficult and therefore expensive to fab. So if we want our computers to keep getting faster as we've got used to over the last 40-50 years then the only way to keep delivering that will be to start ruthlessly optimising, shrinking, finding more efficient ways to implement what we've got used to.

Smaller systems are better for performance.

* The smaller the code, the less there is to go wrong.

Smaller doesn't just mean faster, it should mean simpler and cleaner too. Less to go wrong. Easier to debug. Wrappers and VMs and bytecodes and runtimes are bad: they make life easier but they are less efficient and make issues harder to troubleshoot. Part of the Unix philosophy is to embed the KISS principle.

So that's performance and troubleshooting. We aren't done.

* The less you run, the smaller the attack surface.

Smaller code and less code mean fewer APIs, fewer interfaces, fewer points of failure. Look at djb's decades-long policy of offering rewards to people who find holes in qmail or djbdns. Look at OpenBSD. We all need better, more secure code. Smaller, simpler systems built from fewer layers mean more security, less attack surface, less to audit.

Higher performance, and easier troubleshooting, and better security. There's 3 reasons.

Practical examples...

The Atom editor spawned an entire class of app: Electron apps, Javascript on Node, bundled with Chromium. Slack, Discord, VSCode: there are multiple apps used by tens to hundreds of millions of people now. Look at how vast they are. Balena Etcher is a, what, nearly 100 MB download to write an image to USB? Native apps like Rufus do it in a few megabytes. Smaller ones like USBimager do it in hundreds of kilobytes. A dd command in under 100 bytes.

Now some of the people behind Atom wrote Zed.

It's 10% of the size and 10x the speed, in part because it's a native Rust app.

The COSMIC desktop looks like GNOME, works like GNOME Shell, but it's smaller and faster and more customisable because it's native Rust code.

GNOME Shell is Javascript running on an embedded copy of Mozilla's Javascript runtime.

Just like dotcoms wanted to dis-intermediate business, remove middlemen and distributors for faster sales, we could use disintermediation in our software. Fewer runtimes, better smarter compiled languages so we can trap more errors and have faster and safer compiled native code.

Smaller, simpler, cleaner, fewer layers, fewer abstractions: these are all good things which are desirable.

Dennis Ritchie and Ken Thompson knew this. That's why Research Unix evolved into Plan 9, which puts way more stuff through the filesystem to remove whole types of API. Everything's in a container all the time, the filesystem abstracts the network and the GUI and more. Under 10% of the syscalls of Linux, the kernel is 5MB of source, and yet it has much of Kubernetes in there.

Then they went further, replaced C too, made a simpler safer language, embedded its runtime right into the kernel, and made binaries CPU-independent, and turned the entire network-aware OS into a runtime to compete with the JVM, so it could run as a browser plugin as well as a bare-metal OS. Now we have ubiquitous virtualisation so lean into it: separate domains. If your user-facing OS only runs in a VM then it doesn't need a filesystem or hardware drivers, because it won't see hardware, only virtualised facilities, so rip all that stuff out. Your container host doesn't need to have a console or manage disks.

This is what we should be doing. This is what we need to do. Hack away at the code complexity. Don't add functionality, remove it. Simplify it. Enforce standards by putting them in the kernel and removing dozens of overlapping implementations. Make codebases that are smaller and readable by humans.

Leave the vast bloated stuff to commercial companies and proprietary software where nobody gets to read it except LLM bots anyway.

How a US Citizen Was Scanned With ICE's Facial Recognition Tech

403 Media
www.404media.co
2025-12-13 08:01:06
Jesus Gutiérrez told immigration agents he was a U.S. citizen. Only after they scanned his face did the agents let him go....
Original Article

This article is a partnership between Reveal and 404 Media.

Jesus Gutiérrez, 23, was walking home one morning from a Chicago gym when he noticed a gray Cadillac SUV with no license plates. He kept walking, shrugging it off. Then the car pulled over and two men got out.

The federal immigration officials told him not to run. They then peppered Gutiérrez with questions: Where are you going? Where are you coming from? Do you have your ID on you?

Gutiérrez is a U.S. citizen. He told the officials this. He didn’t have any identification on him, but, panicking, he tried to find a copy on his phone. The agents put him into the car, where another two agents were waiting, and handcuffed him. Just sit there and be quiet, they said.

Without Gutiérrez’s ID, the agents resorted to another approach. They took a photo of his face. A short while later, the agents got their answer: “Oh yeah, he’s right. He’s saying the right thing. He does got papers,” Gutiérrez recalled the agents saying.

💡

Has this happened to you or someone you know? Do you have any videos of ICE or CBP scanning people's faces? Do you work for either agency? I would love to hear from you. Using a non-work device, you can message me securely on Signal at joseph.404 or send me an email at joseph@404media.co.

Gutiérrez’s experience, which he recounted to Reveal, is one snapshot of something that federal authorities have acknowledged to 404 Media that they are doing across the country: scanning people’s faces with a facial recognition app that brings up their name, date of birth, “alien number” if they’re an immigrant, and whether they have an order of deportation. 404 Media previously obtained internal Immigration and Customs Enforcement (ICE) emails revealing the agency’s facial recognition app, called Mobile Fortify, and catalogued social media videos showing agents scanning people’s faces to verify their citizenship.

Now, Reveal has spoken to a person who appears to have had that technology used against them. Gutiérrez sent Reveal a copy of his passport to verify his citizenship.

“You just grabbing, like, random people, dude,” Gutiérrez said he told the agents after they scanned his face. The officials eventually dropped off Gutiérrez after driving for around an hour. For several days, he didn’t go anywhere, not even to the gym. Gutiérrez told his father at the time that he “got kidnapped.”

“This is a flagrant violation of rights and incompatible with a free society,” said Nathan Freed Wessler, deputy project director for the American Civil Liberties Union’s (ACLU) Speech, Privacy, and Technology Project. “Immigration agents have no business scanning our faces with this glitchy, privacy-destroying technology—especially after often stopping people based on nothing more than the color of their skin or the neighborhood they live in.”

A screenshot of an internal DHS document obtained by 404 Media. Available here .

Mobile Fortify is available to ICE and Customs and Border Protection (CBP) officials on their work-issued phones. After an agent scans someone’s face, the app queries an unprecedented collection of U.S. government databases, including one run by the FBI and another that checks for outstanding state warrants, according to user manuals seen by 404 Media. The app runs the person’s face against a database of 200 million images, according to internal ICE material 404 Media viewed.

“The photograph shown [in the app’s results] is the photograph that was taken during the individual’s most recent encounter with CBP, however the matching will be against all pictures CBP may maintain on the individual,” said an internal Department of Homeland Security (DHS) document 404 Media obtained. The app turns the system usually used for verifying travelers at the border inward against people on U.S. streets.

The need for Mobile Fortify, according to that internal document, is for immigration authorities to identify people who can be removed from the country. But it acknowledges that it may be used against U.S. citizens, like in Gutiérrez’s case.

“It is conceivable that a photo taken by an agent using the Mobile Fortify mobile application could be that of someone other than an alien, including U.S. citizens or lawful permanent residents,” the document reads.


Rep. Bennie G. Thompson, ranking member of the House Homeland Security Committee, previously told 404 Media that ICE will prioritize the results of the app over birth certificates. “ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person’s status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien,” he said. “ICE using a mobile biometrics app in ways its developers at CBP never intended or tested is a frightening, repugnant, and unconstitutional attack on Americans’ rights and freedoms.”

404 Media has found other instances in which ICE and CBP agents have used a facial recognition app to verify someone’s identity and citizenship. In one that appeared to take place in Chicago, a Border Patrol officer stopped two young men on bicycles before asking his colleague, “Can you do facial?” The other official then scanned one of the boy’s faces, according to a video posted on social media. In another, a group of ICE officers surrounded a man driving a car. He said he was an American citizen. “Alright, we just got to verify that,” one of them said. A second then pointed their phone’s camera at the man and asked him to remove his hat. “If you could take your hat off, it would be a lot quicker,” the officer said. “I’m going to run your information.”

In Gutiérrez’s case, there is little indication that he was stopped for any reason beyond the color of his skin. He is of Mexican descent, he said. Stops of people based on their race, use of Spanish, or location (such as a car wash or bus stop) have become known among critics as “Kavanaugh stops,” after Supreme Court Justice Brett Kavanaugh justified the method in a September opinion.


“The Government sometimes makes brief investigative stops to check the immigration status of those who gather in locations where people are hired for day jobs; who work or appear to work in jobs such as construction, landscaping, agriculture, or car washes that often do not require paperwork and are therefore attractive to illegal immigrants; and who do not speak much if any English,” the opinion says. (Gutiérrez speaks Spanish but conducted his interview with Reveal in English.) “If the officers learn that the individual they stopped is a U.S. citizen or otherwise lawfully in the United States, they promptly let the individual go. If the individual is illegally in the United States, the officers may arrest the individual and initiate the process for removal.”

The ACLU’s Wessler added: “In the United States, we should be free to go about our business without government agents scanning our faces, accessing our personal information, saving our photos for years, and putting us at risk of misidentifications and wrongful detentions. ICE and CBP’s use of Mobile Fortify on the streets of America should end immediately.”

DHS Assistant Secretary Tricia McLaughlin said in a statement, “DHS is not going to confirm or deny law enforcement capabilities or methods.” CBP said that the agency built the app to support ICE operations and that it has been used by ICE around the country.

A CBP spokesperson added in a statement, “Mobile Fortify is a law enforcement app developed by U.S. Customs and Border Protection for ICE agents and officers. It helps field personnel gather information during immigration inspections, but agents must consider all circumstances before deciding on someone's immigration status. CBP personnel working with ICE teams can access the app after completing required training. Further details cannot be shared due to law enforcement sensitivities.”

Gutiérrez said that at the end of his encounter, while he was still in the car, the agents were laughing.


OSS Friday Update - Fibers are the Future of Ruby

Lobsters
noteflakes.com
2025-12-13 07:22:53
Comments...
Original Article

12·12·2025

In the last few days I’ve managed to finalize work on the UringMachine fiber scheduler. Beyond making sure the fiber scheduler is feature complete, that is, that it implements all the different Fiber Scheduler hooks and their expected behaviour, I also spent a couple of days writing test cases, not only for the fiber scheduler, but also for UM’s low-level API.

Beyond the tests, I wrote a series of benchmarks to get an idea of how UringMachine compares to other concurrency solutions.

You can consult the full results here. I’ll refrain from making overly generalized statements about what these benchmark results mean, but I think they demonstrate the promise of working with fibers to create concurrent Ruby apps.

So, as these benchmarks show, the Fiber Scheduler can bring significant benefits to concurrent Ruby apps, with minimal changes to the code (basically, instead of Thread.new you’ll use Fiber.schedule). The fact that the scheduler does the I/O transparently behind the scenes and integrates with the rest of the Ruby ecosystem feels almost like magic.

So I think this really validates the approach of Samuel Williams in designing how the fiber scheduler interfaces with the rest of the Ruby runtime. And the fact that the web server he authored, Falcon, is now used in production at Shopify, is an even stronger validation!

Here’s a detailed report of my work this last week:

  • Samuel has fixed the issue with the hanging #pwrite (it turns out the #io_pwrite hook was being invoked with the GVL released).

  • Added support for SQPOLL mode when setting up a UringMachine instance. It’s not clear to me what the performance implications of that are, but I’ll try to make some time to check this against TP2, a UringMachine-based web server I’m currently using in a bunch of projects.

  • Started looking at getting #io_close to work, and found out that Samuel had already done the work; that is, the code was already there, but commented out. Samuel explained that it was impossible to get it to work due to the complexity of the implementation of IO#close, and indeed when I tried it myself I saw that it was just not possible given the way the IO state is managed when an IO is closed. I then had the idea that maybe we could pass the underlying fd instead of the IO object itself to the #io_close hook. The only issue is that this breaks the convention where the different io_xxx hooks take an io as their first argument. Nevertheless, I suggested this idea to Samuel, and he gladly accepted when he saw this is the only way we can make this hook work. Samuel then proceeded to prepare a PR and merge it.

  • Added the #io_close hook to the UringMachine fiber scheduler, as well as a #yield hook for dealing with thread interrupts in response to another PR by Samuel. I also added missing docs for the different methods in the fiber scheduler.

  • Spent a lot of time writing lots of tests for the fiber scheduler. I tried to cover the entire IO API - both class- and instance methods. I also wrote some “integration” tests - different scenarios not unlike those in the benchmarks, which exercise the different hooks in the fiber scheduler.

  • Added some new APIs to help with testing: UM#await_fibers is a method for waiting for one or more fibers to terminate. Unlike UM#join, it doesn’t return the return values of the given fibers; it just waits for them to terminate. Another new API is UM.socketpair, which is like Socket.socketpair except it returns raw fd’s.

  • Fixed some small issues in the UM fiber scheduler and in the UM low-level API implementation.

  • Added and streamlined metrics that indicate the following:

    • The ring size
    • Total number of ops
    • Total number of fiber switches
    • Total number of waits for CQEs
    • Current number of pending ops
    • Current number of unsubmitted ops
    • Current size of runqueue
    • Current number of transient ops
    • Current number of free ops

    I also added some basic time measurements:

    • Total CPU time
    • Total time spent waiting for CQEs

    These are off by default, but can be enabled by calling UM#profile(true). I’d like to do a lot more with profiling, like measuring the CPU time spent on each fiber, but I’m a bit apprehensive about the performance costs involved, as getting the CLOCK_THREAD_CPUTIME_ID clock is relatively slow, and then managing this for each fiber means getting and setting a couple of instance variables, which can really slow things down. On top of that, I’m not that sure this is really needed.

What’s Next for UringMachine

  • One of the ideas I discussed with Samuel is to add support for registered buffers that integrates with the IO::Buffer class. While UringMachine already has support for buffer rings, it uses a custom implementation of buffers. So I might start by converting this to use IO::Buffer instead.

  • I’d also like to do a bit more work on performance tuning the UringMachine low-level API, specifically to be able to control the maximum number of fiber context switches before doing I/O work, i.e. submitting ops and checking for completions.

  • Beyond that, I also want to spend some time documenting the UringMachine API, as it is sorely lacking, and I’d like for other people to be able to play with it.

The (Mostly) Complete Unicode Spiral (2022)

Lobsters
shkspr.mobi
2025-12-13 07:19:31
Comments...
Original Article

I present to you, dear reader, a spiral containing every Unicode 14 character in the GNU Unifont. Starting at the centre with the control characters, spiralling clockwise through the remnants of ASCII, and out across the entirety of the Basic Multilingual Plane. Then beyond into the esoteric mysteries of the Higher Planes.

A spiral of tightly packed characters.

Zoom in for the massiveness. It's a 10,000x10,000px image. Because the Unifont displays individual characters in a 16x16px square, it is quite legible even when printed out on a domestic laser printer at 600dpi:

I also made it as a square spiral - which fits into a smaller space.

A giant square spiral.

Again, printed out at 600dpi it is readable. Just!

Printed onto A0 - 841mm square - it's a bit better. The ASCII set is readable: Black characters on white paper.

But characters in CJK weren't particularly legible:

Various CJK characters - some of them look like ink smudges.

If I wanted the 16px symbols to each be 5mm wide, I'd need to print this on paper over 3 metres wide!

WHY??!?

Because visualising one-dimensional data structures in two-dimensional space is fun! That's why 😃

I was inspired by seeing two lovely piece of artwork recently.

The first was 2015's Unicode in a spiral by Reddit user cormullion. A spiral of text. (Click to embiggen.)

It's gorgeous, but doesn't include all characters. Oh, and you also have to rotate your head to read each character.

There's a larger version which covers a lot more of the Basic Multilingual Plane: an incredibly dense spiral of information. It's an 18MB PDF. And, because of the resolution of the font, it needs to be printed out on a 1 metre square at a minimum.

The second interesting thing I found was a 2016 Hilbert Curve of Unicode:

The Hilbert Curve poster is beautiful. But it only goes up to Unicode 10 - and we're on Unicode 14 by now. Despite the æsthetically pleasing nature of fractal curves, I find them quite un-intuitive.

Neither shows off the gaps in Unicode. That is, where there is space to fit more symbols.

So I wanted to do something which satisfied these criteria:

  • Contained all of Unicode 14
  • Was legible at a small size
  • Showed spaces where there are empty sections
  • Readable without tilting one's head
  • Slightly more visually interesting than a grid

HOW?!?!

I've written before about the wonders of the Unifont . It contains all of the Unicode 14 glyphs - each squeezed down into a 16x16px box. Even emoji!

Lots of Emoji rendered in small, monochrome pixels.

Well. Mostly…

Limitations

Although I wanted every character, there are some practical problems. Firstly:

Unifont only stores one glyph per printable Unicode code point. This means that complex scripts with special forms for letter combinations including consonant combinations and floating vowel marks such as with Indic scripts (Devanagari, Bengali, Tamil, etc.) or letters that change shape depending upon their position in a word (Indic and Arabic scripts) will not render well in Unifont.

So there are some scripts which will look a bit ugly. And some characters which won't be well represented.

The second issue is one of size. Some of the newer characters are simply too big:

Scripts such as Cuneiform, Egyptian Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16 pixel grid. There are plans to draw these scripts on a 32-by-32 pixel grid in the future.

That means it misses out on characters like 𒀰 , 𒁏 and, of course, 𒀱 . Which, to be fair, would be hard to squeeze in!

The third problem is that Unicode is updating all the time. Although the Unifont is at Version 14 - Python's Unicode Database is stuck at V13. Luckily, there is a library called UnicodeData2 which includes V14.

But, given those limitations, I thought it was possible to craft something nice.

Python Code

I split the problem into several parts.

Plotting equidistant points along a spiral

As ever, I turned to StackOverflow and found a neat little solution:

import math

def spiral_points(arc=1, separation=1):
    #   Adapted from https://stackoverflow.com/a/27528612/1127699
    """generate points on an Archimedes' spiral with `arc` giving the length of arc between two points and `separation` giving the distance between consecutive turnings
    - approximate arc length with circle arc at given distance
    - use a spiral equation r = b * phi
    """
    def polar_to_cartesian(r, phi):
        return ( round( r * math.cos(phi) ),
                 round( r * math.sin(phi) )
               )

    # yield a point at origin
    yield (0, 0)

    # initialize the next point in the required distance
    r = arc
    b = separation / (2 * math.pi)
    # find the first phi to satisfy distance of `arc` to the second point
    phi = float(r) / b
    while True:
        yield polar_to_cartesian(r, phi)
        # advance the variables
        # calculate phi that will give desired arc length at current radius (approximating with circle)
        phi += float(arc) / r
        r = b * phi
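
The generator yields points forever, so you have to cut it off yourself. Something like this (my own illustrative snippet, with arc and separation picked to roughly match the 16px glyphs) grabs the first few positions:

import itertools

points = list(itertools.islice(spiral_points(arc=16, separation=16), 10))
print(points)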

Drawing a squaril

I wanted a grid which looked like this:

9 A B
8 1 2
7 0 3
6 5 4

I found a blog post and source code for a spiral array . It's pretty simple - although I'm sure there's lots of ways to do this:

n = 12
nested_list= [[0 for i in range(n)] for j in range(n)]
low=0
high=n-1
x=1
levels=int((n+1)/2)
for level in range(levels):
    for i in range(low,high+1):
        nested_list[level][i]= x
        x+=1

    for i in range(low+1,high+1):
        nested_list[i][high]= x
        x+=1

    for i in range(high-1,low-1,-1):
        nested_list[high][i]= x
        x+=1

    for i in range(high-1,low,-1):
        nested_list[i][low]= x
        x+=1

    low+=1
    high-=1

for i in range(n):
    for j in range(n):
        # print the row elements with a tab space after each element
        print(nested_list[i][j], end="\t")
    print()  # print a new line after each row

However, that printed the spiral backwards:

B A 9
2 1 8
3 0 7
4 5 6

Luckily, Python makes it easy to reverse lists:

for l in nested_list:
    l.reverse()

Drawing the characters

Turning a number into a Unicode character is as simple as:

unicode_character = chr(character_int)

But how do we know if the font contains that character? I stole some code from StackOverflow which uses the FontTools library:

from fontTools.ttLib import TTFont
font = TTFont(fontpath)   # specify the path to the font in question

def char_in_font(unicode_char, font):
    for cmap in font['cmap'].tables:
        if cmap.isUnicode():
            if ord(unicode_char) in cmap.cmap:
                return True
    return False
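
For example (an illustrative call, not from the original post; the result for any given character depends on which Unifont file you point fontpath at):

print(char_in_font("A", font))
print(char_in_font("𒀱", font))   # one of the cuneiform characters the post says is missing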

But, of course, it is a bit more complicated than that. The Unifont contains some placeholder glyphs - the little black squares with hex digits in them that you see here:

A grid of Unicode characters

I didn't want to draw them. But they exist in the font. So how do I skip them?

Using the Python Unicode Database it's possible to look up the name of a Unicode code-point. e.g. chr(65) is LATIN CAPITAL LETTER A. So if there is no name in the database, skip that character.
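
A minimal sketch of that check (my own illustration, using the standard unicodedata module rather than the post's exact code):

import unicodedata

def has_name(codepoint):
    # unicodedata.name() raises ValueError for unnamed code points unless a default is given
    return unicodedata.name(chr(codepoint), None) is not None

print(has_name(65))       # True: LATIN CAPITAL LETTER A
print(has_name(0x0378))   # False: U+0378 is unassigned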

But, of course, it is a bit more complicated than that! The Unicode database only goes up to Unicode 13. And, for some reason, the control characters don't have names. So the code becomes a tangled mess of if...else statements. Ah well!

Drawing the characters should have been easy. I was using Pillow to draw text. But, despite the pixely nature of the font itself, Pillow was performing anti-aliasing - creating unwanted grey subpixels.

I thought the fix was simple:
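
The usual Pillow trick, and my guess at what was tried here, is to switch the draw object into 1-bit font mode so text is rendered without anti-aliasing (a sketch, not the post's exact snippet):

from PIL import Image, ImageDraw

image = Image.new("L", (200, 40), 255)   # stand-in for the real 10,000px canvas
draw = ImageDraw.Draw(image)
draw.fontmode = "1"                      # draw.text() calls after this are not anti-aliased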

Sadly, that does introduce some other artefacts - so I've raised a bug with Pillow .

In the end, I kept the anti-aliasing, but then converted the grey pixels to black. And then converted the entire image to monochrome:

threshold = 191
image = image.point(lambda p: p > threshold and 255)
image = image.convert('1')

Putting It All Together

Once I'd got the co-ordinates for either the spiral or squaril, I drew the character on the canvas:

draw.text(
    (x, y),
    unicode_character,
    font=font,
    fill=font_colour)

Except it didn't work!

Sadly, Pillow can't draw non-printable glyphs - even when the font contains something drawable. This is because it can't pass the correct options to the harfbuzz library.

So, I went oldskool! I converted every glyph in the font to a PNG and saved them to disk.

from fontforge import *

font = open("unifont_upper-14.0.04.ttf")

for i in range( len(font) ) :
    try:
        font[i].export( "pngs/" + str(i) + ".png", pixelsize=16, bitdepth=1)
    except Exception as e:
        print ( str(i) )
        print ( e )

Look, if it's hacky but it works, it isn't hacky! Right?

From there, it's a case of opening the .png and pasting it onto the canvas:

from PIL import Image

character_png = Image.open('pngs/' + str(character_int) + ".png")
image.paste(character_png, (round(x), round(y)))

It was too big!

And now we hit the final problem. The image was over 20,000 pixels wide. Why? The Variation Selectors! The last of which is at position U+E01EF. Which means the spiral looks like this:

A wide image showing the huge gap between the selectors and the rest of the spiral.

Here they are in close up: A series of boxes in an arc.

So I decided to remove that block!
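
An illustrative filter for that block (my own sketch; the Variation Selectors Supplement runs from U+E0100 to U+E01EF):

def in_variation_selectors_supplement(codepoint):
    # skip the block that stretches the spiral out to over 20,000 pixels wide
    return 0xE0100 <= codepoint <= 0xE01EF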

Source Code

All the code is on GitLab. Because GitHub is so 2019…

Licensing?

The GNU Unifont has a dual licence: GPL2 and OFL. The image is a "document" for the purposes of the OFL and the GPL font exemption. But, I guess you could reverse engineer a font-file from it. So, if you use the image to generate a font, please consider that it inherits the original licence. If you just want to print it out, or use it as art, then the image itself is CC BY-SA.

This is based on my lay-person's understanding of the various copyleft licence compatibility issues . Corrections and clarifications welcome!

What's next?

I would like to print this out on paper. At 200dpi, it would be about 1.5m squared. Which I guess is possible, but might be expensive.

At 600dpi, the square will just about fit on A3 paper. But the quality is atrocious. Even at A0 it wasn't great. Realistically, it needs to be at least 3.3 metres along each side! No idea where I can find a printer which will do that. Or where in my house I'd have space for it!

Of course, it will need updating whenever there is a new release of either Unicode or Unifont.

If you have any suggestions or feedback - please drop them in the comment box!

Crypto, FIDO and Security Tokens

Lobsters
docs.google.com
2025-12-13 06:55:58
Comments...
Original Article

This page is a community-maintained Google Sheets comparison of security tokens; the table itself does not survive a plain-text export, so only the sheet's metadata and coverage are summarised here.

Scope: USB-connected, TFA, PKI, OTP, PIV tokens.

How to contribute: use the Google Sheets comments to discuss or propose changes, or send updates, access requests or inquiries to stv0g@0l.de (contact the maintainer for edit permissions). Last update: 2024-05-20.

Web links: https://l.0l.de/tokens (edit) and https://l.0l.de/tokens-view (read-only view).

Legend: y = supported, n = not supported, ? = maybe supported, (y) = supported with quirks.

Columns tracked per token: vendor and model, release and end-of-life years, form factor (USB-A/USB-C, Lightning, nano), MSRP and currency, open-source status, touch and fingerprint sensors, protocols (FIDO2, U2F, OATH-HOTP, OATH-TOTP, OCRA, PIV, OpenPGP, OTP-HID, TPM 2.0, PKCS #15), interfaces (USB, NFC, BLE, WiFi, HID), CTAP version, USB VID/PID and FCC ID, certifications (FIPS, CC, EMVCo, U2F, FIDO), algorithms (RSA 1024/2048/3072/4096, secp256r1, secp256k1, secp384r1, secp521r1, brainpoolP256r1, brainpoolP384r1, brainpoolP512r1, X25519, Ed25519), digests (SHA1, SHA256, SHA512), capacity (OTP credentials, FIDO resident keys), internal technical details (operating system, secure element, MCU), extensions (OATH No Truncate, FIDO HMAC-Secret), and links to product pages and teardowns.

Tokens covered include:

  • Yubico: the YubiKey 5 series (5, 5 Nano, 5C, 5C Nano, 5 NFC, 5C NFC, 5Ci) and their FIPS variants, the YubiKey Bio and C Bio FIDO Editions, the Security Key and Security Key NFC lines (including the Enterprise Editions), YubiHSM 1 and 2 (including FIPS), the discontinued YubiKey 4 and NEO families, legacy Standard, Nano, Plus, VIP, Edge-n and RFID models, the Mt. Gox-specific YubiKey II, and the Google-internal Gnubby U2F.
  • Nitrokey: Nitrokey 3A NFC, 3C NFC, 3A Mini, Storage 2, FIDO2, HSM 2, Start and Pro 2.
  • SoloKeys: Solo 2A+ NFC, Solo 2C+ NFC, Solo 1 Tap USB-A and Somu.
  • Token2: the T2F2 family (TypeC, NFC-Slim, NFC-Slim-TypeC, NFC-Dual, NFC-TypeC, Bio, Bio2, Bio2-TypeC, mini, ALU, PIN+ and PIN+/TypeC) and the T2G2-Bio2-Bronze.
  • Cryptoken: TOKEY JCOP3.
  • Thales: SafeNet eToken 5300-C (PIN-only), 5300 Micro and 5300 MINI.
  • FEITIAN: ePass2003 and StorePass2003 PKI tokens, the ePass FIDO and ePass FIDO NFC (Plus) keys, the BioPass FIDO2 Plus and Pro keys, and iePass FIDO.

The full per-model support matrix, pricing and vendor links are maintained in the spreadsheet itself at the links above.

Doxers Posing as Cops Are Tricking Big Tech Firms into Sharing People's Data

Hacker News
www.wired.com
2025-12-13 05:06:59
Comments...
Original Article

A spoofed email address and an easily faked document is all it takes for major tech companies to hand over your most personal information.

Folder eye and circuit board

Photograph: Maxkabakov/Getty Images

When a privacy specialist at the legal response operations center of Charter Communications received an emergency data request via email on September 4 from Officer Jason Corse of the Jacksonville Sheriff’s Office, it took her just minutes to respond, with the name, home address, phone numbers, and email address of the “target.”

But the email had not in fact come from Corse or anyone else at the Jacksonville Sheriff’s Office. It was sent by a member of a hacking group that provides doxing-as-a-service to customers willing to pay for highly sensitive personal data held by tech companies in the United States.

“This took all of 20 minutes,” Exempt, a member of the group that carried out the ploy, told WIRED. He claims that his group has been successful in extracting similar information from virtually every major US tech company, including Apple and Amazon, as well as more fringe platforms like video-sharing site Rumble, which is popular with far-right influencers.

Exempt shared the information Charter Communications sent to the group with WIRED, and explained that the victim was a “gamer” from New York. When asked if he worried about how the information he obtained was used against the target, Exempt said: “I usually do not care.”

The victim did not respond to WIRED’s requests for comment.

“It is definitely concerning to hear criminals impersonating officers in such a manner, more so when they are claiming to be one of our employees,” says Christian Hancock, the media relations manager at the Jacksonville Sheriff’s Office. Officer Corse declined to comment.

Charter Communications declined to comment.

This method of tricking companies into handing over information that can be used to harass, threaten, and intimidate victims has been known about for years . But WIRED has gained unprecedented insight into how one of these doxing groups operates, and why, despite years of warnings, it is still happening so often.

The Charter Communications incident was one of up to 500 successful requests Exempt claims to have made in recent years. To back up his claims, the hacker shared multiple documents and recordings with WIRED, including what he claimed were screenshots of email requests, fake subpoenas, responses from tech companies, and even a video recording of a phone call with one company’s law enforcement response team, which was seeking to verify a request. Exempt also shared evidence suggesting that a current law enforcement officer (Exempt refused to provide the officer’s location or name) was in contact with the group about allegedly working with them to submit requests from his own account in return for a cut of the profits.

“All I need is an IP address, which I can gain pretty easily, [and] next thing you know I have names, addresses, emails, and cell numbers,” says Exempt, adding that he can then use that information to make emergency data requests. “And with a subpoena and search warrant, I can access DMs, texts, call logs. That’s someone’s full life in my hands in the space of hours, depending on the response times of the company or provider.”

This type of doxing appears to be a lucrative business. Exempt claims his group brought in over $18,000 in the month of August alone. In one case, Exempt says he was paid $1,200 for a single dox of a person who was supposedly “grooming minors on an online gaming platform he owns. The individual was then allegedly promptly swatted.”

WIRED reviewed the information posted online about a 23-year-old from the southwestern US, which includes their home address, phone number, email addresses, and social media accounts. The person did not respond to WIRED’s request for comment. WIRED was unable to independently confirm if the person was swatted.

In the US, federal, state, and local law enforcement agencies who need to identify the owner of a social media account, or details about a specific phone, send the relevant company a subpoena or warrant requesting the information.

All major companies operating in the US have departments and specific staff assigned to dealing with these requests, which are typically sent via email. The companies, once they review the subpoena and see it has come from what looks like a law enforcement agency, typically comply with the requests, sometimes taking additional verification steps such as phoning the officer involved to confirm that they did indeed send the request.

But officers can also make emergency data requests, or EDRs, in cases involving a threat of imminent harm or death. These requests typically bypass any additional verification steps by the companies who are under pressure to fulfill the request as quickly as possible.

This is the loophole that hackers like Exempt, who says he is “a Gen Z male located within the Europe area,” can exploit.

The problem partly stems from the fact that there are around 18,000 individual law enforcement agencies in the US, all of which use their own email naming conventions and domain registrations, including .us, .net, .org, .gov, and .com.

The hackers typically use one of two ways to trick companies into making them believe the emails are coming from real law enforcement agencies. In some cases, they use authentic law enforcement email accounts that they have compromised via social engineering or using credentials stolen in previous hacks. Other times, they create convincing fake domains that closely mimic legitimate police departments.

“This was an email address that looked like the real thing,” says Exempt, explaining the mechanics of how he tricked Charter Communications. “The real domain of the Jacksonville Sheriff’s Office in Florida is jaxsheriff.org. We purchased jaxsheriff.us and then spoofed our number as the department’s, so that when we called them to verify receipt of the legal process, when they searched the number, it would come back to the sheriff’s office, giving them no reason to doubt it. We use real badge numbers and officer names as well.”

The hackers also craft highly convincing fake official documents by mimicking official records.

“We look at real subpoenas through public records where available and use the legally correct wording and sections of the law in the subpoena so that everything is legally correct and binding, so that we realistically have zero percent chance of them second-guessing it,” says Exempt. This has worked in multiple states and courts in the US, he claims.

“As an extra verification step, we sometimes check online to see if the named judge is actually in court that day, so that if a company was to phone up and verify, they would be in the building but most likely be too busy to be able to verify the singular document,” says Exempt.

In many cases, Exempt says, the email and attached subpoena is enough to extract the information. In one example shared with WIRED, Exempt claims that his group, which he says is made up of around nine people located across Europe and the US, was able to obtain the information used to register the official Rumble account belonging to British far-right activist Tommy Robinson.

Robinson and Rumble did not respond to requests for comment.

Even in cases where companies do take additional steps to verify the subpoenas are coming from real officers, the hackers are able to circumvent this.

In a recording of a phone call shared with WIRED, a representative from Amazon’s law enforcement response team called the number included in the faked email Exempt sent, and spoke with Exempt to verify that they had received the documents she had sent him via an online portal.

“Amazon identified and blocked someone that was requesting data from us while impersonating law enforcement,” says Adam Montgomery, an Amazon spokesperson. “The impersonator received basic account data for fewer than 10 customers. We quickly took steps to protect these customer accounts, and have put additional safeguards in place to prevent this from happening again.”

When asked for details of what those safeguards were, Amazon declined to comment.

While the hackers are clearly exploiting massive loopholes in the system, in some cases, the tech companies themselves have laid out step-by-step guides on how to craft these requests.

“In order to request that Apple voluntarily disclose information on an emergency basis, the requesting government or law enforcement officer should complete the Emergency Government & Law Enforcement Information Request form and transmit it directly from their official government or law enforcement email address to [a specific @apple.com email address] with the words “Emergency Request” in the subject line,” Apple writes .

Exempt shared with WIRED an example of a request he made to Apple using a fake subpoena as well as the information Apple sent back to him that included an iCloud account holder’s home address, cell phone number, and email addresses. Apple did not respond to a request for comment.

One online database maintained by SEARCH, a nonprofit criminal justice support organization, lists direct contact details for the law enforcement divisions of over 700 internet service providers and other online content providers.

“The core issue isn't companies being careless, it's that traditional communications channels, like email, weren't built for the level of identity verification, context evaluation, and real-time decisioning that modern investigations and legal compliance require,” says Matt Donahue, a former FBI agent who left the agency in 2020. Soon after, Donahue founded Kodex, a company that works with business clients to build secure online portals that law enforcement can use to make data requests.

While technologies like Kodex provide a much safer alternative to email, over 80 percent of the companies listed on the SEARCH database still accept emergency data requests via email, according to one review conducted by Kodex.

But even those who only use Kodex are not in the clear. Exempt claims that he was able to make requests through Kodex for a period of time, using compromised law enforcement email accounts. However, because of Kodex’s enhanced safety features, including whitelisting specific devices from which requests can be made, Exempt and his group have now lost access to the system.

The hacker claims, however, that they are now working to regain access via another avenue.

“We are in talks with a deputy from a large sheriff’s office … who we got paid to dox [and] who is now interested in either renting his Kodex account to us or he may submit the requests for us on his side,” says Exempt. “This is in [the] very early stages of talks. He would want a percentage of the money we make and his dox removed on a well-known doxing site.”

To back up his claim, Exempt shared a screenshot of an alleged text exchange with the officer, including a blurred image that he refers to as his ID card. “Y’all have the SSN and the rest of the info you need about me and my fam,” the alleged officer wrote in a message. “I’m on the fence about it right now, but we will all get what we want out of this if we do a d[eal].”

When asked if he thought it was possible the officer was trying to entrap them, Exempt said probably not, “just for the fact he has been doxed, and within that dox, some pretty damning stuff about said officer came out, which he clearly wants removed. So I’m pretty certain he is being honest about the fact he is considering it.”

Donahue says Kodex’s system could flag such behavior because it is able to “pattern-match” the behavior of law enforcement agents and how they interact with companies that use the Kodex platform. “We can and do detect behavioral changes that allow us to protect our customers on a continuous basis as opposed to a one-time verification,” says Donahue.

While the hackers are taking advantage of the weakness in email security, they are also taking advantage of companies’ desire to help law enforcement save lives.

“Public/private-sector coordination is an incredibly complex and nuanced space that could very well be the difference between a kid being found in a trunk, or not,” says Donahue. “Lawful government data requests sit at the very unique intersection of data privacy, public safety, security, legal compliance, and civil rights, so anyone suggesting these requests are carelessly responded to in minutes has little to no understanding of the subject matter.”

David Gilbert is a reporter at WIRED covering disinformation, online extremism, and how these two online trends impact people's lives across the globe, with a special focus on the 2024 US presidential election. Prior to joining WIRED, he worked at VICE News. He lives in Ireland.

Apple has locked my Apple ID, and I have no recourse. A plea for help

Hacker News
hey.paris
2025-12-13 04:55:59
Comments...
Original Article

Summary: A major brick-and-mortar store sold an Apple Gift Card that Apple seemingly took offence to, and locked out my entire Apple ID, effectively bricking my devices and my iCloud Account, Apple Developer ID, and everything associated with it, and I have no recourse. Can you help? Email paris AT paris.id.au (and read on for the details). ❤️

Here’s how Apple “Permanently” locked my Apple ID.

I am writing this as a desperate measure. After nearly 30 years as a loyal customer, authoring technical books on Apple’s own programming languages (Objective-C and Swift) , and spending tens upon tens upon tens of thousands of dollars on devices, apps, conferences, and services, I have been locked out of my personal and professional digital life with no explanation and no recourse.

The Situation

My Apple ID, which I have held for around 25 years (it was originally a username, before they had to be email addresses; it’s from the iTools era), has been permanently disabled. This isn’t just an email address; it is my core digital identity. It holds terabytes of family photos, my entire message history, and is the key to syncing my work across the ecosystem.

  • The Trigger: The only recent activity on my account was a recent attempt to redeem a $500 Apple Gift Card to pay for my 6TB iCloud+ storage plan. The code failed. The vendor suggested that the card number was likely compromised and agreed to reissue it. Shortly after, my account was locked.
    • An Apple Support representative suggested that this was the cause of the issue: indicating that something was likely untoward about this card.
    • The card was purchased from a major brick-and-mortar retailer (Australians, think Woolworths scale; Americans, think Walmart scale), so if I cannot rely on the provenance of that, and have no recourse, what am I meant to do? We have even sent the receipt, indicating the card’s serial number and purchase location to Apple.

  • The Consequence: My account is flagged as “closed in accordance with the Apple Media Services Terms and Conditions”.
  • The Damage: I effectively have over $30,000 worth of previously-active “bricked” hardware. My iPhone, iPad, Watch, and Macs cannot sync, update, or function properly. I have lost access to thousands of dollars in purchased software and media.
    • Apple representatives claim that only the “Media and Services” side of my account is blocked, but now my devices have signed me out of iMessage (and I can’t sign back in), and I can’t even sign out of the blocked iCloud account because… it’s barred from the sign-out API, as far as I can tell.
    • I can’t even login to the “Secure File Transfer” system Apple uses to exchange information, because it relies on an Apple ID. Most of the ways Apple has suggested seeking help from them involve signing in to an Apple service to upload something, or communicate with them. This doesn’t work as the account is locked.

  • I can’t even download my iCloud Photos, as:
    1. There are repeated auth-errors on my account, so I can’t make Photos work;
    2. I don’t have a 6TB device to sync them to, even if I could.

The Support Nightmare

I contacted Apple Support immediately (Case ID: 102774292094). The experience was terrifyingly dismissive:

  1. No Information: Support staff refused to tell me why the account was banned or provide specific details on the decision.
  2. No Escalation: When I begged for an escalation to Executive Customer Relations (ECR), noting that I would lose the ability to do my job and that my devices were useless, I was told that “an additional escalation won’t lead to a different outcome”.
    • Many of the reps I’ve spoken to have suggested strange things; one of the strangest was telling me that I could physically go to Apple’s Australian HQ at Level 3, 20 Martin Place, Sydney, and plead my case. They even put me on hold for 5 minutes while they looked up the address.

The “New Account” Trap

Most insultingly, the official advice from the Senior Advisor was to “create a new Apple account… and update the payment information”.

This advice is technically disastrous:

  • The Legal Catch: Apple’s Terms and Conditions rely on “Termination of Access.” By closing my account, they have revoked my license to use their services.
  • The Technical Trap: If I follow their advice and create a new account on my current devices (which are likely hardware-flagged due to the gift card error), the new account will likely be linked to the banned one and disabled for circumventing security measures.
  • The Developer Risk: As a professional Apple Developer, attempting to “dodge” a ban by creating a new ID could lead to my Developer Program membership being permanently blacklisted, amongst other things.

Who I Am

I am not a casual user. I have literally written the book on Apple development (taking over the Learning Cocoa with Objective-C series, which Apple themselves used to write, for O’Reilly Media, and then 20+ books following that). I help run the longest-running Apple developer event not run by Apple themselves, /dev/world. I have effectively been an evangelist for this company’s technology for my entire professional life. We had an app on the App Store on Day 1 in every sense of the word.

My Plea

I am asking for a human at Apple to review this case. I suspect an automated fraud flag regarding the bad gift card triggered a nuclear response that frontline support cannot override. I have escalated this through my many friends in WWDR and SRE at Apple, with no success.

I am desperate to resolve this and restore my digital life. If you can help, please email paris AT paris.id.au

Poor Johnny still won't encrypt

Hacker News
bfswa.substack.com
2025-12-13 04:21:24
Comments...
Original Article

This title is an obvious nod to “Why Johnny Can’t Encrypt”, the classic 1999 paper on the usability of PGP 5.0.

To encrypt email in 1998 you’d run GnuPG from a terminal, importing the recipient’s public key into your local keyring then copying your email text into a file then encrypting the file for that public key: gpg -e -r alice file . Finally you’d copy the encrypted message into your email client and send it out.
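
Scripted rather than typed by hand, that same workflow looks like this; a minimal sketch that just shells out to the gpg CLI with the flags above (the file name and the "alice" key ID are placeholders):

import subprocess
from pathlib import Path

# write the message, encrypt it for Alice's public key with the same flags as above,
# then read back the ASCII-armoured ciphertext to paste into a mail client
Path("mail.txt").write_text("Hello Alice, this is confidential.\n")
subprocess.run(["gpg", "--armor", "-e", "-r", "alice", "mail.txt"], check=True)
print(Path("mail.txt.asc").read_text())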

In 2025 , it’s pretty much the same. In some respects, it’s worse:

  • It feels like fewer people care about email encryption today than they did in 2010.

  • Web-based email has become dominant, and that shift works against PGP usage. Desktop clients at least offered some support (native in Thunderbird, third-party extensions for Outlook and Apple Mail.) Most webmail services, by contrast, offer no native PGP support at all. Proton is a notable exception.

“But there’s S/MIME!”

S/MIME ( RFC 2311 ) was standardized around the same time as OpenPGP ( RFC 2440 ), in 1998. PGP’s trust model is the “web of trust”, though often TOFU in practice, while S/MIME’s model is the more organization-friendly hierarchical PKI model.

As a result, S/MIME is more common than PGP in enterprise email. It’s also better supported by email clients. Even Gmail for organizations supports S/MIME. You need a basic PKI and to generate key pairs, and then to distribute them manually.

What about Microsoft/Azure , the dominant enterprise stack? You’d expect managed endpoints to support key generation and distribution across an organization—centrally administered, cross-platform. In practice, Microsoft makes this harder than it should be. The process remains largely manual, poorly documented, and needlessly tedious.

Why nobody seems to care?

Auditors obsess over encryption at rest—from laptop FDE to databases’ security theaterish at-rest encryption—and over encryption in transit, usually meaning TLS. But they seldom bring up email encryption and send confidential email text and attachments like there’s no tomorrow.

The reality is blunt: most email traffic doesn’t enforce encryption, as MTA-STS adoption remains very low. Opportunistic encryption ( STARTTLS ) is more common, but obviously vulnerable to downgrade attacks.

There are even fewer incentives to fix this today, now that we have session-based messaging systems—mostly Signal , but also Olvid, Threema, and WhatsApp. Their statefulness enables protocols that, unlike PGP or S/MIME, protect against replays and provide forward secrecy (and the less critical “post-compromise security”).

Another factor is simple displacement. We use email far less than we did in 2005. Most internal written communication now happens over Slack or Teams or similar platforms. These systems are not encrypted save for the client-to-server link, with the server often running in third-party infrastructure.

So expect less and less PGP and S/MIME and, if we’re lucky, a bit more MTA-STS.

Featured image: Johnny Depp.

Indices point between elements (2015)

Lobsters
blog.nelhage.com
2025-12-13 04:10:48
Comments...
Original Article

If you’re familiar with nearly any mainstream programming language, and I asked you to draw a diagram of an array, the array indices, and the array elements, odds are good you’d produce a diagram something like this:

In this post, I want to persuade you to replace that image, or, at least, to augment it with an alternate view on the world.

I want to argue that, rather than numbering elements of an array, it makes just as much sense, and in many cases more, to number the spaces between elements:

With this representation, we do have to relearn how to refer to indexing: We refer to A[i] no longer as “The element at index i ”, but rather “The element to the right of index i ”.

Representing Ranges 🔗︎

I’ll run through a few reasons why I prefer this representation, but most of them boil down to representing ranges of elements .

Suppose we have an array, and we want a way to refer to a certain subset of it, like so:

One obvious answer is with a start and a length: start=2 length=3 , but it’s often convenient to represent a range as a pair of (start, end) indices. The latter representation, for example, lets you check if an index falls into the range directly.

If we number elements, it’s not immediately apparent which index to use for end :

Both (1, 3) and (1, 4) seem initially defensible. But if we number between elements, there’s a clear, unique answer:

The indices we want are the ones that lie between the included and excluded elements: (1, 4) .

With this model, the rules of range manipulation and comparison become straightforward:

  • Two ranges are adjacent if left.end == right.start

  • One range is a subset of another if inner.start >= outer.start && inner.end <= outer.end

  • A range contains end - start elements.

In order to answer the question “if I dereference an index, is the result contained in the range?”, we need to remember that A[i] is now defined as the element after index i . With that in mind, it becomes easy to see that for a range (start, end) , elements indexed by start <= i < end are within the range.
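
Here is a small sketch of those rules in Python, treating a range as a (start, end) pair of between-element indices (my own illustration, not code from the original post):

A = ['a', 'b', 'c', 'd', 'e', 'f']

def adjacent(left, right):
    return left[1] == right[0]              # they meet at exactly one index

def is_subset(inner, outer):
    return inner[0] >= outer[0] and inner[1] <= outer[1]

def length(rng):
    return rng[1] - rng[0]                  # end - start elements

def contains(rng, i):
    return rng[0] <= i < rng[1]             # A[i] lies inside the range

r = (1, 4)
print(A[r[0]:r[1]])                         # ['b', 'c', 'd']
print(adjacent((0, 1), r), is_subset((2, 3), r), length(r), contains(r, 3))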

Off-by-one errors 🔗︎

Indexing between elements, instead of indexing elements, helps avoid a large class of off-by-one errors. I’ll run through a number of examples using Python, but the fundamental issues apply to array APIs in many more languages, or to any time you’re manipulating an array-like data structure via any of these operations.

Inserting elements 🔗︎

If you want to insert an element into an array, how do you specify the location? If you name an existing element, does the new element go before or after that element?

Python’s standard library documentation somewhat awkwardly specifies that “The first argument is the index of the element before which to insert,” clarifying that this means insert(0, X) inserts X at the start of the array.

But if we number gaps between elements, instead of numbering elements, the story is perfectly clear: 0 names the gap before the first element, and so of course inserting at 0 should prepend an element. Similarly, 1 names the gap between the first and second element, and all the way on.
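
For instance (an illustrative snippet in the spirit of the post's later examples):

>>> a = [1, 2, 3]
>>> a.insert(0, 'x')   # 0 names the gap before the first element
>>> a
['x', 1, 2, 3]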

Slicing arrays 🔗︎

How do we refer to a partial subset of an array that we want to extract? Python, like many other languages, lets you use a pair of indexes:

>>> [1,2,3,4][1:3]
[2, 3]

The documentation , however, has to resolve the same ambiguity noted above: Is the final index excluded or included? Ruby even helpfully offers you both choices:

irb(main)> [1,2,3,4][1...3]
=> [2, 3]
irb(main)> [1,2,3,4][1..3]
=> [2, 3, 4]

As discussed earlier, if we adjust our view of indexes, there is no ambiguity at all. Conveniently, this also gives us the same semantics as Python and most other languages: there are good reasons half-inclusive ranges are generally preferable, and most languages converge on this choice.

Removing elements 🔗︎

If we want to remove a single element from an array, it does seem simpler to index elements directly – we can just name directly the index which we want to eliminate.

However, if we want to adopt the more general primitive of removing slices (Python’s del array[x:y]), we run into the same problem as extracting slices did previously. Once again, shifting our thinking to index between elements removes all ambiguity.
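
For example (again an illustrative snippet, not from the original post):

>>> a = [1, 2, 3, 4]
>>> del a[1:3]   # remove the elements lying between indices 1 and 3
>>> a
[1, 4]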

Incrementally consuming an array 🔗︎

Suppose we’re walking through an array, consuming it by elements or groups of elements at a time. Perhaps we’re parsing a string, consuming tokens as we go.

How do we keep track of our current position? Should we keep the index of the last element we’ve processed, or of the first element we have yet to process?

If we shift our perspective, this problem too vanishes: We can store the index between the last item consumed, and the next one to be consumed. Our index neatly partitions the buffer into “processed” and “to-be-processed”, with no ambiguity at all.
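
A toy sketch of that approach (my own illustration): the position is always a gap index, cleanly splitting the string into consumed and unconsumed halves.

def tokens(s):
    pos = 0                       # gap between consumed and unconsumed text
    while pos < len(s):
        end = s.find(' ', pos)
        if end == -1:
            end = len(s)
        yield s[pos:end]          # the slice (pos, end) is exactly one token
        pos = end + 1             # step past the separator to the next gap

print(list(tokens("parse these tokens")))   # ['parse', 'these', 'tokens']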

C/C++ Pointers and Iterators 🔗︎

With pointers in C, or with iterators in C++ (which were essentially designed to mimic C’s pointer semantics), we speak of pointers into an array or of iterators as referring to a specific element in memory.

However, both systems allow for this additional “valid” iterator or pointer, which points “just past the end” of a container. This pointer/iterator does not name a valid element, but is a valid pointer or iterator. The C specification is full of awkward verbiage to address this special-case:

both [pointers] shall point to elements of the same array object, or one past the last element of the array object;

(N1256 §6.5.6p9). And with a C++ std::vector , v.begin() and v.end() are both valid iterators, but v.end() points “one past the end” and cannot be dereferenced.

These apparent odd inconsistencies and special cases vanish if you shift your thinking slightly in just the way I’ve been arguing: Instead of thinking of iterators as referring to individual elements, we hold that they name the interstitial points between elements.

If we do so, the “one-past-the-end” iterator is no longer “past” the end – it points directly at the end, which is no more fundamentally special than the “start” iterator which points directly at the beginning.

It’s still the case that we cannot dereference v.end() , but that behavior is a function of the “dereference” operation, which selects the element after an iterator. The iterators themselves are no longer special cases.

Postscript: 0 or 1? 🔗︎

It used to be popular, and still is in some circles, to debate whether programming languages ought to start array indexing at 0 or 1. And there are still a few holdouts, like Matlab, which number their arrays starting from 1, causing no end of pain and confusion to those poor souls who have to switch between them and more mainstream languages.

Once I started thinking of pointers or iterators or indexes as indexing between elements, one of the more mind-bending realizations that followed was that this model can harmonize the “index from 0” and “index from 1” camps!

Let’s consider an array with interstices labeled again:

The first element, A[0] in C or Python, is bracketed by indexes 0 and 1. The decision, then, to name it as 1 does not involve changing the picture at all; if you draw indices between elements, the statement “Arrays start at 1” is simply a decision that “The dereference operator refers to the element to the left of an index,” in exactly the same way that I described dereference in a 0-indexed language as taking the element to the right. And presented that way – “should dereference take the left or the right element?” – it becomes clear that 0 or 1 really is an arbitrary choice.

Acknowledgements 🔗︎

I credit my personal shift in thinking – from labeling elements to labeling interstices – to my reading of the excellent The Craft Of Text Editing , which introduces the concept both for its notion of a mark , and specifically as an implementation idea while talking about buffer representations .

I recommend giving the book a read even if you never aspire personally to implement an Emacs; It’s filled with a great number of interesting ideas, and possessed of a profound clarity of thought throughout.

Show HN: Claude Code recipes for knowledge workers

Hacker News
github.com
2025-12-13 03:57:58
Comments...
Original Article

Top 100 Claude Code Recipes for Knowledge Workers

Your Complete Guide to AI-Powered Productivity

Version 1.0 — December 2025


Welcome

This collection contains 100 practical recipes for using Claude Code to automate, accelerate, and enhance your professional work. Each recipe provides step-by-step instructions, ready-to-use prompts, and real-world examples that you can apply immediately.

Whether you're drafting emails, analyzing data, preparing presentations, or managing complex projects, there's a recipe here that will save you hours of work.


Want all 200 recipes as ready-to-use slash commands?

The Premium Collection includes 200 slash commands you can install in seconds. Just type /recipe-001 and Claude does the rest. No copying prompts. No setup. $79.99 one-time purchase.

Get the Premium Collection →

Getting Started

New to Claude Code? Start here:

  1. GETTING-STARTED.md - Installation guide for Windows, macOS, and Linux
  2. FAQ.md - Common questions answered
  3. GLOSSARY.md - Terms and definitions
  4. Pick a recipe from Tier 1 below (the universal high-frequency wins)

How to Use This Book

Reading Strategies

Don't read sequentially. This is a reference book, not a novel. Instead:

  1. Start with a task you're facing today — Scan the table of contents for something relevant
  2. Try one recipe completely — Follow all the steps, including the refinement prompts
  3. Customize as you go — The prompts are templates; adapt them to your situation
  4. Build from there — Once you've mastered one recipe, explore related ones

Understanding the Difficulty Levels

| Level | What It Means | Typical User |
| --- | --- | --- |
| Beginner | Simple inputs, straightforward prompts, quick results | Anyone, even first-time Claude Code users |
| Intermediate | Multiple inputs, some iteration needed, judgment required | Comfortable with Claude Code basics |
| Advanced | Complex inputs, multiple steps, significant customization | Regular Claude Code users, domain expertise helpful |

Understanding Time Estimates

Each recipe shows:

  • Setup Time — How long the first use takes (includes learning the workflow)
  • Time Saved — Hours saved compared to doing the task manually

These are realistic estimates for typical professional work. Your actual times may vary based on the complexity of your specific task.

Getting the Best Results

  1. Be specific in your prompts — "Analyze Q3 sales focusing on regional trends" beats "Analyze sales"
  2. Provide context — Tell Claude your role, your audience, and what "good" looks like
  3. Iterate — Treat first outputs as drafts; ask for revisions and refinements
  4. Review everything — AI output needs human judgment before use in professional contexts

What Each Recipe Contains

Every recipe follows a consistent structure:

| Section | What It Tells You |
| --- | --- |
| The Problem | What pain point this recipe solves |
| The Outcome | What you'll have when done |
| When to Use | Good fit vs. not a good fit scenarios |
| Prerequisites | What you need before starting |
| How Claude Helps | What Claude does well vs. where you add judgment |
| Step-by-Step | Detailed walkthrough with prompts |
| Example Output | What good results look like |
| Troubleshooting | Common issues and solutions |
| Variations | Alternative approaches for different needs |

How the Recipes Are Organized

The 100 recipes are organized into 10 tiers, progressing from universal daily tasks to specialized professional functions:

Tier Category Recipes Best For
1 Universal High-Frequency Wins 001-010 Everyone - start here
2 Leadership & Management 011-020 Managers, Directors, Executives
3 Strategy & Analysis 021-030 Analysts, Strategists, Finance
4 Professional Communication 031-040 Marketing, Communications, Sales
5 Operations & Compliance 041-050 Operations, Legal, Quality
6 Data & Reporting 051-060 Analysts, Business Intelligence
7 HR & People Operations 061-070 HR, L&D, People Managers
8 Sales & Customer Operations 071-080 Sales, Customer Success, Marketing
9 Project & Product Management 081-090 PMs, Product Managers, Scrum Masters
10 Technical & Specialized 091-100 Technical Writers, IT, Legal

Complete Recipe Index

Tier 1: Universal High-Frequency Wins (001-010)

Tasks everyone does daily or weekly. Start here for immediate productivity gains.

# Recipe Time Saved
001 Meeting Notes to Action Items 30-60 min/meeting
002 Weekly Status Report Generation 2-3 hours/week
003 Email Drafting and Response 1-2 hours/day
004 Document Summarization 1-2 hours/document
005 Rapid Presentation Development 3-6 hours/deck
006 Calendar and Schedule Optimization 1-2 hours/week
007 Research Synthesis 2-4 hours/project
008 Task and Project List Organization 1-2 hours/week
009 Data Cleanup and Formatting 2-4 hours/dataset
010 Quick Reference Guide Creation 2-3 hours/guide

Tier 2: Leadership & Management (011-020)

For managers, directors, and executives managing teams and making decisions.

# Recipe Time Saved
011 Board and Leadership Meeting Prep 4-8 hours/meeting
012 High-Stakes Communication Drafting 1-2 hours/message
013 Performance Review Writing 2-3 hours/review
014 One-on-One Meeting Prep 30-60 min/meeting
015 Team Capacity and Resource Planning 3-5 hours/cycle
016 Budget Scenario Modeling 4-8 hours/cycle
017 OKR and Goal Setting 3-5 hours/cycle
018 Delegation Brief Creation 1-2 hours/task
019 Meeting Facilitation Planning 1-2 hours/meeting
020 Org Design and Structure Analysis 5-10 hours/analysis

Tier 3: Strategy & Analysis (021-030)

For strategic planning, financial analysis, and business intelligence.

# Recipe Time Saved
021 Competitive Intelligence Synthesis 6-10 hours/report
022 Strategic Planning Document Synthesis 15-25 hours/cycle
023 Market Research Analysis Summary 4-8 hours/report
024 Customer Account Analysis Deep Dives 3-5 hours/analysis
025 Financial Analysis and Trend Interpretation 3-6 hours/analysis
026 Risk Assessment and Mitigation Planning 4-8 hours/assessment
027 Business Case Development 6-12 hours/case
028 Benchmarking Analysis 4-8 hours/analysis
029 SWOT and Strategic Framework Analysis 3-6 hours/analysis
030 Investor and Board Materials Preparation 8-15 hours/package

Tier 4: Professional Communication (031-040)

For creating compelling content and professional communications.

# Recipe Time Saved
031 Proposal and Pitch Document Creation 4-8 hours/proposal
032 RFP and Government Contract Response 10-20 hours/response
033 Newsletter and Update Content Creation 2-4 hours/issue
034 Blog Post and Thought Leadership 3-5 hours/post
035 Case Study Development 4-8 hours/study
036 Executive Communication Ghostwriting 2-4 hours/piece
037 Press Release and Media Statement Drafting 2-4 hours/release
038 Customer Communication Templates 3-6 hours/template set
039 Technical Writing for Non-Technical Audiences 3-5 hours/document
040 Speech and Presentation Script Writing 4-8 hours/speech

Tier 5: Operations & Compliance (041-050)

For operational excellence, compliance, and process documentation.

# Recipe Time Saved
041 Policy and Procedure Documentation 8-15 hours/policy
042 Process Documentation and SOPs 4-8 hours/process
043 Contract Review and Risk Summarization 2-4 hours/contract
044 Vendor and Technology Evaluation Matrices 4-6 hours/evaluation
045 Incident Analysis and Root Cause Reports 3-5 hours/incident
046 Regulatory Change Impact Assessment 6-12 hours/assessment
047 Audit Preparation Documentation 8-15 hours/audit
048 Quality Control Checklists and Standards 4-8 hours/checklist
049 Change Management Documentation 5-10 hours/change
050 Service Level Agreement Development 6-12 hours/SLA

Tier 6: Data & Reporting (051-060)

For data analysis, visualization narratives, and business reporting.

# Recipe Time Saved
051 Dashboard Narrative Generation 1-2 hours/report
052 Survey Results Analysis and Reporting 3-6 hours/survey
053 KPI Variance Analysis and Commentary 2-4 hours/review
054 Customer Feedback Synthesis 3-5 hours/synthesis
055 Sales Pipeline Analysis and Commentary 2-3 hours/review
056 Cohort and Segment Analysis 3-5 hours/analysis
057 Trend Identification and Forecasting Narratives 3-6 hours/forecast
058 Report Automation Design 4-8 hours/setup
059 Data Quality Assessment 3-6 hours/assessment
060 Executive Summary Generation from Detailed Reports 30-60 min/summary

Tier 7: HR & People Operations (061-070)

For human resources, learning & development, and people management.

# Recipe Time Saved
061 Job Descriptions and Role Documentation 2-3 hours/role
062 Interview Question Banks and Scorecards 3-5 hours/role
063 Employee Handbook and Policy Updates 6-12 hours/update
064 Training Material Development 10-20 hours/course
065 Onboarding Documentation and Checklists 4-6 hours/program
066 Employee Communication Campaigns 3-6 hours/campaign
067 Exit Interview Analysis and Themes 3-5 hours/analysis
068 Compensation Analysis and Recommendations 4-8 hours/analysis
069 Diversity and Inclusion Reporting 4-8 hours/report
070 Skills Gap Analysis 4-8 hours/analysis

Tier 8: Sales & Customer Operations (071-080)

For sales teams, customer success, and revenue operations.

# Recipe Time Saved
071 Sales Call Preparation Briefs 30-60 min/call
072 Win/Loss Analysis and Themes 4-6 hours/analysis
073 Customer Success Playbook Development 8-15 hours/playbook
074 QBR Preparation 2-4 hours/QBR
075 Sales Enablement Content Creation 3-5 hours/asset
076 Lead Scoring and Prioritization Analysis 2-4 hours/analysis
077 Customer Journey Mapping 4-8 hours/map
078 Pricing Analysis and Recommendation 4-6 hours/analysis
079 Territory and Account Assignment Analysis 4-8 hours/analysis
080 Campaign Performance Analysis 2-4 hours/campaign

Tier 9: Project & Product Management (081-090)

For project managers, product managers, and agile teams.

# Recipe Time Saved
081 Project Charter and Kickoff Documentation 3-5 hours/project
082 Requirements Documentation 4-8 hours/document
083 Sprint Planning and Backlog Refinement 2-3 hours/sprint
084 Product Roadmap Communication 3-5 hours/update
085 Release Notes and Changelog Generation 1-2 hours/release
086 Project Status and Health Reporting 2-3 hours/report
087 Stakeholder Analysis and Communication Planning 3-5 hours/analysis
088 Lessons Learned Facilitation and Documentation 3-5 hours/session
089 Product Feature Prioritization Analysis 3-5 hours/cycle
090 User Research Synthesis 4-8 hours/study

Tier 10: Technical & Specialized Functions (091-100)

For technical documentation, IT operations, and specialized professional functions.

# Recipe Time Saved
091 Technical Documentation from Engineering Inputs 3-5 hours/document
092 API and Integration Documentation 4-6 hours/API
093 Security Assessment Documentation 4-8 hours/assessment
094 Knowledge Base and FAQ Generation 3-6 hours/section
095 IT Service Catalog Development 8-12 hours/section
096 Technical Specification Writing 4-8 hours/spec
097 Code Review Summary and Documentation 1-2 hours/review
098 M&A and Partnership Due Diligence 10-20 hours/deal
099 Legal Research Summarization 4-8 hours/research
100 Intellectual Property Documentation 6-12 hours/portfolio

Quick Start by Role

If you're a Manager or Director

Start with: 001 , 002 , 013 , 014

If you're an Executive

Start with: 011 , 012 , 022 , 030

If you're in Sales

Start with: 003 , 071 , 055 , 075

If you're in Marketing

Start with: 033 , 034 , 035 , 080

If you're in HR

Start with: 061 , 062 , 065 , 064

If you're a Product Manager

Start with: 082 , 084 , 089 , 090

If you're a Project Manager

Start with: 001 , 081 , 086 , 087

If you're an Analyst

Start with: 004 , 007 , 051 , 060


How to Use Each Recipe

Every recipe follows the same structure:

  1. The Problem - What pain point this recipe solves
  2. The Outcome - What you'll have when done
  3. When to Use - Good fit vs. not a good fit scenarios
  4. Prerequisites - What you need before starting
  5. How Claude Code Helps - What Claude does well and where you add judgment
  6. Input Examples - What to feed Claude
  7. Step-by-Step Implementation - Detailed walkthrough with prompts
  8. Example Output - What good results look like
  9. Troubleshooting - Common issues and solutions
  10. Tips from Experience - Practical wisdom from real use
  11. Variations - Alternative approaches for different situations
  12. Building Your System - How to make it repeatable

Premium Collection: 200 Recipes as Slash Commands

Want to skip copying prompts? The Premium Collection includes all 200 recipes as ready-to-use Claude Code slash commands.

/recipe-001 Here are my meeting notes from today...

One command. Instant results.

What's in the premium/ Folder

File Description
PREMIUM.md Full details on the Premium Collection
README.md Installation guide for sample commands
recipe-001.md to recipe-010.md 10 free sample slash commands to try

Try the Samples

Install the 10 free sample commands:

# Mac/Linux
cp premium/recipe-*.md ~/.claude/commands/

# Windows PowerShell
Copy-Item -Path "premium\recipe-*.md" -Destination "$env:USERPROFILE\.claude\commands\"

Then use them in Claude Code:

/recipe-001 Team standup: John finished API. Sarah blocked on design. Demo Friday.

Premium Collection Includes

  • 200 slash commands covering all 100 core recipes plus 100 advanced recipes
  • 20 categories from daily tasks to enterprise strategy
  • Role-based quick-start guides
  • Cheat sheets and ROI tracking templates

See premium/PREMIUM.md for full details and purchase options.


Contributing

Found a bug? Have an improvement? Want to add a recipe?

Open an issue or submit a pull request.


License

This recipe collection is provided for educational and professional use. See LICENSE.md for full terms.


Version History

See CHANGELOG.md for version history and updates.


100 Recipes. Unlimited Productivity.

Version 1.0 — Built for Claude Code — December 2025

Quoting OpenAI Codex CLI

Simon Willison
simonwillison.net
2025-12-13 03:47:43
How to use a skill (progressive disclosure): After deciding to use a skill, open its SKILL.md. Read only enough to follow the workflow. If SKILL.md points to extra folders such as references/, load only the specific files needed for the request; don't bulk-load everything. If scripts/ exist, prefer...
Original Article

How to use a skill (progressive disclosure):

  1. After deciding to use a skill, open its SKILL.md . Read only enough to follow the workflow.
  2. If SKILL.md points to extra folders such as references/ , load only the specific files needed for the request; don't bulk-load everything.
  3. If scripts/ exist, prefer running or patching them instead of retyping large code blocks.
  4. If assets/ or templates exist, reuse them instead of recreating from scratch.

Description as trigger: The YAML description in SKILL.md is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.

OpenAI Codex CLI, core/src/skills/render.rs, full prompt

Friday Nite Videos | December 12, 2025

Portside
portside.org
2025-12-13 03:25:09
Friday Nite Videos | December 12, 2025 barry Fri, 12/12/2025 - 22:25 ...
Original Article

Friday Nite Videos | December 12, 2025

Surveillance Pricing: The Invisible Way We're Being Gouged. Know Your Rights When Dealing With ICE. Why Gerrymandering Won't Save Republicans. Brad Lander for Congress. Rescuing the Internet From “Enshittification” | The Daily Show.

Portside

Google Removes Sci-Hub Domains from U.S. Search Results Due to Dated Court Order

Hacker News
torrentfreak.com
2025-12-13 03:21:32
Comments...
Original Article

Home > Anti-Piracy > Site Blocking >

Google has removed dozens of new Sci-Hub domain names from its search results in the United States. Unlike typical DMCA takedowns, the removals were triggered by a dated court order that was not enforced for several years. This appears to be one of the first times Google has deindexed an entire pirate site in the U.S. based on a 'site blocking' style injunction.

In 2017, American Chemical Society (ACS), a leading source of academic publications in the field of chemistry, won a lawsuit against Sci-Hub and its operator, Alexandra Elbakyan.

The ‘Pirate Bay of Science’ had failed to appear at a Virginia federal court, resulting in an easy win for the publisher and a $4.8 million default judgment award for damages.

A Broad Anti-Piracy Injunction (2018)

More important, perhaps, was the broad permanent injunction that the Virginia federal court signed off on in 2017 . This order effectively gave ACS free rein to take down existing and newly registered Sci-Hub domain names.

The injunction also required all parties “in active concert or participation” with Sci-Hub to “cease facilitating access” to these domain names, including search engines, hosting providers, ISPs, and domain name registrars, the order clarified.

From the 2018 injunction


On paper, this injunction enabled ACS to request American ISPs and search engines to ‘block’ existing and future Sci-Hub domains. However, there was no sign that the publisher was doing so. Aside from a few suspended domains, Sci-Hub remained widely accessible.

Whether ACS did not feel the need to enforce the order against search engines and other intermediaries or if these companies actively objected to the requested actions was unknown. And as time passed, the injunction became a distant memory, at least for a few years.

Google Complies with Zombie Injunction? (2025)

Earlier this week we spotted a unique request in the Lumen Database , where the 2018 injunction was cited. The notice in question asks Google to deindex 34 (sub)domains linked to Sci-Hub.

None of these domains were referenced in the 2018 injunction but are indeed linked to Sci-Hub. Many of the partially redacted domains appear to be domain variations of the scihubtw.tw mirror network, such as edu.scihubtw.tw and freeus.scihubtw.tw.

Court order notice


It’s surprising to see this type of enforcement seven years after the injunction was issued, but the request is legitimate. Google is certainly taking it seriously and has deindexed these domains from its search results in America. In other countries, the same domains remain accessible.

First “US-Only” Sci-Hub Removals

The December 2 notice was sent by UK law firm Wiggin LLP , which sent a similar request in September this year, targeting a few dozen other Sci-Hub domains. In total, we spotted seven notices, with the earliest dating back to 2022.

The results of these removals are also clearly visible in Google search. Those who search for Sci-Hub in the U.S. will see the following notice at the bottom of the results.

Removed by legal request


It’s not clear why it took five years before ACS urged Google to take action in response to the injunction. However, these removals are similar to Google’s removal of pirate site domains in other countries in response to ISP-blocking orders. Voluntary cooperation by Google was uncovered shortly before ACS first notified the search engine.

“In Active Concert”?

Google’s voluntary cooperation with ISP blocking orders in Australia, the Netherlands, France, the UK, and elsewhere also brings up an important question. Is Google cooperating with the permanent injunction in the U.S. because it feels legally compelled to do so, or is that a voluntary gesture too?

The 2018 injunction requires all parties “in active concert or participation” with Sci-Hub to take action. While search engines are mentioned as an example, Google and other tech companies have previously argued that neutral third-party services are not necessarily “in active concert or participation”.

It is likely that Google maintains this stance, opting to voluntarily comply with orders targeting other third parties. That would mirror its response to site-blocking orders elsewhere.

We contacted Google hoping to hear answers to these questions, but the company did not respond to our request for comment.

Cycle-accurate YM2149 PSG emulator

Lobsters
github.com
2025-12-13 03:15:36
Comments...
Original Article

YM2149-RS

The most complete YM2149/AY-3-8910 ecosystem in Rust.

License: MIT

What is the YM2149?

The Yamaha YM2149 (and its compatible sibling, the General Instrument AY-3-8910 ) is a Programmable Sound Generator (PSG) — a dedicated audio chip that defined the sound of an entire computing era.

Three square-wave channels. One noise generator. Hardware envelopes. Pure 8-bit/16-bit retro soul.

If you've ever heard music from an Atari ST , Amstrad CPC , ZX Spectrum 128 , MSX , or countless arcade machines from the 1980s/90s, you've heard this chip. It powered everything from game soundtracks to the legendary European demoscene, where programmers pushed (and still push) these simple waveforms to create surprisingly complex and powerful music.

The YM2149 doesn't do wavetables or samples (mostly). It doesn't do FM synthesis. What it does is generate raw, characterful square waves with programmable frequencies, a shared noise source, and distinctive hardware envelopes — all mixed through a logarithmic DAC that gives it that unmistakable warm, buzzy, chiptune sound.

This crate brings that sound to Rust — cycle-accurate, format-complete, and ready for your emulator, game, or nostalgia project.
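
To make “programmable frequencies” concrete, here is a small Rust sketch of my own (not code from this crate): each tone channel divides the master clock by a 12-bit period register, and the output square wave’s frequency follows from that divider. The constants assume the common 2 MHz PSG clock of the Atari ST and the usual f = clock / (16 × period) relation for the AY/YM family; the function name is illustrative.

// Sketch only: how a YM2149-style tone channel maps its 12-bit period
// register to an output frequency. Names and constants are illustrative,
// not taken from the ym2149 crate.

const MASTER_CLOCK_HZ: f64 = 2_000_000.0; // 2 MHz, as on the Atari ST

fn tone_frequency(period: u16) -> f64 {
    // Only 12 bits of the period register are used; a period of 0
    // behaves like 1 on the real chip.
    let period = (period & 0x0FFF).max(1) as f64;
    MASTER_CLOCK_HZ / (16.0 * period)
}

fn main() {
    // A period of 284 lands close to concert A (440 Hz) on a 2 MHz clock.
    println!("period  284 -> {:.1} Hz", tone_frequency(284));
    // Larger periods give lower notes; 4095 is the lowest tone available.
    println!("period 4095 -> {:.1} Hz", tone_frequency(4095));
}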

Why YM2149-RS?

For Demoscene Enthusiasts & Chiptune Artists: Play back your entire collection of YM, SNDH, AY, and Arkos Tracker files with authentic sound reproduction — in the terminal, browser, or your next retro-inspired game.

For Game Developers: Drop authentic PSG audio into Bevy games with a single plugin. Playlists, crossfades, visualizations, and audio-reactive gameplay hooks included.

For Emulator Authors: A clean, well-tested YM2149 core with configurable backends. Integrate the chip into your Atari ST, CPC, or custom system emulator.

For the Curious: Explore how classic sound chips work. The codebase is documented, tested, and designed to be readable.

What Makes This Special

| Feature | Description |
| --- | --- |
| Cycle-Accurate Core | Precise emulation of all PSG features — envelopes, noise, mixer, SID voice, Sync Buzzer, and digi-drum effects |
| Multi-PSG Emulation | Run multiple YM2149 chips in parallel — natively supported via Arkos Tracker format for authentic dual/triple-chip music |
| Seven Format Replayers | YM (1-6), YMT1/YMT2, GIST (.snd), Arkos Tracker (.aks), ZXAY/EMUL (.ay), and SNDH with full 68000 CPU emulation |
| Zero-Compromise Bevy Integration | Not a wrapper around C code — pure Rust from chip to speaker |
| Runs Everywhere | CLI, native apps, WASM browser player, Bevy games — same codebase |
| Production-Ready | 165+ tests, documented APIs, real-world demoscene fixtures |

(Badge matrix: crates.io, docs.rs, and npm badges for the workspace crates ym2149, ym2149-common, ym2149-ym-replayer, ym2149-arkos-replayer, ym2149-ay-replayer, ym2149-sndh-replayer, ym2149-gist-replayer, ym2149-wasm, bevy_ym2149, bevy_ym2149_viz, and ym2149-bevy; see the Workspace Packages table below for links.)

Cycle-accurate Yamaha YM2149 tooling for Rust — from raw PSG emulation and YM/YMT/SNDH importers to Arkos Tracker playback, CLI/export pipelines, Bevy integrations, visualization stacks, and a one-click WASM demo.

Quick Links
▶️ Web Player Cycle-accurate YM/AKS demo in the browser
📦 npm Package WebAssembly module for browser integration
🧱 Architecture Layered breakdown of emulator, replayers, and integrations
🧭 Quick Start Code snippets for core, CLI, Bevy, and exports
🆕 Changelog Recent features and compatibility notes

At a Glance

  • 🧠 Core Emulator: Integer-accurate PSG, YM1–YM6 & tracker helpers
  • 🪕 Audio Pipelines: Streaming playback, WAV export, playlist automation
  • 🕹️ Game & Bevy: Plug-and-play Bevy plugins with diagnostics, viz, playlists
  • 🌐 Browser Ready: WASM player (147 KB) with LHA support & drag-drop
  • 📦 Monorepo Cohesion: Shared versioning, unified docs, cross-crate tests
  • 🧪 Quality: 165+ tests, curated fixtures, demoscene examples

🎵 Try it in Your Browser

► Launch Web Player

Experience authentic Atari ST chiptune music directly in your browser! The WebAssembly player features:

  • ✨ Full YM2-YM6 and SNDH format support with LHA/ICE decompression
  • 🎮 Play/Pause/Stop controls with progress bar
  • 🔊 Volume control and channel muting (A/B/C)
  • 📊 Real-time metadata display
  • 📦 Compact WASM module
  • 🎯 Cycle-accurate YM2149 emulation (ported from Leonard/Oxygene's AtariAudio )
📸 Web Player Preview

Try it live: slippyex.github.io/ym2149-rs

Retro CRT-style interface with drag & drop file loading

Workspace Packages

Crate Purpose Crates.io Docs
ym2149 Core YM2149 chip emulator (cycle-accurate) crates.io/crates/ym2149 docs.rs/ym2149
ym2149-common Shared traits ( ChiptunePlayer , PlaybackMetadata ) and types crates.io/crates/ym2149-common docs.rs/ym2149-common
ym2149-ym-replayer YM file parsing and music playback (YM1-YM6, YMT1/YMT2 tracker) crates.io/crates/ym2149-ym-replayer docs.rs/ym2149-ym-replayer
ym2149-replayer-cli Standalone CLI player with streaming and export Unpublished (workspace)
ym2149-softsynth Experimental software synthesizer backend (proof-of-concept) Unpublished (workspace) crates/ym2149-softsynth/README.md
ym2149-arkos-replayer Arkos Tracker 2/3 (.aks) parser and native multi-PSG player (pure Rust) crates.io/crates/ym2149-arkos-replayer docs.rs/ym2149-arkos-replayer
ym2149-ay-replayer ZXAY/EMUL AY file parser with integrated Z80 replayer crates.io/crates/ym2149-ay-replayer docs.rs/ym2149-ay-replayer
ym2149-sndh-replayer SNDH (Atari ST) player with 68000 CPU + MFP timer + STE DAC emulation crates.io/crates/ym2149-sndh-replayer docs.rs/ym2149-sndh-replayer
ym2149-gist-replayer GIST sound effect parser and multi-voice player (Atari ST) crates.io/crates/ym2149-gist-replayer docs.rs/ym2149-gist-replayer
bevy_ym2149 Bevy audio plugin (playback, playlists, diagnostics, audio bridge) crates.io/crates/bevy_ym2149 docs.rs/bevy_ym2149
bevy_ym2149_viz Optional visualization systems & UI builders crates.io/crates/bevy_ym2149_viz docs.rs/bevy_ym2149_viz
bevy_ym2149_examples Runnable Bevy demos (basic, advanced, crossfade, feature showcase, demoscene, playlist UI) Workspace-only crates/bevy_ym2149_examples/README.md
ym2149-wasm WebAssembly bindings for browser playback ( web demo ) npmjs.com/package/ym2149-wasm crates/ym2149-wasm/README.md
ym2149-bevy Legacy re-export (shim to bevy_ym2149 ) crates.io/crates/ym2149-bevy

Naming: Bevy-focused crates follow bevy_ym2149_* , while core/backends/replayers use the ym2149-* prefix.

Advanced Bevy example

Highlights

  • Hardware-faithful : cycle-accurate YM2149 emulation (ported from Leonard/Oxygene's AtariAudio ), precise envelope, noise, mixer, SID, Sync Buzzer, digi-drum behaviours
  • 📁 ZXAY/EMUL AY : bundled replayer with Z80 CPU emulation for the Project AY catalogue
  • 🎹 SNDH support : native Atari ST music via 68000 CPU + MFP 68901 timer + STE DAC emulation
  • 🧰 CLI ready : stream YM/AKS/AY/SNDH files in the terminal with real-time visualization
  • 🎵 Native Bevy audio : seamless integration via Decodable trait with pull-based sample generation
  • 🛰️ Configurable Bevy subsystems : playlists, crossfade decks, music state graphs, channel events, diagnostics, audio bridge
  • 🖼️ Visualization stack : drop-in oscilloscope, spectrum bars, progress HUD, and demoscene showcase based on the viz crate
  • 🧪 Well-tested : cargo test --workspace (165+ tests) plus example scenes to validate runtime flows
  • 🪄 Gameplay hooks : Bevy plugin ships marker events, audio-reactive metrics, and PSG one-shot SFX events

Why Arkos Tracker Support?

Arkos Tracker is the de-facto “modern” workflow for YM2149/AY musicians: it blends a classic step-sequencer with a visual instrument designer, supports multiple PSGs per song, and lets composers mix hardware envelopes with software macros. Native support matters because:

  • Multi-PSG music – Arkos sequences can target two or more AY chips; our replayer handles that natively, both in the CLI and Bevy.
  • Modern authoring tools – Musicians can stay in the Arkos editor (PC/Mac) and drop the .aks export straight into any crate in this repo—no external tracker runtime or C++ bridge required.
  • Feature parity – Hardware effects (Sync Buzzer, DigiDrum, SID), custom arps, and per-channel envelopes all map to the same PSG core shared with YM/AY playback.
  • Cross-target builds – The same Rust replayer powers desktop, web (WASM), and Bevy integrations, so Arkos rips behave identically everywhere.

In short: Arkos lets artists work with modern ergonomics, and this workspace lets those songs run anywhere Rust does.

Quick Start

Use the Core Library

[dependencies]
# Core emulator only (minimal dependencies)
ym2149 = "0.7"

# With streaming audio output
ym2149 = { version = "0.7", features = ["streaming"] }

# YM file parsing and playback
ym2149-ym-replayer = "0.7"

use ym2149_ym_replayer::{load_song, ChiptunePlayer, ChiptunePlayerBase, PlaybackMetadata};

fn main() -> anyhow::Result<()> {
    let data = std::fs::read("song.ym")?;
    let (mut player, summary) = load_song(&data)?;

    // Use the unified ChiptunePlayerBase interface for playback
    player.play();
    let samples = player.generate_samples(summary.samples_per_frame as usize);

    // Access metadata via ChiptunePlayer trait (extends ChiptunePlayerBase)
    let meta = player.metadata();
    println!("{} by {} • {} frames", meta.title(), meta.author(), summary.frame_count);
    Ok(())
}

Run the CLI Player

# Real-time playback with scope overlay
cargo run -p ym2149-replayer-cli -- examples/ym/ND-Toxygene.ym

# Play SNDH files from the Atari ST demoscene
cargo run -p ym2149-replayer-cli -- examples/sndh/Mad_Max/Buzzer.sndh

# Play GIST sound effects (.snd)
cargo run -p ym2149-gist-replayer --example player -- examples/gist/alien.snd

# Interactive demo with Bevy visualization
cargo run -p bevy_ym2149_examples --example basic_example

CLI player

Export to Audio Files

use ym2149_ym_replayer::{load_song, export::export_to_wav_default, export::ExportConfig};

fn main() -> anyhow::Result<()> {
    let data = std::fs::read("song.ym")?;
    let (mut player, info) = load_song(&data)?;

    // Export to WAV (feature: export-wav)
    export_to_wav_default(&mut player, info, "output.wav")?;

    Ok(())
}

Note: MP3 export was removed because the system-dependent LAME/Autotools toolchain proved too brittle. Export WAV instead and transcode externally (e.g. ffmpeg -i output.wav -b:a 192k output.mp3 ).

Add the Bevy Plugin

use bevy::prelude::*;
use bevy_ym2149::{Ym2149Playback, Ym2149Plugin};
use bevy_ym2149_viz::Ym2149VizPlugin;

fn main() {
    App::new()
        .add_plugins((DefaultPlugins, Ym2149Plugin::default(), Ym2149VizPlugin::default()))
        .add_systems(Startup, |mut commands: Commands| {
            commands.spawn(Camera2d);
            commands.spawn(Ym2149Playback::new("assets/music/song.ym")).insert(Name::new("Tracker"));
        })
        .run();
}

Need a reference scene? cargo run --example advanced_example -p bevy_ym2149_examples. Want to try the browser demo? Open https://slippyex.github.io/ym2149-rs/web/simple-player.html (auto-built via GitHub Pages).

Where to Find Music Files

Looking for chiptunes to play? These community archives have thousands of tracks:

Archive Format Description
SNDH Archive .sndh The definitive Atari ST music collection — demoscene classics, game soundtracks, and more
ST-Sound / Leonard .ym Curated YM archive by Leonard/Oxygene with high-quality rips
Project AY .ay ZX Spectrum and Amstrad CPC music archive
Arkos Tracker 3 .aks Source repository with example songs and the tracker itself

Documentation & Guides

  • crates/ym2149-core/README.md – emulator architecture, feature flags, CLI/export instructions
  • crates/bevy_ym2149/README.md – plugin subsystems, playlists, music state graph, audio bridge, diagnostics
  • crates/bevy_ym2149_viz/README.md – visualization builders and systems
  • crates/bevy_ym2149_examples/README.md – example matrix + screenshot gallery (incl. playlist crossfade UI)
  • ARCHITECTURE.md – YM + Arkos playback pipelines and layering details
  • crates/ym2149-core/STREAMING_GUIDE.md – low-latency streaming details
  • examples/ – curated list of .ym , .aks , .ay , and .sndh files for regression tests and the wasm demo

Need to refresh the wasm demo bundle? Run scripts/build-wasm-examples.sh from the repo root to rebuild via wasm-pack and copy the output into crates/ym2149-wasm/examples/pkg/ .

Testing

# Entire workspace
cargo test --workspace

# Focus a crate
cargo test -p ym2149
cargo test -p bevy_ym2149

# Feature-specific tests
cargo test -p ym2149 --features streaming

Development Prerequisites

  • Rust 1.83+ (Rust 2024 edition) with cargo and rustfmt
  • Audio backend libraries for CPAL/Rodio (ALSA/PulseAudio, CoreAudio, WASAPI, etc.) when testing real-time playback
  • AY playback: ZX-only, firmware calls are unsupported (CPC/ROM-heavy AY files will be rejected)
  • Optional tooling:
    • wasm-pack for building the web player
    • node / npm or python -m http.server for serving the WASM demo locally
    • Bevy’s native dependencies (Vulkan/Metal/DX) when running the example scenes
    • cargo-make / just if you use the provided helper scripts (optional)

Project Structure

ym2149-rs/
├── crates/
│   ├── ym2149-core/            # Core YM2149 chip emulator (crates.io `ym2149`)
│   ├── ym2149-common/          # Shared traits (ChiptunePlayer, PlaybackMetadata) and types
│   ├── ym2149-softsynth/       # Experimental soft synth backend implementing the backend trait
│   ├── ym2149-ym-replayer/     # YM parser + playback engine
│   ├── ym2149-arkos-replayer/  # Arkos Tracker (.aks) parser/player
│   ├── ym2149-ay-replayer/     # ZXAY/EMUL parser + Z80 runner (ZX-only; CPC AY rejected)
│   ├── ym2149-sndh-replayer/   # SNDH player with 68000 CPU + MFP timer + STE DAC emulation
│   ├── ym2149-gist-replayer/   # GIST sound effect parser and multi-voice player
│   ├── ym2149-replayer-cli/    # Terminal streamer/exporter built on the replayers
│   ├── ym2149-wasm/            # WASM bindings + browser demo
│   ├── bevy_ym2149/            # Bevy plugin (playback, playlists, crossfade, diagnostics)
│   ├── bevy_ym2149_viz/        # Optional visualization ECS systems
│   ├── bevy_ym2149_examples/   # Runnable Bevy app gallery
│   └── ym2149-bevy/            # Legacy shim that re-exports `bevy_ym2149`
├── examples/                   # YM/SNDH sample files
├── docs/                       # Web player (GitHub Pages)
├── Cargo.toml                  # Workspace configuration
└── README.md                   # You are here

Contributing

Contributions are welcome! Please ensure:

  • cargo fmt + cargo clippy
  • cargo test --workspace
  • Documentation and examples updated for new features

License

MIT License – see LICENSE .

Credits

  • Leonard/Oxygene (Arnaud Carré) – YM format specification, ST-Sound reference material, and the AtariAudio C++ implementation that forms the basis of our YM2149 core emulation
  • Atari ST + demoscene community – for the original tunes, SNDH archive, and documentation
  • Rust audio and Bevy ecosystems – rodio/cpal, Bevy ECS, and community inspiration

Oliver Sacks fabricated key details in his books

Hacker News
boingboing.net
2025-12-13 03:12:47
Comments...

Oliver sacks put himself into his case studies. What was the cost?

Hacker News
www.newyorker.com
2025-12-13 03:12:47
Comments...
Original Article

When Oliver Sacks arrived in New York City, in September, 1965, he wore a butter-colored suit that reminded him of the sun. He had just spent a romantic week in Europe travelling with a man named Jenö Vincze, and he found himself walking too fast, fizzing with happiness. “My blood is champagne,” he wrote. He kept a letter Vincze had written him in his pocket all day, feeling as if its pages were glowing. Sacks had moved to New York to work as a fellow in neuropathology at the Albert Einstein College of Medicine, in the Bronx, and a colleague observed that he was “walking on air.” Every morning, he carefully polished his shoes and shaved. He adored his bosses. “I smile like a lighthouse in all directions,” he wrote Vincze.

Sacks was thirty-two, and he told Vincze that this was his first romantic relationship that was both physical and reciprocal. He felt he was part of a “two man universe,” seeing the world for the first time—“seeing it clear, and seeing it whole.” He wandered along the shipping piers on the Hudson River, where gay men cruised, with a notebook that he treated as a diary and as an endless letter to Vincze. “To watch life with the eyes of a homosexual is the greatest thing in the world,” Vincze had once told Sacks.

Sacks’s mother, a surgeon in London, had suspected that her son was gay when he was a teen-ager. She declared that homosexuality was an “abomination,” using the phrase “filth of the bowel” and telling him that she wished he’d never been born. They didn’t speak of the subject again. Sacks had moved to America—first to California and then, after five years, to New York—because, he wrote in his journal, “I wanted a sexual and moral freedom I felt I could never have in England.” That fall, during Yom Kippur, he decided that, rather than going to synagogue to confess “to the total range of human sin,” a ritual he’d grown up with, he’d spend the night at a bar, enjoying a couple of beers. “What I suppose I am saying, Jenö, is that I now feel differently about myself, and therefore about homosexuality as a whole,” he wrote. “I am through with cringing, and apologies, and pious wishes that I might have been ‘normal.’ ” (The Oliver Sacks Foundation shared with me his correspondence and other records, as well as four decades’ worth of journals—many of which had not been read since he wrote them.)

In early October, Sacks sent two letters to Vincze, but a week passed without a reply. Sacks asked his colleagues to search their mailboxes, in case the letter had been put in the wrong slot. Within a few days, however, he had given up on innocent explanations. He began dressing sloppily. He stopped coming to work on time. He had sex with a series of men who disgusted him.

After two weeks, Vincze, who was living in Berlin, sent a letter apologizing for his delayed reply and reiterating his love. He explained that he was so preoccupied by thoughts of Sacks that he felt as if he were living in a “Klaudur,” a German word that Vincze defined as a “spiritual cell.” He seems to have misspelled Klausur , which refers to an enclosed area in a monastery, but Sacks kept using the misspelled word, becoming obsessed with it. “It ramifies in horrible associations,” he wrote Vincze. “The closing of a door. Klaudur, claustrophobia, the sense of being shut in.” Sacks had long felt as if he were living in a cell, incapable of human contact, and this word appeared to be all he needed to confirm that the condition was terminal. The meaning of the word began morphing from “spiritual cell” to “psychotic cage.”


The intimacy Sacks had rejoiced in now seemed phony, a “folie à deux”—a two-person delusion. His doubts intensified for a month, then he cut off the relationship. “I must tear you out of my system, because I dare not be involved ,” he told Vincze, explaining that he barely remembered how he looked, or the sound of his voice. “I hope I will not be taken in like this again, and that—conversely—I will have the strength and clarity of mind to perceive any future such relationships as morbid at their inception, and to abort the folly of their further growth.”

Two months later, Sacks felt himself “slipping down the greased path of withdrawal, discontent, inability to make friends, inability to have sex, etc. etc. towards suicide in a New York apartment at the age of 32.” He took enormous amounts of amphetamines, to the point of hallucinating. A family friend, a psychiatrist who worked with Anna Freud, urged him to find a psychoanalyst. She wrote him that his homosexuality was “a very ‘secondary phenomenon’ ”: he was attracted to men as “a substitute for veering uncertainties of what/whom you could love other than as ‘idealizations’ of yourself.” A few weeks later, he started therapy with Leonard Shengold, a young psychiatrist who was deeply immersed in Manhattan’s psychoanalytic culture. “I think he is very good, and he has at least a very considerable local reputation,” Sacks wrote his parents, who helped to pay for the sessions, three times a week.

Sacks had elevated yet hazy ambitions at the time: he wanted to be a novelist, but he also wanted to become the “Galileo of the inward,” he told a mentor, and to write the neurological equivalent of Sigmund Freud’s “ Interpretation of Dreams .” He worked in wards with chronically ill and elderly patients who had been warehoused and neglected, and his prospects within academic medicine looked dim. “Have you published anything lately?” his father wrote him, in 1968. “Or have you found yourself temperamentally incapacitated from doing so?”

When Sacks began therapy, “my initial and ultimate complaint was of fixity —a feeling of not-going ,” he wrote in his journal. He regarded Shengold as “a sort of analytic machine.” But gradually Sacks came to feel that “I love him, and need him; that I need him—and love him.” He had planned to stay in New York City only for a few years, but he kept delaying his return to England so that he could reach “a terminable point in my analysis.” Shengold, who would eventually publish ten books about psychoanalysis, wrote that therapy requires a “long period of working through”—a term he defined as the “need to repeat emotional conflicts over and over in life” until the patient has the “freedom to own what is there to be felt.”

Sacks saw Shengold for half a century. In that time, Sacks became one of the world’s most prominent neurologists and a kind of founding father of medical humanities—a discipline that coalesced in the seventies, linking healing with storytelling. But the freedom that Shengold’s analysis promised was elusive. After Vincze, Sacks did not have another relationship for forty-four years. He seemed to be doing the “working through” at a remove—again and again, his psychic conflicts were displaced onto the lives of his patients. He gave them “some of my own powers , and some of my phantasies too,” he wrote in his journal. “I write out symbolic versions of myself.”

During Sacks’s neurology internship, in San Francisco, his childhood friend Eric Korn warned him that the residents at his hospital could sense he was gay. “For God’s sake, exercise what seems to you immoderate caution,” Korn wrote, in 1961. “Compartmentalize your life. Cover your tracks. Don’t bring in the wrong sort of guests to the hospital, or sign your name and address to the wrong sort of register.” He encouraged Sacks to read “Homosexuality: Disease or Way of Life?,” a best-selling book by Edmund Bergler, who argued that homosexuality was an “illness as painful, as unpleasant and as disabling as any other serious affliction,” but one that psychoanalysis could cure. “The book is full of interest,” Korn wrote. “He claims a potential 100% ‘cures’ (a term he chooses to employ because he knows it teases) which is worth investigating perhaps.”

Freud characterized homosexuality as a relatively normal variant of human behavior, but when psychoanalysis came to the United States, in the postwar years, homophobia took on new life. The historian Dagmar Herzog has described how, in the U.S., “reinventing psychoanalysis and reinventing homophobia went hand in hand.” Faced with men who persisted in their love for other men, American analysts commonly proposed celibacy as a stopgap solution. In the historian Martin Duberman’s memoir “ Cures ,” he writes that his psychoanalyst instructed him to “take the veil”—live celibately—so that he could be cured of his desire for men. Duberman agreed to these terms. The best he could get, he thought, was sublimation: instead of enjoying an “affective life,” he would make “some contribution to the general culture from which I was effectively barred.” Sacks, who was closeted until he was eighty, also followed this course.

Shengold had portraits of Charles Dickens, William Shakespeare, and Sigmund Freud in his office, on the Upper East Side. Like Sacks, he came from a literary Jewish family. He seemed deeply attuned to Sacks’s creative life, which took the form of ecstatic surges of literary inspiration followed by months of sterility and depression. “Do your best to enjoy and to work—it is the power of your mind that is crucial ,” Shengold wrote when Sacks was on a visit with his family in England. Sacks wrote in his journal that he’d dreamed he overheard Shengold telling someone, “Oliver is lacking in proper self-respect; he has never really appreciated himself, or appreciated others’ appreciation of him. And yet, in his way, he is not less gifted than Auden was.” Sacks woke up flushed with embarrassment and pleasure.


Sacks in 1987. He became the modern master of the case study. “I write out symbolic versions of myself,” he wrote. Photograph by Lowell Handler

Unlike many of his contemporaries, Shengold was not a doctrinaire thinker, but he was still susceptible to psychoanalytic fashions. Reflecting on how he might have viewed living openly as a gay man at that time, Shengold’s daughter, Nina, told me, “I don’t know that was a door that Dad necessarily had wide open.” In several books and papers, Shengold, a prolific reader of Western literature, tried to understand the process by which troubled people sublimate their conflicts into art. In his 1988 book, “ Halo in the Sky: Observations on Anality and Defense ,” Shengold wrote about the importance of transforming “anal-sadistic drives”—he used the anus as a metaphor for primitive, dangerous impulses—into “adaptive and creative ‘making.’ ” When Sacks read the book, he wrote in his journal that it “made me feel I was ‘lost in anality’ (whatever this means).”

Before Vincze, Sacks had been in love with a man named Mel Erpelding, who once told him, Sacks wrote, that he “oozed sexuality, that it poured out through every pore, that I was alive and vibrant with sexuality (a positive-admiring way of putting things), but also that I was reeking and toxic with it.” (Erpelding, who ended up marrying a woman, never allowed his relationship with Sacks to become sexual.) In his early years of therapy, in the late sixties, Sacks resolved that he would give up both drugs and sex. It’s doubtful that Shengold encouraged his celibacy, but he may have accepted that sexual abstinence could be productive, at least for a time. Richard Isay, the first openly gay member of the American Psychoanalytic Association, said that, in the seventies, he’d “rationalized that maturity and mental health demanded the sublimation of sexual excitement in work.” Sacks told a friend, “Shengold is fond of quoting Flaubert’s words ‘the mind has its erections too.’ ”

For Sacks, writing seemed almost physiological, like sweating—an involuntary response to stimuli. He routinely filled a whole journal in two days. “Should I then put down my pen , my interminable Journal (for this is but a fragment of the journal I have kept all my life),” he asked, “and ‘start living’ instead?” The answer was almost always no. Sometimes Sacks, who would eventually publish sixteen books, wrote continuously in his journal for six hours. Even when he was driving his car, he was still writing—he set up a tape recorder so that he could keep developing his thoughts, which were regularly interrupted by traffic or a wrong turn. Driving through Manhattan one day in 1975, he reflected on the fact that his closets, stuffed with pages of writing, resembled a “grave bursting open.”

By the late sixties, Sacks had become, he wrote, “almost a monk in my asceticism and devotion to work.” He estimated that he produced a million and a half words a year. When he woke up in the middle of the night with an erection, he would cool his penis by putting it in orange jello. He told Erpelding, “I partly accept myself as a celibate and a cripple, but partly—and this is . . . the wonder of sublimation—am able to transform my erotic feelings into other sorts of love—love for my patients, my work, art, thought.” He explained, “I keep my distance from people, am always courteous, never close. For me (as perhaps for you) there is almost no room, no moral room.”

“I have some hard ‘confessing’ to do—if not in public, at least to Shengold—and myself,” Sacks wrote in his journal, in 1985. By then, he had published four books—“ Migraine ,” “ Awakenings ,” “ A Leg to Stand On ,” and “ The Man Who Mistook His Wife for a Hat ”—establishing his reputation as “our modern master of the case study,” as the Times put it. He rejected what he called “pallid, abstract knowing,” and pushed medicine to engage more deeply with patients’ interiority and how it interacted with their diseases. Medical schools began creating programs in medical humanities and “narrative medicine,” and a new belief took hold: that an ill person has lost narrative coherence, and that doctors, if they attend to their patients’ private struggles, could help them reconstruct a new story of their lives. At Harvard Medical School, for a time, students were assigned to write a “book” about a patient. Stories of illness written by physicians (and by patients) began proliferating, to the point that the medical sociologist Arthur Frank noted, “ ‘Oliver Sacks’ now designates not only a specific physician author but also a . . . genre—a distinctively recognizable form of storytelling.”

But, in his journal, Sacks wrote that “a sense of hideous criminality remains (psychologically) attached” to his work: he had given his patients “powers (starting with powers of speech) which they do not have.” Some details, he recognized, were “pure fabrications.” He tried to reassure himself that the exaggerations did not come from a shallow place, such as a desire for fame or attention. “The impulse is both ‘purer’—and deeper,” he wrote. “It is not merely or wholly a projection —nor (as I have sometimes, ingeniously-disingenuously, maintained) a mere ‘sensitization’ of what I know so well in myself. But (if you will) a sort of autobiography .” He called it “ symbolic ‘exo-graphy.’ ”

Sacks had “misstepped in this regard, many many times, in ‘Awakenings,’ ” he wrote in another journal entry, describing it as a “source of severe, long-lasting, self-recrimination.” In the book , published in 1973, he startled readers with the depth of his compassion for some eighty patients at Beth Abraham Hospital, in the Bronx, who had survived an epidemic of encephalitis lethargica, a mysterious, often fatal virus that appeared around the time of the First World War. The patients had been institutionalized for decades, in nearly catatonic states. At the time, the book was met with silence or skepticism by other neurologists—Sacks had presented his findings in a form that could not be readily replicated, or extrapolated from—but, to nonspecialists, it was a masterpiece of medical witnessing. The Guardian would name it the twelfth-best nonfiction book of all time.


Sacks spent up to fifteen hours a day with his patients, one of the largest groups of post-encephalitic survivors in the world. They were “mummified,” like “living statues,” he observed. A medicine called L-dopa, which elevates the brain’s dopamine levels, was just starting to be used for Parkinson’s disease, on an experimental basis, and Sacks reasoned that his patients, whose symptoms resembled those of Parkinson’s, could benefit from the drug. In 1969, within days of giving his patients the medication, they suddenly “woke up,” their old personalities intact. Other doctors had dismissed these patients as hopeless, but Sacks had sensed that they still had life in them—a recognition that he understood was possible because he, too, felt as if he were “buried alive.”

In “ Awakenings ,” Sacks writes about his encounters with a man he calls Leonard L. “What’s it like being the way you are?” Sacks asks him the first time they meet. “Caged,” Leonard replies, by pointing to letters of the alphabet on a board. “Deprived. Like Rilke’s ‘Panther’ ”—a reference to a poem by Rainer Maria Rilke about a panther pacing repetitively in cramped circles “around a center / in which a mighty will stands paralyzed.”

When Sacks was struggling to write his first book, “ Migraine ,” he told a friend that he felt like “Rilke’s image of the caged panther, stupefied, dying, behind bars.” In a letter to Shengold, he repeated this image. When Sacks met Leonard, he jotted down elegant observations in his chart (“Quick and darting eye movements are at odds with his general petrified immobility”), but there is no mention of Leonard invoking the Rilke poem.

In the preface to “ Awakenings ,” Sacks acknowledges that he changed circumstantial details to protect his patients’ privacy but preserved “what is important and essential—the real and full presence of the patients themselves.” Sacks characterizes Leonard as a solitary figure even before his illness: he was “continually buried in books, and had few or no friends, and indulged in none of the sexual, social, or other activities common to boys of his age.” But, in an autobiography that Leonard wrote after taking L-dopa, he never mentions reading or writing or being alone in those years. In fact, he notes that he spent all his time with his two best friends—“We were inseparable,” he writes. He also recalls raping several people. “We placed our cousin over a chair, pulled down her pants and inserted our penises into the crack,” he writes on the third page, in the tone of an aging man reminiscing on better days. By page 10, he is describing how, when he babysat two girls, he made one of them strip and then “leaped on her. I tossed her on her belly and pulled out my penis and placed it between her buttocks and started to screw her.”


Leonard Shengold, Sacks’s psychoanalyst. Photograph courtesy Nina Shengold

In “ Awakenings ,” Sacks has cleansed his patient’s history of sexuality. He depicts him as a man of “most unusual intelligence, cultivation, and sophistication”—the “ ‘ideal’ patient.” L-dopa may have made Leonard remember his childhood in a heightened sexual register—his niece and nephew, who visited him at the hospital until his death, in 1981, told me that the drug had made him very sexual. But they said that he had been a normal child and adolescent, not a recluse who renounced human entanglement for a life of the mind.

Sacks finished writing “Awakenings” rapidly in the weeks after burying his mother, who’d died suddenly, at the age of seventy-seven. He felt “a great open torrent—and release ,” he wrote in his journal. “It seems to be surely significant that ‘Awakenings’ finally came forth from me like a cry after the death of my own mother.” He referred to the writing of the book as his “Great Awakening,” the moment he “came out.” He doesn’t mention another event of significance: his patients had awakened during the summer of the Stonewall riots, the beginning of the gay-rights movement.

Shengold once told Sacks that he had “never met anyone less affected by gay liberation.” (Shengold supported his own son when he came out as gay, in the eighties.) Sacks agreed with the characterization. “I remain resolutely locked in my cell despite the dancing at the prison gates,” he said, in 1984.

In “Awakenings,” his patients are at first overjoyed by their freedom; then their new vitality becomes unbearable. As they continue taking L-dopa, many of them are consumed by insatiable desires. “L-DOPA is wanton, egotistical power,” Leonard says in the book. He injures his penis twice and tries to suffocate himself with a pillow. Another patient is so aroused and euphoric that she tells Sacks, “My blood is champagne”—the phrase Sacks used to describe himself when he was in love with Vincze. Sacks begins tapering his patients’ L-dopa, and taking some of them off of it completely. The book becomes a kind of drama about dosage: an examination of how much aliveness is tolerable, and at what cost. Some side effects of L-dopa, like involuntary movements and overactivity, have been well documented, but it’s hard not to wonder if “Awakenings” exaggerates the psychological fallout—Leonard becomes so unmanageable that the hospital moves him into a “punishment cell”—as if Sacks is reassuring himself that free rein of the libido cannot be sustained without grim consequence.

After “Awakenings,” Sacks intended his next book to be about his work with young people in a psychiatric ward at Bronx State Hospital who had been institutionalized since they were children. The environment reminded Sacks of a boarding school where he had been sent, between the ages of six and nine, during the Second World War. He was one of four hundred thousand children evacuated from London without their parents, and he felt abandoned. He was beaten by the headmaster and bullied by the other boys. The ward at Bronx State “exerted a sort of spell on me,” Sacks wrote in his journal, in 1974. “I lost my footing of proper sympathy and got sucked, so to speak, into an improper ‘perilous condition’ of identification to the patients.”

Shengold wrote several papers and books about a concept he called “soul murder”—a category of childhood trauma that induces “a hypnotic living-deadness, a state of existing ‘as if’ one were there.” Sacks planned to turn his work at Bronx State into a book about “ ‘SOUL MURDER’ and ‘SOUL SURVIVAL,’ ” he wrote. He was especially invested in two young men on the ward whom he thought he was curing. “The miracle-of-recovery started to occur in and through their relation to me (our relation and feelings to each other , of course),” he wrote in his journal. “We had to meet in a passionate subjectivity, a sort of collaboration or communication which transcended the Socratic relation of teacher-and-pupil.”

In a spontaneous creative burst lasting three weeks, Sacks wrote twenty-four essays about his work at Bronx State which he believed had the “beauty, the intensity, of Revelation . . . as if I was coming to know, once again, what I knew as a child, that sense of Dearness and Trust I had lost for so long.” But in the ward he sensed a “dreadful silent tension.” His colleagues didn’t understand the attention he was lavishing on his patients—he got a piano and a Ping-Pong table for them and took one patient to the botanical garden. Their suspicion, he wrote in his journal, “centred on the unbearability of my uncategorizability.” As a middle-aged man living alone—he had a huge beard and dressed eccentrically, sometimes wearing a black leather shirt—Sacks was particularly vulnerable to baseless innuendo. In April, 1974, he was fired. There had been rumors that he was molesting some of the boys.

That night, Sacks tore up his essays and then burned them. “Spite! Hate! Hateful spite!” he wrote in his journal shortly after. “And now I am empty—empty handed, empty hearted, desolate.”

The series of events was so distressing that even writing about it in his journal made Sacks feel that he was about to die. He knew that he should shrug off the false accusations as “vile idle gossip thrown by tiddlers and piddlers,” he wrote. But he couldn’t, because of “the parental accusation which I have borne—a Kafka-esque cross, guilt without crime, since my earliest days.”

The historian of medicine Henri Ellenberger observed that psychiatry owes its development to two intertwined dynamics: the neuroses of its founders—in trying to master their own conflicts, they came to new insights and forms of therapy—and the prolonged, ambiguous relationships they had with their patients. The case studies of these relationships, Ellenberger wrote, tended to have a distinct arc: psychiatrists had to unravel their patients’ “pathogenic secret,” a hidden source of hopelessness, in order to heal them.

Sacks’s early case studies also tended to revolve around secrets, but wonderful ones. Through his care, his patients realized that they had hidden gifts—for music, painting, writing—that could restore to them a sense of wholeness. The critic Anatole Broyard, recounting his cancer treatment in the Times Magazine in 1990, wrote that he longed for a charismatic, passionate physician, skilled in “empathetic witnessing.” In short, he wrote, a doctor who “would resemble Oliver Sacks.” He added, “He would see the genius of my illness.”

It speaks to the power of the fantasy of the magical healer that readers and publishers accepted Sacks’s stories as literal truth. In a letter to one of his three brothers, Marcus, Sacks enclosed a copy of “ The Man Who Mistook His Wife for a Hat ,” which was published in 1985, calling it a book of “fairy tales.” He explained that “these odd Narratives—half-report, half-imagined, half-science, half-fable, but with a fidelity of their own—are what I do, basically, to keep MY demons of boredom and loneliness and despair away.” He added that Marcus would likely call them “confabulations”—a phenomenon Sacks explores in a chapter about a patient who could retain memories for only a few seconds and must “ make meaning, in a desperate way, continually inventing, throwing bridges of meaning over abysses,” but the “bridges, the patches, for all their brilliance . . . cannot do service for reality.”

Sacks was startled by the success of the book, which he had dedicated to Shengold, “my own mentor and physician.” It became an international best-seller, routinely assigned in medical schools. Sacks wrote in his journal,

Guilt has been much greater since ‘Hat’ because of (among other things)

My lies,

falsification

He pondered the phrase “art is the lie that tells the truth,” often attributed to Picasso, but he seemed unconvinced. “I think I have to thrash this out with Shengold—it is killing me, soul-killing me,” he wrote. “My ‘cast of characters’ (for this is what they become) take on an almost Dickensian quality.”

Sacks once told a reporter that he hoped to be remembered as someone who “bore witness”—a term often used within medicine to describe the act of accompanying patients in their most vulnerable moments, rather than turning away. To bear witness is to recognize and respond to suffering that would otherwise go unseen. But perhaps bearing witness is incompatible with writing a story about it. In his journal, after a session with a patient with Tourette’s syndrome, Sacks describes the miracle of being “enabled to ‘feel’—that is, to imagine, with all the powers of my head and heart—how it felt to be another human being.” Empathy tends to be held up as a moral end point, as if it exists as its own little island of good work. And yet it is part of a longer transaction, and it is, fundamentally, a projection. A writer who imagines what it’s like to exist as another person must then translate that into his own idiom—a process that Sacks makes particularly literal.

“I’ll tell you what you are saying,” Sacks told a woman with an I.Q. of around 60 whose grandmother had just died. “You want to go down below and join your dead grandparents down in the Kingdom of Death.” In the conversation, which Sacks recorded, the patient becomes more expressive under the rare glow of her doctor’s sustained attention, and it’s clear that she is fond of him. But he is so excited about her words (“One feels that she is voicing universal symbols,” he says in a recording, “symbols which are infinite in meaning”) that he usurps her experience.

“I know, in a way, you don’t feel like living,” Sacks tells her, in another recorded session. “Part of one feels dead inside, I know, I know that. . . . One feels that one wants to die, one wants to end it, and what’s the use of going on?”

“I don’t mean it in that way,” she responds.

“I know, but you do, partly,” Sacks tells her. “I know you have been lonely all your life.”

The woman’s story is told, with details altered, in a chapter in “Hat” titled “Rebecca.” In the essay, Rebecca is transformed by grief for her grandmother. She reminds Sacks of Chekhov’s Nina, in “The Seagull,” who longs to be an actress. Though Nina’s life is painful and disappointing, at the end of the play her suffering gives her depth and strength. Rebecca, too, ends the story in full flower. “Rather suddenly, after her grandmother’s death,” Sacks writes, she becomes decisive, joining a theatre group and appearing to him as “a complete person, poised, fluent,” a “natural poet.” The case study is presented as an ode to the power of understanding a patient’s life as a narrative, not as a collection of symptoms. But in the transcripts of their conversations—at least the ones saved from the year that followed, as well as Sacks’s journals from that period—Rebecca never joins a theatre group or emerges from her despair. She complains that it’s “better that I shouldn’t have been born,” that she is “useless,” “good for nothing,” and Sacks vehemently tries to convince her that she’s not. Instead of bearing witness to her reality, he reshapes it so that she, too, awakens.

Some of the most prominent nonfiction writers of Sacks’s era ( Joseph Mitchell , A. J. Liebling , Ryszard Kapuściński ) also took liberties with the truth, believing that they had a higher purpose: to illuminate the human condition. Sacks was writing in that spirit, too, but in a discipline that depends on reproducible findings. The “most flagrant example” of his distortions, Sacks wrote in his journal, was in one of the last chapters of “Hat,” titled “The Twins,” about twenty-six-year-old twins with autism who had been institutionalized since they were seven. They spend their days reciting numbers, which they “savored, shared” while “closeted in their numerical communion.” Sacks lingers near them, jotting down the numbers, and eventually realizes that they are all prime. As a child, Sacks used to spend hours alone, trying to come up with a formula for prime numbers, but, he wrote, “I never found any Law or Pattern for them—and this gave me an intense feeling of Terror, Pleasure, and—Mystery.” Delighted by the twins’ pastime, Sacks comes to the ward with a book of prime numbers which he’d loved as a child. After offering his own prime number, “they drew apart slightly, making room for me, a new number playmate, a third in their world.” Having apparently uncovered the impossible algorithm that Sacks had once wished for, the twins continue sharing primes until they’re exchanging ones with twenty digits. The scene reads like a kind of dream: he has discovered that human intimacy has a decipherable structure, and identified a hidden pattern that will allow him to finally join in.

Before Sacks met them, the twins had been extensively studied because of their capacity to determine the day of the week on which any date in the calendar fell. In the sixties, two papers in the American Journal of Psychiatry provided detailed accounts of the extent of their abilities. Neither paper mentioned a gift for prime numbers or math. When Sacks wrote Alexander Luria, a Russian neuropsychologist, about his work with the twins, in 1973, he also did not mention any special mathematical skills. In 2007, a psychologist with a background in learning theory published a short article in the Journal of Autism and Developmental Disorders , challenging Sacks’s assertion that these twins could spontaneously generate large prime numbers. Because this is not something that humans can reliably do, Sacks’s finding had been widely cited, and was theoretically “important for not only psychologists but also for all scientists and mathematicians,” the psychologist wrote. (The psychologist had contacted Sacks to ask for the title of his childhood book of prime numbers, because he couldn’t find a book of that description, but Sacks said that it had been lost.) Without pointing to new evidence, another scientist wrote in Sacks’s defense, describing his case study as “the most compelling account of savant numerosity skills” and arguing, “This is an example of science at the frontier, requiring daring to advance new interpretations of partial data.”

After the publication of “Hat,” when Sacks was fifty-two years old, he wrote his friend Robert Rodman, a psychoanalyst, that “Shengold suggested, with some hesitancy, some months ago, that I should consider going deeper with him.” He added, “He also observes that I don’t complain, say, of sexual deprivation—though this is absolute.” At first, Sacks was worried that Shengold was preparing to dismiss him from treatment: “I’ve done all I can for you—now manage on your own!” Then he felt hopeful that he didn’t need to assume that “boredom-depression-loneliness-cutoffness” would define the rest of his life. He was also moved that, after twenty years, Shengold still considered him “worth extra work.”

But Sacks was shaken by the idea that they’d only been skimming the surface. He looked back through his notebooks and noticed “a perceptible decline in concern and passion,” which he felt had also dulled the quality of his thought. “Is the superficiality of my work, then, due to superficiality of relationships—to running away from whatever has deeper feeling and meaning?” he asked Rodman. “Is this perhaps spoken of, in a camouflaged way, when I describe the ‘superficialization’ of various patients?” As an example, he referenced an essay in “Hat” about a woman with a cerebral tumor. She was intelligent and amusing but seemed not to care about anyone. “Was this the ‘cover’ of some unbearable emotion?” he writes in the essay.

Sacks felt that Shengold was the reason he was still alive, and that he should go further with him. “What have I to lose?” he asked Rodman. But, he wrote, “what one has to lose, of course, may be just that quasi-stable if fragile ‘functioning’ . . . so there is reason to hesitate.” Going deeper would also mean more fully submitting to someone else’s interpretation, experiencing what he asked of his own patients; Rodman proposed that Sacks was “afraid of the enclosure of analysis, of being reduced and fixed with a formulated phrase.”

Sacks and his partner, Bill Hayes. Photograph courtesy Oliver Sacks Foundation

In the early eighties, Lawrence Weschler, then a writer for The New Yorker , began working on a biography of Sacks. Weschler came to feel that Sacks’s homosexuality was integral to his work, but Sacks didn’t want his sexuality mentioned at all, and eventually asked him to stop the project. “I have lived a life wrapped in concealment and wracked by inhibition, and I can’t see that changing now,” he told Weschler. In his journal, Sacks jotted down thoughts to share with Weschler on the subject: “My ‘sex life’ (or lack of it) is, in a sense irrelevant to the . . . sweep of my mind .” In another entry, he wrote that the Freudian term “sublimation” diminished the process he’d undergone. When he was still having sex, as a young man in California, he used to sheath his body in leather gear, so he was “totally encased, enclosed,” his real self sealed in a kind of “black box.” He wrote, “I have, in a sense , ‘outgrown’ these extraordinary, almost convulsive compulsions—but this detachment has been made possible by incorporating them into a vast and comprehending view of the world.” (Weschler became close friends with Sacks, and, after Sacks died, published a “biographical memoir” titled “And How Are You , Dr. Sacks?”)

It’s unclear whether Sacks did “go deeper” with Shengold. In the late eighties, Sacks wrote in his journal that he was “scared, horrified (but, in an awful way, accepting or complaisant) about my non-life.” He likened himself to a “pithed and gutted creature.” Rather than living, he was managing a kind of “homeostasis.”

In 1987, Sacks had an intense friendship with a psychiatrist named Jonathan Mueller, with whom he briefly fell in love. Mueller, who was married to a woman, told me that he did not realize Sacks had romantic feelings for him. Sacks eventually moved on. But he felt that the experience had altered him. “I can read ‘love stories’ with empathy and understanding—I can ‘ enter into them ’ in a way which was impossible before,” he wrote in his journal. He perceived, in a new light, what it meant for his patients in “Awakenings” to glimpse the possibility of “liberation”: like him, he wrote, they were seeking “not merely a cure but an indemnification for the loss of their lives.”

By the nineties, Sacks seemed to ask less of himself, emotionally, in relation to his patients. He had started working with Kate Edgar, who’d begun as his assistant but eventually edited his writing, organized his daily life, and became a close friend. (Shengold had encouraged Sacks to find someone to assist with his work. “The secretary is certainly an important ‘ego-auxiliary,’ ” he wrote him in a letter.) Edgar was wary about the way Sacks quoted his patients—they were suspiciously literary, she thought—and she checked to make sure he wasn’t getting carried away. She spent hours with some of his patients, and, she told me, “I never caught him in anything like that, which actually surprises me.”

Weschler told me that Sacks used to express anxiety about whether he’d distorted the truth. Weschler would assure him that good writing is not a strict account of reality; there has to be space for the writer’s imagination. He said he told Sacks, “Come on, you’re extravagantly romanticizing how bad you are—just as much as you were extravagantly romanticizing what the patient said. Your mother’s accusing voice has taken over.” Weschler had gone to Beth Abraham Hospital to meet some of the patients from “Awakenings” and had been shaken by their condition. “There’s a lot of people shitting in their pants, drooling—the sedimentation of thirty years living in a warehouse,” he said. “His genius was to see past that, to the dignity of the person. He would talk to them for an hour, and maybe their eyes would brighten only once—the rest of the time their eyes were cloudy—but he would glom onto that and keep talking.”

After “Hat,” Sacks’s relationship with his subjects became more mediated. Most of them were not his patients; many wrote to him after reading his work, recognizing themselves in his books. There was a different power dynamic, because these people already believed that they had stories to tell. Perhaps the guilt over liberties he had taken in “Hat” caused him to curb the impulse to exaggerate. His expressions of remorse over “making up, ‘enhancing,’ etc,” which had appeared in his journals throughout the seventies and eighties, stopped. In his case studies, he used fewer and shorter quotes. His patients were far more likely to say ordinary, banal things, and they rarely quoted literature. They still had secret gifts, but they weren’t redeemed by them; they were just trying to cope.

In “ An Anthropologist on Mars ,” from 1992, a book of case studies about people compensating for, and adapting to, neurological conditions, some of the richest passages are the ones in which Sacks allows his incomprehension to become part of the portrait. In a chapter called “Prodigies,” he wants badly to connect with a thirteen-year-old boy named Stephen, who is autistic and has an extraordinary ability to draw, but Stephen resists Sacks’s attempts at intimacy. He will not allow himself to be romanticized, a refusal that Sacks ultimately accepts: “Is Stephen, or his autism, changed by his art? Here, I think, the answer is no.” In this new mode, Sacks is less inclined to replace Stephen’s unknowable experience with his own fantasy of it. He is open about the discomfort, and even embarrassment, of his multiple failures to reach him: “I had hoped, perhaps sentimentally, for some depth of feeling from him; my heart had leapt at the first ‘Hullo, Oliver!’ but there had been no follow-up.”

Mort Doran, a surgeon with Tourette’s syndrome whom Sacks profiled in “Anthropologist,” told me that he was happy with the way Sacks had rendered his life. He said that only one detail was inaccurate—Sacks had written that the brick wall of Doran’s kitchen was marked from Doran hitting it during Tourette’s episodes. “I thought, Why would he embellish that? And then I thought, Maybe that’s just what writers do.” Doran never mentioned the error to Sacks. He was grateful that Sacks “had the gravitas to put it out there to the rest of the world and say, ‘These people aren’t all nuts or deluded. They’re real people.’ ”

The wife in the title story of “Hat” had privately disagreed with Sacks about the portrayal of her husband, but for the most part Sacks appeared to have had remarkable relationships with his patients, corresponding with them for years. A patient called Ray, the subject of a 1981 piece about Tourette’s syndrome, told me that Sacks came to his son’s wedding years after his formal treatment had ended. Recalling Sacks’s death, he found himself suddenly crying. “Part of me left,” he said. “Part of my self was gone.”

A year after “Awakenings” was published, Sacks broke his leg in Norway, and Leonard L. and his mother wrote him a get-well letter. Thirty-two patients added their names, their signatures wavering. “Everybody had been counting the days for your return, so you can imagine the turmoil when they heard the news,” Leonard’s mother wrote. She explained that “most of the patients are not doing so well without your help and interest.” She added that Leonard “isn’t doing too well either.” When Leonard learned that Sacks wouldn’t be back, she said, “he shed enough tears to fill a bucket.”

Sacks spoke of “animating” his patients, as if lending them some of his narrative energy. After living in the forgotten wards of hospitals, in a kind of narrative void, perhaps his patients felt that some inaccuracies were part of the exchange. Or maybe they thought, That’s just what writers do. Sacks established empathy as a quality every good doctor should possess, enshrining the ideal through his stories. But his case studies, and the genre they helped inspire, were never clear about what they exposed: the ease with which empathy can slide into something too creative, or invasive, or possessive. Therapists—and writers—inevitably see their subjects through the lens of their own lives, in ways that can be both generative and misleading.

In his journal, reflecting on his work with Tourette’s patients, Sacks described his desire to help their illness “reach fruition,” so that they would become floridly symptomatic. “With my help and almost my collusion, they can extract the maximum possible from their sickness—maximum of knowledge, insight, courage,” he wrote. “Thus I will FIRST help them to get ill, to experience their illness with maximum intensity; and then, only then , will I help them get well!” On the next line, he wrote, “IS THIS MONSTROUS?” The practice came from a sense of awe, not opportunism, but he recognized that it made him complicit, as if their illness had become a collaboration. “An impulse both neurotic and intellectual (artistic) makes me get the most out of suffering ,” he wrote. His approach set the template for a branch of writing and thinking that made it seem as if the natural arc of illness involved insight and revelation, and even some poetry, too.

In his journals, Sacks repeatedly complained that his life story was over. He had the “feeling that I have stopped doing, that doing has stopped, that life itself has stopped, that it is petering out in a sort of twilight of half-being,” he wrote, in 1987. His journals convey a sense of tangible boredom. He transcribed long passages from philosophers and theologists (Simone Weil, Søren Kierkegaard, Gottfried Wilhelm Leibniz, Dietrich Bonhoeffer) and embarked on disquisitions on the best definition of reality, the “metabolism of grace,” the “deep mystery of incubation.” His thoughts cast outward in many directions—notes for a thousand lectures—then tunnelled inward to the point of non-meaning. “Where Life is Free, Immaterial, full of Art,” he wrote, “the laws of life, of Grace, are those of Fitness .”

Sacks proposed various theories for why he had undergone what he called “psychic death.” He wondered if he had become too popular, merely a fuzzy symbol of compassionate care. “Good old Sacks—the House Humanist,” he wrote, mocking himself. He also considered the idea that his four decades of analysis were to blame. Was it possible, he wrote, that a “vivisection of inner life, however conceived, however subtle and delicate, may in fact destroy the very thing it examines?” His treatment with Shengold seemed to align with a life of “homeostasis”—intimacy managed through more and more language, in a contained, sterile setting, on Monday and Wednesday mornings, from 6:00 to 6:45 A . M . They still referred to each other as “Dr. Sacks” and “Dr. Shengold.” Once, they ran into each other at a chamber concert. They were a few rows apart, but they didn’t interact. Occasionally, Shengold told his children that he “heard from the couch” about a good movie or play, but he never shared what happened in his sessions. They inferred that Sacks was their father’s patient after reading the dedication to him in “Hat.”

As Sacks aged, he felt as if he were gazing at people from the outside. But he also noticed a new kind of affection for humans—“homo sap.” “They’re quite complex (little) creatures (I say to myself),” he wrote in his journal. “They suffer, authentically, a good deal. Gifted, too. Brave, resourceful, challenging.”

Perhaps because love no longer appeared to be a realistic risk—he had now entered a “geriatric situation”—Sacks could finally confess that he craved it. “I keep being stabbed by love,” he wrote in his journal. “A look. A glance. An expression. A posture.” He guessed that he had at least five, possibly ten, more years to live. “I want to, I want to ••• I dare not say. At least not in writing.”

In 2008, Sacks had lunch with Bill Hayes, a forty-seven-year-old writer from San Francisco who was visiting New York. Hayes had never considered Sacks’s sexuality, but, as soon as they began talking, he thought, “Oh, my God, he’s gay,” he told me. They lingered at the table for much of the afternoon, connecting over their insomnia, among other subjects. After the meal, Sacks wrote Hayes a letter (which he never sent) explaining that relationships had been “a ‘forbidden’ area for me—although I am entirely sympathetic to (indeed wistful and perhaps envious about) other people’s relationships.”

A year later, Hayes, whose partner of seventeen years had died of a heart attack, moved to New York. He and Sacks began spending time together. At Sacks’s recommendation, Hayes started keeping a journal, too. He often wrote down his exchanges with Sacks, some of which he later published in a memoir, “Insomniac City.”

“It’s really a question of mutuality, isn’t it?” Sacks asked him, two weeks after they had declared their feelings for each other.

“Love?” Hayes responded. “Are you talking about love?”

“Yes,” Sacks replied.

Sacks began taking Hayes to dinner parties, although he introduced him as “my friend Billy.” He did not allow physical affection in public. “Sometimes this issue of not being out became very difficult,” Hayes told me. “We’d have arguments, and I’d say things like ‘Do you and Shengold ever talk about why you can’t come out? Or is all you ever talk about your dreams?’ ” Sacks wrote down stray phrases from his dreams on a whiteboard in his kitchen so that he could report on them at his sessions, but he didn’t share what happened in therapy.

Kate Edgar, who worked for Sacks for three decades, had two brothers who were gay, and for years she had advocated for gay civil rights, organizing Pride marches for her son’s school. She intentionally found an office for Sacks in the West Village so that he would be surrounded by gay men living openly and could see how normal it had become. She tended to hire gay assistants for him, for the same reason. “So I was sort of plotting on that level for some years,” she told me.

In 2013, after being in a relationship with Hayes for four years—they lived in separate apartments in the same building—Sacks began writing a memoir, “On the Move,” in which he divulged his sexuality for the first time. He recounts his mother’s curses upon learning that he was gay, and his decades of celibacy—a fact he mentions casually, without explanation. Edgar wondered why, after so many years of analysis, coming out took him so long, but, she said, “Oliver did not regard his relationship with Shengold as a failure of therapy.” She said that she’d guessed Shengold had thought, “This is something Oliver has to do in his own way, on his own time.” Shengold’s daughter, Nina, said that, “for my dad to have a patient he loved and respected finally find comfort in identifying who he’d been all his life—that’s growth for both of them.”

A few weeks after finishing the manuscript, Sacks, who’d had melanoma of the eye in 2005, learned that the cancer had come back, spreading to his liver, and that he had only months to live. He had tended toward hypochondria all his life, and Edgar thought that the diagnosis might induce a state of chronic panic. Since he was a child, Sacks had had a horror of losing things, even irrelevant objects. He would be overcome by the “feeling that there was a hole in the world ,” he wrote in his journal, and the fear that “I might somehow fall through that hole-in-the-world, and be absolutely, inconceivably lost.” Edgar had dealt for decades with his distress over lost objects, but she noticed that now, when he misplaced things, he didn’t get upset. He had an uncharacteristic ease of being.

In the summer of 2015, before Shengold went on his annual summer break, Sacks said to Edgar, “If I’m alive in September when Shengold returns, I’m not sure I need to go back to my sessions.” They had been seeing each other for forty-nine years. Sacks was eighty-two; Shengold was eighty-nine.

When Sacks was struggling with his third book, “ A Leg to Stand On ,” which was about breaking his leg and his frustration that his doctors wouldn’t listen to him, he wrote in his journal that Shengold had suggested (while apologizing for the corniness of the phrase) that the book should be “a message of love”—a form of protest against the indifference that so many patients find in their doctors. Shengold may have been giving Sacks permission to see their own relationship—the one place in which Sacks felt an enduring sense of recognition and care—as a hidden subject of the book. Extending Shengold’s idea, Sacks wrote, of his book, “The ‘moral’ center has to do with . . . the irreducible ultimate in doctor-patient relations.”

In August, two weeks before Sacks died, he and Shengold spoke on the phone. Shengold was with his family at a cottage in the Finger Lakes region of central New York, where he spent every summer. Nina told me, “We all gathered in the living room of that little cottage and put my father on speakerphone. Oliver Sacks was clearly on his deathbed—he was not able to articulate very well. Sometimes his diction was just gone. Dad kept shaking his head. He said, ‘I can’t understand you. I’m so sorry, I can’t understand you.’ ” At the end of the call, Shengold told Sacks, “It’s been the honor of my life to work with you,” and said, “Goodbye, Oliver.” Sacks responded, “Goodbye, Leonard.” It was the first time they had ever used each other’s first names. When they hung up, Shengold was crying.

After Sacks died, Shengold started closing down his practice. “It was the beginning of the end for him,” his son David told me. “He had lost most of his colleagues. He was really the last of his generation.” Nina said, “I do think part of why my father lived so long and was able to work so long was because of that relationship. That feeling of affection and kindred spirit was lifesaving.”

In “Awakenings,” when describing how Leonard L.—his “ ‘ideal’ patient”—initially responded to L-dopa, Sacks characterizes him as “a man released from entombment” whose “predominant feelings at this time were feelings of freedom, openness, and exchange with the world.” He quotes Leonard saying, “I have been hungry and yearning all my life . . . and now I am full.” He also says, “I feel saved. . . . I feel like a man in love. I have broken through the barriers which cut me off from love.’ ”

For years, Sacks had tested the possibility of awakenings in others, as if rehearsing, or outsourcing, the cure he had longed to achieve with Shengold. But at the end of his life, like an inside-out case study, he inhabited the story he’d imagined for his patients. “All of us entertain the idea of another sort of medicine . . . which will restore us to our lost health and wholeness,” he wrote, in “Awakenings.” “We spend our lives searching for what we have lost; and one day, perhaps, we will suddenly find it.” ♦

A Lisp Interpreter Implemented in Conway's Game of Life (2021)

Hacker News
woodrush.github.io
2025-12-13 03:06:53
Comments...
Original Article

A screenshot of the Lisp in Life architecture.

Lisp in Life is a Lisp interpreter implemented in Conway’s Game of Life.

The entire pattern is viewable on the browser here .

To the best of my knowledge, this is the first time a high-level programming language was interpreted in Conway’s Game of Life.

Running Lisp on the Game of Life

Lisp is a language with a simple and elegant design, with an extensive ability to express sophisticated ideas as simple programs. Notably, its powerful macro feature can be used to modify the language's syntax and write programs in a highly flexible way. For example, macros can be used to introduce new programming paradigms to the language, as demonstrated in object-oriented-like.lisp (which can actually be evaluated by the interpreter, although complex programs take quite a long time to finish running), where a structure and syntax similar to classes in Object-Oriented Programming is constructed. Despite this expressiveness, Lisp is the world's second-oldest high-level programming language: it was introduced in 1958, preceded only by Fortran.

Conway's Game of Life is a cellular automaton proposed in 1970. Despite having a very simple set of rules, it is known to be Turing-complete. Lisp in Life demonstrates this fact in a rather straightforward way.

How can simple systems allow human thought to be articulated and expanded? With the expressiveness of Lisp running on the substrate of Conway's Game of Life, Lisp in Life provides an answer to this question.

Input and Output

The Lisp program is provided by editing certain cells within the pattern so that they represent the ASCII encoding of the program's source text. The pattern directly reads this text and evaluates it. You can also load your own Lisp program into the pattern and run it. The standard output is written at the bottom end of the RAM module, which can be easily located and directly examined in a Game of Life viewer. The Lisp implementation supports lexical closures and macros, allowing one to write Lisp programs in a Lisp-like style, as far as the memory limit allows.

The Lisp interpreter is written in C. Using the build system for this project, you can also compile your own C11-compatible C code and run it on Conway's Game of Life.

Previous Work

As previously mentioned, to the best of my knowledge, this is the first time a high-level programming language was interpreted in Conway’s Game of Life.

The entry featuring Universal Computers in LifeWiki has a list of computers created in the Game of Life. Two important instances not mentioned in that entry are the computer built for the Quest For Tetris (QFT) project, and APGsembly, created by Adam P. Goucher. All of these works are designed to run an assembly language, not to interpret a high-level language per se.

An example of a compiled high-level language targeting the Game of Life is Cogol, from the QFT project. Cogol is compiled to the assembly language QFTASM, which targets the QFT architecture; in other words, Cogol code must first be compiled to QFTASM before it can run on the QFT architecture.

In Lisp in Life, a modified version of the QFT architecture is first created to improve the pattern's runtime. Modifications include introducing a new cascaded storage architecture for the ROM, adding new opcodes, extending the ROM and RAM address space, etc. The Lisp source code is then written into the computer's RAM module in its raw binary ASCII format. The Conway's Game of Life pattern directly reads, parses, and evaluates this Lisp source code to produce its output. Allowing a Conway's Game of Life pattern to evaluate a high-level programming language expressed as a string of text is a novel capability first achieved in this project.

Video

Here is a YouTube video showing Lisp in Life in action:

YouTube video of Lisp in Life.

Screenshots

An overview of the entire architecture.

An overview of the CPU and its surrounding modules. On the top are the ROM modules, with the lookup module on the right, and the value modules on the left. On the bottom left is the CPU. On the bottom right is the RAM module.

This pattern is the VarLife version of the architecture. VarLife is an 8-state cellular automaton defined in the Quest For Tetris (QFT) Project, which is used as an intermediate layer to create the final Conway’s Game of Life pattern. The colors of the cells indicate the 8 distinct states of the VarLife rule.

The architecture is based on Tetris8.mc in the original QFT repository. Various modifications were made to make the pattern compact, such as introducing a new lookup table architecture for the ROM, removing unused opcodes and adding new ones, expanding the ROM and RAM address space, etc.

The Conway’s Game of Life version of the architecture, converted from the VarLife pattern. What appears to be a single cell in this image is actually an OTCA metapixel zoomed away to be shown 2048 times smaller.

A close-up view of a part of the ROM module in the Conway’s Game of Life version. Each pixel in the previous image is actually this square-shaped structure shown in this image. These structures are OTCA metapixels , which can be seen to be in the On and Off meta-states in this image. The OTCA Metapixel is a special Conway’s Game of Life pattern that can emulate cellular automatons with customized rules. The original VarLife pattern is simulated this way so that it can run in Conway’s Game of Life.

The OTCA Metapixel simulating Life in Life can be seen in this wonderful video by Phillip Bradbury: https://www.youtube.com/watch?v=xP5-iIeKXE8

A video of the RAM module in the VarLife rule in action.

The computer showing the results of the following Lisp program:

(define mult (lambda (m n)
  (* m n)))

(print (mult 3 14))

The result is 42, shown in binary ASCII format (0b110100 and 0b110010, the ASCII codes of the characters '4' and '2'), read in bottom-to-top order.

As shown in this image, the standard output of the Lisp program gets written at the bottom end of the RAM module, and can be directly viewed in a Game of Life viewer. This repository also contains scripts that run on Golly to decode and view the contents of the output as strings.

How is it Done?

The build flow of Lisp in Life.

The Lisp interpreter , written in C, is compiled to an assembly language for a CPU architecture implemented in the Game of Life, which is a modification of the computer used in the Quest For Tetris (QFT) project. The compilation is done using an extended version of ELVM (the Esoteric Language Virtual Machine). The Game of Life backend for ELVM was implemented by myself.

Generating a small enough pattern that runs in a reasonable amount of time required a lot of effort. This required optimizations and improvements in every layer of the project; a brief summary would be:

  • The C Compiler layer - adding the computed goto feature to the C compiler, preserving variable symbols to be used after compilation, etc.
  • The C layer (the Lisp interpreter ) - using a string hashtable and binary search for Lisp symbol lookup, minimization of stack region usage with union memory structures, careful memory region map design, etc.
  • The QFTASM layer - writing a compiler optimizer to optimize the length of the assembly code
  • The VarLife layer (the CPU architecture) - creating a lookup table architecture for faster ROM access, expanding the size and length of the RAM module, adding new opcodes, etc.
  • The Game of Life layer - Hashlife -specific optimization

A more detailed description of the optimizations done in this project is available in the Implementation Details section.

Conversion from VarLife to Conway’s Game of Life

VarLife is an 8-state cellular automaton defined in the Quest For Tetris (QFT) Project. It is used as an intermediate layer to generate the final Conway’s Game of Life pattern; the computer is first created in VarLife, and then converted to a Game of Life pattern.

When converting VarLife to Conway’s Game of Life, each VarLife cell is mapped to an OTCA Metapixel (OTCAMP). The conversion from VarLife to the Game of Life is done in a way so that the behavior of the states of the VarLife pattern matches exactly with the meta-states of the OTCA Metapixels in the converted Game of Life pattern. Therefore, it is enough to verify the behavior of the VarLife pattern to verify the behavior of the Game of Life pattern.

Due to the use of OTCA Metapixels, each VarLife cell becomes a 2048x2048 block of Game of Life cells, and 1 VarLife generation requires 35,328 Game of Life generations. Therefore, the VarLife patterns run significantly faster than the Game of Life (GoL) versions.
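
As a concrete check of these ratios, print.lisp halts after 105,413,068 VarLife generations, which corresponds to 105,413,068 x 35,328 = 3,724,032,866,304 Game of Life generations, exactly the halting generation count listed for its Game of Life pattern in the statistics below.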

Additional details on VarLife are available in the Miscellaneous section.

Pattern Files

Program VarLife Pattern Conway’s Game of Life Pattern
print.lisp QFT_print.mc QFT_print_metafied.mc
lambda.lisp QFT_lambda.mc QFT_lambda_metafied.mc
printquote.lisp QFT_printquote.mc QFT_printquote_metafied.mc
factorial.lisp QFT_factorial.mc QFT_factorial_metafied.mc
z-combinator.lisp QFT_z-combinator.mc QFT_z-combinator_metafied.mc
backquote-splice.lisp QFT_backquote-splice.mc QFT_backquote-splice_metafied.mc
backquote.lisp QFT_backquote.mc QFT_backquote_metafied.mc
object-oriented-like.lisp QFT_object-oriented-like.mc QFT_object-oriented-like_metafied.mc
primes-print.lisp QFT_primes-print.mc QFT_primes-print_metafied.mc
primes.lisp QFT_primes.mc QFT_primes_metafied.mc

Pattern files preloaded with various Lisp programs are available here. Detailed statistics such as the running time and the memory consumption are available in the Running Times and Statistics section.

The patterns can be simulated on the Game of Life simulator Golly .

The VarLife patterns can be simulated on Golly as well. To run the VarLife patterns, open Golly, go to File -> Preferences -> Control, and check the “Your Rules” directory. Open that directory, and copy https://github.com/woodrush/QFT-devkit/blob/main/QFT-devkit/Varlife.rule into it.

Descriptions of the Lisp Programs

  • object-oriented-like.lisp : This example creates a structure similar to classes in Object-Oriented Programming, using closures.

    • The class has methods and field variables, and each instance carries distinct and persistent memory locations of its own. The example instantiates two counters and concurrently modifies the value held by each instance.
    • New syntaxes for instantiation and method access, (new classname) and (. instance methodname) , are introduced using macros and functions.

    The Lisp interpreter's variable scoping and macro features are powerful enough to support this kind of complex state management, and even to provide new syntax for the target paradigm.

  • printquote.lisp : A simple demonstration of macros.

  • factorial.lisp : A simple demonstration of recursion with the factorial function.

  • z-combinator.lisp : Demonstration of the Z Combinator to implement a factorial function using anonymous recursion .

  • backquote-splice.lisp : Implements the backquote macro used commonly in Lisp to construct macros. It also supports the unquote and unquote-splice operations, each written as ~ and ~@ .

  • primes.lisp : Prints a list of prime numbers up to 20. This example highlights the use of the while syntax.

The contents of print.lisp are quite straightforward: it calculates and prints the result of 3 * 14. backquote.lisp and primes-print.lisp are similar to backquote-splice.lisp and primes.lisp, and are mainly included for performance comparisons. backquote.lisp doesn't implement the unquote-splice operation, and demonstrates some more examples. primes-print.lisp reduces the number of list operations to save memory.

Details of the Lisp Interpreter

Special Forms and Builtin Functions

  • define
  • if
  • quote
  • car, cdr
  • cons
  • list
  • atom
  • print
  • progn
  • while
  • lambda, macro
  • eval
  • eq
  • +, -, *, /, mod, <, >

Lexical Closures

This Lisp interpreter supports lexical closures. The implementation of lexical closures is powerful enough to write an object-oriented-like code as shown in object-oriented-like.lisp , where classes are represented as lexical closures over the field variables and the class methods.

Macros

This Lisp interpreter supports macros. A Lisp macro can be thought of as a function that receives code and returns code. Following this design, macros are treated exactly the same as lambdas, except that a macro takes its arguments as raw S-expressions and evaluates its result twice (the first time to build the expression, and the second time to actually evaluate the built expression).

Running Times and Statistics

VarLife Patterns

Lisp Program and Pattern (VarLife) #Halting Generations (VarLife) Running Time (VarLife) Memory Usage (VarLife)
print.lisp [ pattern ] 105,413,068 (exact) 1.159 mins 5.0 GiB
lambda.lisp [ pattern ] 700,000,000 2.966 mins 12.5 GiB
printquote.lisp [ pattern ] 800,000,000 3.424 mins 12.5 GiB
factorial.lisp [ pattern ] 1,000,000,000 5.200 mins 17.9 GiB
z-combinator.lisp [ pattern ] 1,700,000,000 9.823 mins 23.4 GiB
backquote-splice.lisp [ pattern ] 4,100,000,000 20.467 mins 27.5 GiB (max.)
backquote.lisp [ pattern ] 4,100,000,000 21.663 mins 27.5 GiB (max.)
object-oriented-like.lisp [ pattern ] 4,673,000,000 22.363 mins 27.5 GiB (max.)
primes-print.lisp [ pattern ] 8,880,000,000 27.543 mins 27.5 GiB (max.)
primes.lisp [ pattern ] 9,607,100,000 38.334 mins 27.5 GiB (max.)

Conway’s Game of Life (GoL) Patterns

Lisp Program and Pattern (GoL) #Halting Generations (GoL) Running Time (GoL) Memory Usage (GoL)
print.lisp [ pattern ] 3,724,032,866,304 382.415 mins 27.5 GiB (max.)
lambda.lisp [ pattern ] 24,729,600,000,000 1372.985 mins 27.5 GiB (max.)
printquote.lisp [ pattern ] 28,262,400,000,000 1938.455 mins 27.5 GiB (max.)
factorial.lisp [ pattern ] 35,328,000,000,000 3395.371 mins 27.5 GiB (max.)
z-combinator.lisp [ pattern ] 60,057,600,000,000 - -
backquote-splice.lisp [ pattern ] 144,844,800,000,000 - -
backquote.lisp [ pattern ] 144,844,800,000,000 - -
object-oriented-like.lisp [ pattern ] 165,087,744,000,000 - -
primes-print.lisp [ pattern ] 313,712,640,000,000 - -
primes.lisp [ pattern ] 339,399,628,800,000 - -

Common Statistics

Lisp Program #QFT CPU Cycles QFT RAM Usage (Words)
print.lisp 4,425 92
lambda.lisp 13,814 227
printquote.lisp 18,730 271
factorial.lisp 28,623 371
z-combinator.lisp 58,883 544
backquote-splice.lisp 142,353 869
backquote.lisp 142,742 876
object-oriented-like.lisp 161,843 838
primes-print.lisp 281,883 527
primes.lisp 304,964 943

The running times for each program are shown above. The Hashlife algorithm used for the simulation requires a lot of memory in exchange for its speedup. The simulations were run on a 32GB-RAM computer, with Golly's memory usage limit set to 28000 MB, and the default base step set to 2 (configurable from the preferences). The memory usage was measured by Ubuntu's activity monitor. “(max.)” marks the runs where the maximum permitted memory was used. The number of CPU cycles and the QFT memory usage were obtained by running the QFTASM interpreter on the host PC. The QFT memory usage shows the number of RAM addresses that were written at least once, measured in words, which are 16 bits each in this architecture.

All of the VarLife patterns can actually be run on a computer. The shortest running time is about 1 minute for print.lisp . A sophisticated program such as object-oriented-like.lisp can even run in about 22 minutes.

On the other hand, the Game of Life patterns take significantly more time than the VarLife patterns, but for short programs it can be run in a moderately reasonable amount of time. For example, print.lisp finishes running in about 6 hours in the Game of Life pattern. As mentioned in the “Conversion from VarLife to Conway’s Game of Life” section, since the Game of Life pattern emulates the behavior of the VarLife pattern using OTCA Metapixels, the behavior of the Game of Life patterns can be verified by running the VarLife patterns.

Tests

There are tests to check the behavior of the Lisp interpreter: one checks the QFTASM-compiled Lisp interpreter using the QFTASM interpreter, and another checks the GCC-compiled Lisp interpreter on the host PC. To run these tests, use the following commands:

git submodule update --init --recursive # Required for building the source

make test             # Run the tests for the QFTASM-compiled Lisp interpreter, using the QFTASM interpreter
make test_executable  # Run the tests for the executable compiled by GCC

Running make test requires Hy , a Clojure-like Lisp implemented in Python available via pip install hy . Some of the tests compare the output results of Hy and the output of the QFTASM Lisp interpreter.

The tests were run on Ubuntu and Mac.

Building from Source

This section explains how to load the Lisp interpreter (written in C) to the Game of Life pattern, and also how to load a custom Lisp program into the pattern to run it on Game of Life.

Please see build.md from the GitHub repository.

Implementation Details

This section describes the implementation details for the various optimizations for the QFT assembly and the resulting Game of Life pattern.

The C Compiler layer

  • Added the computed goto feature to ELVM
    • This was merged into the original ELVM project.
  • Modified the compiler to preserve and output memory address symbols and program address symbols, for their usage in the compiler optimization tool in the QFTASM layer
    • This allows the use of memheader.eir, so that symbols used in the C source can be referenced in the ELVM assembly layer using the same variable symbols.

The ELVM Assembly layer

  • Wrote the QFTASM backend for ELVM
    • This was merged into the original ELVM project.
  • Added further improvements to the QFTASM backend:
    • Let the ELVM assembly’s memory address space match QFT’s native memory address space
      • Originally, the ELVM assembly had to convert its memory address every time a memory access occurred.
    • Support new opcodes added in the improved QFT architecture

The C layer (the implementation of the Lisp interpreter)

Usage of binary search and hashtables for string representations and comparisons

Profiling the GCC-compiled version of the Lisp interpreter showed that the string table lookup process was a major performance bottleneck, which left a large room for optimization.

The optimized string lookup process is as follows. First, when the Lisp parser accepts a symbol token, it computes a 4-bit hash of the string from the checksum of its ASCII representation. The hash indexes a hashtable that holds the root of a binary search tree used for string comparison. Each node in the tree holds a symbol token's string and two child nodes, for tokens that come before and after it in alphabetical order. When a query symbol token arrives in the parsing phase, the node with the matching token is returned, or a new node for the token is added to the tree if it does not exist yet. This ensures that each distinct symbol in the S-expression has a distinct memory address.

In the interpretation phase, since each distinct symbol has a distinct memory address, and every string required for the Lisp program has already been parsed, string comparison can be done by simply comparing the memory addresses of the tokens. Since the interpreter only ever needs string equality (and never ordering), checking for integer equality suffices, speeding up the interpretation phase. And since the hash key is 4 bits long, the symbols are spread over 16 binary trees, saving about 4 comparisons per lookup compared to using a single binary tree.
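
As a rough illustration, here is a minimal, self-contained C sketch of such an interning scheme. All names (SymNode, hash4, intern) are hypothetical, and the actual interpreter's code is organized differently:

#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: a 4-bit checksum hash selects one of 16 binary search
 * trees, and every distinct symbol ends up as exactly one node, so the
 * evaluator can later compare symbols by comparing node pointers. */
typedef struct SymNode {
    char *name;
    struct SymNode *before, *after;  /* alphabetically smaller / larger symbols */
} SymNode;

static SymNode *roots[16];           /* hashtable holding the 16 tree roots */

static unsigned hash4(const char *s) {
    unsigned sum = 0;
    while (*s)
        sum += (unsigned char)*s++;  /* checksum of the ASCII representation */
    return sum & 0xF;                /* keep 4 bits */
}

SymNode *intern(const char *name) {
    SymNode **slot = &roots[hash4(name)];
    while (*slot) {
        int cmp = strcmp(name, (*slot)->name);
        if (cmp == 0)
            return *slot;            /* already interned: reuse this node */
        slot = (cmp < 0) ? &(*slot)->before : &(*slot)->after;
    }
    *slot = calloc(1, sizeof(SymNode));      /* new symbol: create its node */
    (*slot)->name = malloc(strlen(name) + 1);
    strcpy((*slot)->name, name);
    return *slot;
}

With this setup, two symbol tokens denote the same symbol exactly when intern returns the same pointer for both, which is the integer comparison the interpretation phase relies on.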

Usage of jump hash tables for the special form evaluation procedure searches

There are 17 distinct procedures for evaluating the special forms in the Lisp interpreter: define, if, quote, car, cdr, cons, atom, print, progn, while, { lambda, macro }, eval, eq, { +, -, *, /, mod }, { <, > }, list, and lambda/macro invocations (when the token is not a special form). Using a chain of if statements to find the corresponding procedure for a given token would amount to a linear search over token comparisons. To speed up this search, a hash table is created for jumping to the corresponding procedures. Since the memory addresses for the special forms can be determined before parsing the Lisp program, all of the symbols for the special forms have fixed memory addresses. Therefore, the hash key can be created by subtracting an offset from the symbol's memory address, so that it points into a hashtable placed near the register locations. This hashtable is provided in memheader.eir. When the hash key falls outside the range of this hashtable, the symbol is not a special form, so the evaluation jumps to the lambda/macro invocation procedure.
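
A hedged C sketch of this kind of dispatch is shown below, using the computed-goto extension mentioned in the C compiler layer; the label names, the base-index scheme, and the table contents are illustrative assumptions rather than the interpreter's actual code:

/* Hypothetical sketch: special-form symbols are interned at known, consecutive
 * indices, so (symbol index - base) selects an entry in a jump table, and an
 * out-of-range key falls through to the lambda/macro invocation path.
 * Computed goto (the &&label / goto * syntax) is a GCC/Clang extension. */
void eval_form(long sym_index, long base) {
    static void *jump_table[] = { &&do_define, &&do_if, &&do_quote };  /* ... */
    long key = sym_index - base;
    if (key < 0 || key >= (long)(sizeof jump_table / sizeof jump_table[0]))
        goto do_apply;                       /* not a special form */
    goto *jump_table[key];

do_define: /* evaluate a define form  */ return;
do_if:     /* evaluate an if form     */ return;
do_quote:  /* evaluate a quote form   */ return;
do_apply:  /* lambda/macro invocation */ return;
}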

The Lisp implementation has 3 distinct value types: ATOM, INT, and LAMBDA. Each value consumes only one QFT byte (16 bits) of memory; an ATOM value holds the pointer to the symbol's string hashtable entry, an INT value holds the signed integer value, and a LAMBDA value holds a pointer to the Lambda struct, as well as its subtype information: LAMBDA, MACRO, TEMPLAMBDA, or TEMPMACRO. (The TEMPLAMBDA and TEMPMACRO subtypes are lambda and macro types that recycle their argument value memory space every time they are called, but they are unused in the final Lisp programs.) Since the RAM's address space is only 10 bits, there are 6 free bits that can be used in addresses holding pointers. Therefore, the value type and subtype information is held in these free bits. This makes integers in the Lisp implementation 14-bit signed integers, ranging from -8192 to 8191.
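
The exact bit layout is not spelled out above, but the idea can be sketched in C roughly as follows; the 2-bit type field, the helper names, and the field positions are assumptions made for illustration:

#include <stdint.h>

/* Hypothetical sketch: one 16-bit QFT word per value.  A 10-bit RAM address
 * fits comfortably in the low bits, so the top bits are free to carry the
 * type tag; integers use the low 14 bits as a signed quantity. */
typedef uint16_t Value;

enum { T_ATOM = 0, T_INT = 1, T_LAMBDA = 2 };

static Value make_int(int n) {
    return (Value)((T_INT << 14) | ((unsigned)n & 0x3FFF));  /* -8192..8191 */
}

static Value make_ref(int type, unsigned addr) {
    return (Value)((type << 14) | (addr & 0x3FFF));          /* tagged pointer */
}

static int value_type(Value v) {
    return v >> 14;
}

static int int_value(Value v) {
    int n = v & 0x3FFF;
    return (n & 0x2000) ? n - 0x4000 : n;                    /* sign-extend 14 bits */
}

Packing the tag into the otherwise unused address bits is what keeps every Lisp value within a single RAM word.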

Minimization of Stack Region Usage

Since the C compiler used in this project does not have memory optimization features, this has to be done manually within the C source code, which is the main reason the interpreter's source code looks somewhat obfuscated.

One of the largest bottlenecks for memory access was stack region usage. Every time a stack region memory access occurs, the assembly code performs memory address offset operations to access the stack region. This does not happen when accessing the heap memory, since there is only one heap region used in the entire program, so the pointers for global variables can be hard-coded by the assembler. Therefore, it is favorable optimization-wise to use the heap memory as much as possible.

One way to make use of this fact is to use as many global variables as possible. Since registers and common RAM memory share the same memory space, global variables can be accessed at a speed comparable to registers. (However, since the physical location of a RAM memory slot within the pattern affects the I/O signal arrival time, and the registers have the smallest RAM addresses, i.e. they are the closest to the CPU unit, the registers still have the fastest memory access times.)

Another method of saving memory was to use union memory structures to minimize the stack region usage. In the C compiler used in this project, every time a new variable is introduced in a function, the function's stack region usage (used per call) is increased to fit all of the variables. This happens even when two variables never appear at the same time. Therefore, using the fact that some variables never appear simultaneously, unions are used for every occurrence of such variables, so that they can share a region within the stack space. This minimized the stack region usage. Since the stack region is only 233 hextets (1 byte in the QFT RAM is 16 bits) large, this allowed for more deeply nested function calls, especially the nested calls of eval which evaluates the S-expressions. Since the S-expressions have a list structure, and eval becomes nested when lambdas are called in the Lisp program, this optimization was significant for allowing more sophisticated Lisp programs to be run in the architecture.
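
A minimal sketch of the trick, with hypothetical variable names (the real interpreter applies it throughout functions such as eval):

/* Hypothetical sketch: the two members are never live at the same time, so
 * sharing one stack slot through a union keeps the per-call stack frame,
 * and therefore the 233-hextet stack region, from growing. */
struct Cell;                      /* S-expression cell type (definition omitted) */

void eval_step(struct Cell *expr) {
    union {
        struct Cell *head;        /* used only while destructuring expr   */
        int depth;                /* used only in a later, separate phase */
    } u;
    u.head = expr;                /* phase 1 */
    /* ... */
    u.depth = 0;                  /* phase 2 reuses the same stack slot   */
    (void)u.depth;
}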

The QFTASM layer

The QFT assembly generated by the C compiler has a lot of room for optimization. I therefore created a compiler optimization tool to reduce the QFTASM assembly size.

Constant folding

Immediate constant expressions such as ADD 1 2 destination are folded into a single MOV operation (in this case, moving the precomputed constant 3 to the destination).

MOV folding

The QFT assembly code can be split into subregions by jump operations, such that:

  • Each subregion doesn’t contain any jump operations
  • Each subregion ends with a jump operation
  • Every jump operation in the assembly is guaranteed to jump to the beginning of a subregion, and never to the middle of any subregion

The last guarantee, that jumps never target the middle of a subregion, is provided by the C compiler. The ELVM assembly's program counter is designed so that it increases only when a jump instruction appears. This makes an ELVM program counter point to a sequence of multiple instructions instead of a single instruction. Since the ELVM assembly uses the ELVM program counter for its jump instructions, it is guaranteed that the jump instructions in the QFT assembly never jump to the middle of any subregion, and always jump to the beginning of a subregion.

In each subregion, a dependency graph for the memory addresses is created. If a memory address is written but later overwritten without being used in that subregion at all, the instruction writing to that memory address is removed. Since it is guaranteed that jump operations never jump to the middle of any subregion, it is guaranteed that the overwritten values can be safely removed without affecting the outcome of the program. The MOV folding optimization makes use of this fact to remove unnecessary instructions.

This folding process is also applied to dereferences: if a dereferenced memory address is written, the address is later overwritten without being used at all, and the dereference source is not overwritten during this interval, then the instruction writing to the dereferenced memory address is removed.
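
A self-contained C sketch of the core of this pass for a single subregion is given below, ignoring dereferences; the instruction encoding and names are hypothetical:

#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch of MOV folding for one subregion: walking the block
 * backwards, a write to an address that is overwritten later in the same
 * block without any read in between can be removed.  All addresses are
 * conservatively treated as live at the subregion boundary. */
typedef struct {
    int dst;            /* address written, or -1 if none */
    int src[2];         /* addresses read,  or -1 if none */
    bool removed;
} Instr;

enum { MEM_WORDS = 1024 };

void fold_dead_stores(Instr *block, int n) {
    bool live[MEM_WORDS];
    memset(live, 1, sizeof live);             /* everything live at the block end */
    for (int i = n - 1; i >= 0; i--) {
        Instr *ins = &block[i];
        if (ins->dst >= 0 && !live[ins->dst]) {
            ins->removed = true;              /* overwritten later, never read */
            continue;                         /* its operand reads vanish too  */
        }
        if (ins->dst >= 0)
            live[ins->dst] = false;           /* this write shadows earlier ones */
        for (int k = 0; k < 2; k++)
            if (ins->src[k] >= 0)
                live[ins->src[k]] = true;     /* operands are read here */
    }
}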

Jump folding

If the destination of a conditional or fixed-destination jump instruction points to another jump instruction with a fixed destination, the jump destination is folded to the latter jump instruction’s destination.

A similar folding is done when a fixed jump instruction points to a conditional jump instruction, in which case the fixed jump instruction is replaced by that conditional jump instruction.
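A C sketch of the first rule (jump threading) under an assumed instruction representation; a hop limit guards against degenerate cycles of jumps:

/* Hypothetical representation: each jump stores a pointer to its target. */
typedef struct Insn Insn;
struct Insn {
    int   is_jump;          /* any jump                       */
    int   is_unconditional; /* jump with a fixed destination  */
    Insn *target;           /* destination, valid if is_jump  */
};

void fold_jumps(Insn *insns, int n) {
    for (int i = 0; i < n; i++) {
        if (!insns[i].is_jump) continue;
        int hops = 0;
        /* follow chains of fixed-destination jumps */
        while (hops++ < 8 && insns[i].target && insns[i].target->is_jump &&
               insns[i].target->is_unconditional) {
            insns[i].target = insns[i].target->target;
        }
    }
}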

The Varlife layer (the computer architecture)

Recreated the ROM module with a lookup table structure

In this image of the CPU and its surrounding modules, the two modules on the top are the ROM modules. The original ROM module had one table, with the memory address as the key and the instruction as the value. I recreated the ROM module to add a lookup table layer, where each distinct instruction (not just the opcode, but the entire instruction including the values used within it) is assigned a distinct serial integer key. The ROM module on the right accepts a program counter address and returns the instruction key for that program counter. The module on the left accepts the instruction key and returns the actual bits of the instruction as the output. This allows dictionary compression to be applied to the ROM data, saving a lot of space. Since the instructions are 45 bits and the instruction keys are only 10 bits, the instruction key table is roughly 1/4 the size of the original ROM module. And although the ROM is 3223 instructions long for the entire Lisp interpreter, there were only 616 distinct instructions in it, so the instruction value table is only 616 ROM units high, reducing the overall ROM module size.
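A C sketch of the interning step implied by this dictionary compression, with hypothetical types and sizes:

#include <stdint.h>

enum { MAX_DISTINCT = 1024 };   /* 616 distinct instructions fit easily */

typedef struct {
    uint64_t values[MAX_DISTINCT]; /* distinct instruction bit patterns */
    int      count;
} ValueTable;

/* Return the serial key for `insn`, adding it to the table if it is new. */
int intern_instruction(ValueTable *t, uint64_t insn) {
    for (int k = 0; k < t->count; k++)
        if (t->values[k] == insn) return k;
    if (t->count >= MAX_DISTINCT) return -1; /* table full */
    t->values[t->count] = insn;
    return t->count++;
}

/* The program-counter table then stores only the small key returned     */
/* here, while the value table stores each 45-bit instruction word once. */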

The ROM module features another form of compression, where absence of cells are used to represent 0-valued bits within the instruction. Below is a close-up look of the ROM value module:

The ROM value module

Notice that some cells on the left are absent, despite the table being expected to be rectangular. This is because absent cells do not emit any signals, hence effectively emitting 0-valued bits as the output. To exploit this fact, all of the instructions are first sorted alphabetically at table creation time, so that instructions whose bit patterns begin with zeroes end up higher in the table (further from the signal source). This allows a maximum number of cells to be replaced with absent units representing 0-valued bits. In fact, the no-op instruction is represented as all zeroes, so all of its units in the value module are replaced by absent cells. The no-op instruction appears many times, immediately after jump operations, because the QFT architecture has a branch delay when invoking a jump instruction, requiring a no-op instruction to compensate for the delay.

Added new optimized instructions to the ALU, and removed unused ones

I removed the AND , OR , SL (shift left), SRL (shift right logical), and SRA (shift right arithmetical) opcodes, and added the SRU (shift right unit) and SRE (shift right eight) opcodes to the architecture. Since there already were opcodes for XOR (bitwise-xor) and ANT (bitwise-and-not), AND and OR , which were not used much in the interpreter, could be replaced by those opcodes. The bitshift operations had significantly larger patterns than the other opcodes, more than 10 times larger. They were replaced with fixed-amount shift operations that could be implemented at the same size as the other opcodes. Since a shift left can be replaced by repeatedly adding a value to itself, effectively multiplying by powers of 2, that opcode could be removed safely. The main reason the original bitshift units were large was that the shift amounts depended on values in the RAM; converting a binary value to a physical (in-pattern) shift amount required a large pattern, whereas shifting by a fixed amount can be implemented by a significantly simpler pattern. The shift right eight instruction is mainly used for reading the standard input, where the ASCII characters of the input string are packed two per 16-bit RAM address.
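As a small illustration (in C) of why SL could be dropped: a left shift by k is the same as k self-additions, each of which multiplies the value by 2.

unsigned shift_left(unsigned x, unsigned k) {
    for (unsigned i = 0; i < k; i++)
        x = x + x;   /* each self-addition doubles the value */
    return x;
}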

This resulted in a total of exactly 8 opcodes: ANT , XOR , SRE , SRU , SUB , ADD , MLZ , and MNZ . Since these fit in 3 bits, the opcode field of the instruction word was reduced by 1 bit. Furthermore, since the RAM address space is 10 bits, the third value of an instruction is always a RAM write destination, and the first value can be arranged to always be a RAM read source address, those two operand fields need only hold 10-bit addresses instead of full 16-bit values, saving an additional 6*2=12 bits of instruction length. Altogether this reduced the ROM word size from 58 to 45 bits, cutting nearly 1/4 of the original instruction size.

Extended the ROM and RAM address space from 9,7-bit to 12,10-bit

The original QFT architecture had a ROM and RAM address space of 9 and 7 bits. I extended the ROM and RAM address space to 12 and 10 bits, respectively. This was not as straightforward a task as it first seemed, since the signal arrival timings between the modules had to be carefully adjusted for the signals to line up correctly. This involved reverse-engineering and experimenting with undocumented VarLife pattern units used in the original QFT architecture. The same held when redesigning other parts of the architecture.

Reducing the Standard Input Size

Since the physical placement of each RAM address can be chosen arbitrarily in the CPU’s architecture, the RAM is arranged so that the standard output is written at the very bottom of the RAM module and proceeds upwards. The standard output can therefore be observed easily in a Game of Life viewer by directly examining the bottom of the RAM module.

Since the RAM has 16 bits of memory per address, it can fit two ASCII-encoded characters per address. The standard input is therefore read out two characters per address. For the standard output, one character is written per address for aesthetic reasons, so that the characters can be observed directly in a Game of Life viewer. Also, for the standard output to proceed upwards within the RAM module pattern, the memory pointer for the standard output moves backwards through the memory space, while the pointer for the standard input moves forwards.
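A small C sketch of the packing scheme; which character ends up in the high byte is an assumption here, made only for illustration:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint16_t word = (uint16_t)(('H' << 8) | 'i'); /* two chars per 16-bit word */
    char high = (char)(word >> 8);   /* "shift right eight", like SRE */
    char low  = (char)(word & 0xff); /* low byte                      */
    printf("%c%c\n", high, low);     /* prints "Hi"                   */
    return 0;
}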

The Game of Life layer

Optimizing the Game of Life layer mainly revolved around understanding the Macrocell format for representing and saving Game of Life patterns, and the Hashlife algorithm. The Macrocell format uses quadtrees and memoization to compress repeated patterns. Since the final Game of Life pattern is an array of OTCA metapixels, each 2048x2048 cells large, and has repeated patterns even in the VarLife layer (meaning that there are repeated configurations of OTCA metapixels), this compression reduces the file size of the QFT pattern significantly. The example that best helped me understand the Macrocell format was one provided by Adam P. Goucher in this thread on Golly’s mailing list.

The Hashlife algorithm also uses quadtrees and memoization to speed up Game of Life simulations. It makes use of the fact that the same pattern, over the same span of time, influences only a bounded extent of its surrounding region, which is what makes memoization possible.
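A minimal C sketch of the kind of node such a quadtree uses (field names are illustrative, not Golly's actual implementation):

/* One quadtree node: a node at level k covers a 2^k x 2^k area and is    */
/* defined by its four child quadrants. Identical sub-patterns are shared */
/* (memoized) instead of being stored repeatedly, and Hashlife also       */
/* memoizes the computed future of each node's centre region.             */
typedef struct Node Node;
struct Node {
    int   level;              /* 2^level x 2^level area                 */
    Node *nw, *ne, *sw, *se;  /* quadrants (NULL at the leaf level)     */
    int   leaf_state;         /* alive/dead, used only when level == 0  */
    Node *result;             /* memoized future of the centre region   */
};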

As for optimization, I first noticed that the QFT pattern had a 1-pixel high pattern concatenated to the entire pattern. The original QFT pattern in the original QFT repository was carefully designed so that it is composed of 8x8-sized pattern units. Therefore, most of the patterns can be represented by 8x8 tiles. However, since the 1-pixel high pattern at the top creates an offset that shifts away the pattern from this 8x8 grid, it causes the pattern to have fewer repeated patterns if interpreted from the corner of its bounding box, causing the memoization to work inefficiently. I therefore tried putting a redundant cell (which does not interfere with the rest of the pattern) to realign the entire pattern to its 8x8 grid, which actually slightly reduced the resulting Macrocell file size from the original one. Although I didn’t compare the running times, since the Hashlife algorithm uses memoization over repeated patterns as well, I expect this optimization to at least slightly contribute to the performance of the simulation.

Another optimization was improving the metafier script used to convert VarLife patterns to Game of Life ( MetafierV3.py ). The original script used a square region to fit the entire pattern when creating the quadtree representation. However, since the Lisp in Life VarLife pattern is 968 pixels wide but 42354 pixels high, it tried to allocate a 65536x65536-sized integer array, which was prohibitively large to run. I modified the script so that it uses a rectangular region, where absent regions of the quadtree are represented as absent cells. Although this is very straightforward with knowledge of the Macrocell format, it was difficult at first, until I became familiar with the algorithms surrounding the Game of Life.

Memory Region Map and the Phases of Operation

The memory region map of Lisp in Life.

The memory region map is carefully designed to save space. This is best described with the operation phases of the interpreter.

Phase 0: Precalculations

Various precalculations are done after the interpreter starts running. The construction of the string interning hashtable for reserved atoms such as define , quote , etc. is done in this phase. For the GCC-compiled interpreter, some variables that are defined in the QFT memory header are instead defined in the C source.

Since the outcome of these precalculations is always the same for any incoming Lisp program, this phase is done on the host PC, and the results are saved as ramdump.csv at QFTASM compile time. The results are then pre-loaded into the RAM when the VarLife and Game of Life patterns are created. This saves some CPU cycles when running the interpreter.

As explained earlier, the QFT architecture holds register values in the RAM. There are 11 registers, which are placed in the addresses from 0 to 10.

The reserved values in the image include strings such as reserved atoms and the destinations of the jump hashtable used for evaluation. The rest of the region is used for storing global variables in the interpreter’s C source code.

Phase 1: Parsing

The Lisp program provided on the standard input is parsed into S-expressions, which are written into the heap region.

Notice that the string interning hashtables are created at the far end of the stack region. This is because these hashtables are only used during the parsing phase and can be overwritten during the evaluation phase. For most Lisp programs, including the ones in this repository, the stack region does not grow far enough to overwrite these values. This makes it possible to have three growing memory regions during the parsing phase: the stack region used for nested S-expressions, the heap region which stores the parsed S-expressions, and the string interning hashtables, which grow as new strings are detected within the Lisp program. Newly detected strings such as variable names in the Lisp program are also written into the heap region.

The heap region is also designed so that it overwrites the standard input as the program is parsed. Since older parts of the program can be discarded once they are parsed, this naturally frees the standard input region, saving a lot of space after parsing. The standard input also gets overwritten by the standard output if the output is long enough. However, due to this design, long programs may run into trouble during parsing, since the input may be overwritten too far ahead and deleted before it is parsed. A workaround is to use indentation, which places the program further ahead in memory and prevents it from being overwritten by the growing heap region. For all of the programs included in this repository, this is not an issue and the programs are parsed successfully.

Phase 2: Evaluation

By this time, all of the contents of the stack region, and everything beyond the head of the heap region, can be overwritten in later steps. Note that an issue similar to the one with the standard input happens with the standard output: when too many Lisp objects are created at runtime, the heap may overwrite the existing standard output, or simply exceed the heap region and proceed into the stack region. Since the heap region is connected to the far end of the stack region, this may be safe if the standard output is handled carefully, but the interpreter will eventually start overwriting values in the stack region if the heap continues to grow.

Miscellaneous

How can a 2-state OTCA Metapixel emulate the behavior of an 8-state VarLife pattern?

This is one of the most interesting ideas in the original QFT project, and it is what makes the QFT architecture possible. As explained in the original QFT post , the 8 states of VarLife are actually a mixture of 4 different birth/survival rules with binary states. This means that each VarLife cell can only transition between two fixed states, and the birth/survival rule for that cell does not change at any point in time. Moreover, the OTCA Metapixel is designed so that each metapixel can carry its own birth/survival rules. Therefore, each VarLife cell can be encoded into an OTCA Metapixel by specifying its birth/survival rule and its binary state. This means that the array of OTCA Metapixels in the metafied pattern is actually a mixture of metapixels with different birth/survival rules, arranged in a way that makes the computation possible.

Halting Time

After the program counter is set to 65535 and the program exits, no more ROM and RAM I/O signals appear anywhere in the module. This makes the VarLife pattern become completely stationary, with every generation thereafter identical. Defining this as the halting time for the calculation, the pattern for print.lisp halts at exactly 105,413,068 VarLife generations.

The halting time for the Game of Life patterns is defined similarly, in terms of the meta-states of the OTCA Metapixels. Since OTCA Metapixels never become stationary, the Game of Life cells do not become stationary after the halting time, but the meta-states of the OTCA Metapixels do.

For the VarLife pattern of print.lisp , the value 65535 gets written to the program counter by generation 105,387,540. At generation 105,413,067, the last signal is just one step from disappearing, and from generation 105,413,068 onwards the pattern is completely stationary, with every generation identical. In the Game of Life version, since the OTCA Metapixels continue running indefinitely, the pattern never becomes completely stationary, but the meta-states of the OTCA Metapixels do, since the pattern is an emulation of the VarLife pattern. Note that the halting times for programs other than print.lisp are just sufficient numbers of generations, not exact values.

The required number of generations per CPU cycle depends on many factors, such as the ROM and RAM addresses and the types of opcodes, since the arrival times of the I/O signals depend on these factors as well. This makes the number of generations required to halt differ between programs. For example, print.lisp has a rate of 23822.16 generations per CPU cycle (GpC), but z-combinator.lisp has a rate of 28870.81 GpC, and primes-print.lisp has 31502.43 GpC. 23822.16 GpC is in fact insufficient for z-combinator.lisp to finish running, and 28870.81 is also insufficient for primes-print.lisp .

Miscellaneous Screenshots

A close-up view of a part of the ROM module in the Conway's Game of Life version.

The ALU unit in the CPU. From the left are the modules for the ANT , XOR , SRE , SRU , SUB , ADD , MLZ , and the MNZ opcodes.

The SRE and the SRU opcodes were newly added for this project.

Credits

The CPU architecture used in this project was originally created by the members of the Quest For Tetris (QFT) project, and was later optimized and modified by Hikaru Ikuta for the Lisp in Life project. The VarLife cellular automaton rule was also defined by the members of the QFT project. The metafier for converting VarLife patterns to Conway’s Game of Life patterns was written by the members of the QFT project, and was later modified by Hikaru Ikuta to support the pattern size of the Lisp in Life architecture. The assembly language for the QFT architecture, QFTASM, was also originally designed by the members of the QFT project, and was later modified by Hikaru Ikuta for this project for achieving a feasible running time. The Lisp interpreter was written by Hikaru Ikuta. The compilation of the interpreter’s C source code to the ELVM assembly is done using an extended version of 8cc written by Rui Ueyama from Google. The compilation from the ELVM assembly to QFTASM is done by an extended version of ELVM (the Esoteric Language Virtual Machine), a project by Shinichiro Hamaji from Preferred Networks, Inc. The Game of Life backend for ELVM was written by Hikaru Ikuta, and was later further extended by Hikaru for the Lisp in Life project.

The Coming Need for Formal Specification

Lobsters
benjamincongdon.me
2025-12-13 03:00:36
Comments...
Original Article

In late 2022, I had a conversation with a senior engineer on the coming problem of “what to do when AI is writing most of the code”. His opinion, which I found striking at the time, was that engineers would transition from writing mostly “implementation” code, to mostly writing tests and specifications.

I remember thinking at the time that this was prescient. With three years of hindsight, it seems like things are trending in a different direction. I thought that the reason that testing and specifications would be useful was that AI agents would be struggling to “grok” coding for quite some time, and that you’d need to have robust specifications such that they could stumble toward correctness.

In reality, AI-written tests were one of the first tasks I felt comfortable delegating. Unit tests are squarely in-distribution for what the models have seen in public open source code. There are a lot of unit tests in open source code, and they follow predictable patterns. I’d expect that the variance of implementation code – and the requirement for out-of-distribution patterns – is much higher than for testing code. The result is that models are now quite good at translating English descriptions into quite crisp test cases. 1

System Design

There exists a higher level problem of holistic system behavior verification, though. Let’s take a quick diversion into systems design to see why.

System design happens on multiple scales. You want systems to be robust – both in their runtime, and their ability to iteratively evolve. This nudges towards decomposing systems into distinct components, each of which can be internally complicated but exposes a firm interface boundary that allows you to abstract over this internal complexity.

If we design things well, we can swap out parts of our system without disrupting other parts or harming the top-level description of what the system does. We can also perform top-down changes iteratively – adding new components, and retiring old ones, at each level of description of the system.

This all requires careful thinking of how to build these interfaces and component boundaries in such a way that (1) there is a clean boundary between components and (2) that stringing all the components together actually produces the desired top-level behavior.

To do this effectively, we require maps of various levels of description of the system’s territory . My conjecture is that code is not a good map for this territory .

To be clear, I’ve found a lot of value in throwing out system diagram maps and looking directly at the code territory when debugging issues. However, code-level reasoning is often not the best level of abstraction to use for reasoning about systems. This is for a similar reason that “modeling all the individual molecules of a car” is not a great way to estimate that car’s braking distance.

LLMs have increasingly longer context windows, so one could naively say “just throw all the code in the context and have it work it out”. Perhaps. But this is still just clearly not the most efficient way to reason about large-scale systems.

Formal Verification

The promise of formal verification is that we can construct provably composable maps which still match the ground-level territory. Formal verification of code allows you to specify a system using mathematical proofs, and then exhaustively prove that a system is correct. As an analogy: unit tests are like running an experiment. Each passing test is an assertion that, for the conditions checked, the code is correct. There could still exist some untested input that would demonstrate incorrect behavior. You only need one negative test to show the code is incorrect, but only a provably exhaustive set of inputs would be sufficient to show the code is fully correct. Writing a formal verification of a program is more like writing a proof. Writing a self-consistent proof is sufficient to show that the properties you’ve proven always hold.
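To make the analogy concrete, here is a tiny sketch in Lean 4 (chosen here only for brevity, not one of the tools discussed later): the first line is the moral equivalent of a unit test, checking a single input, while the second is a proof that the property holds for every input.

-- "Unit test": checks one particular input by computation.
example : 2 + 3 = 3 + 2 := rfl

-- Proof: holds for all natural numbers, not just the ones we tried.
theorem add_comm_all_inputs (a b : Nat) : a + b = b + a := Nat.add_comm a b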

I saw Martin Kleppmann’s “Prediction: AI will make formal verification go mainstream” right after I posted “The Decline of the Software Drafter?” , which became the inspiration for this post. Kleppmann’s argument is that, just as the cost of generating code is coming down, so too will the cost of formal verification of code:

For example, as of 2009, the formally verified seL4 microkernel consisted of 8,700 lines of C code, but proving it correct required 20 person-years and 200,000 lines of Isabelle code – or 23 lines of proof and half a person-day for every single line of implementation. Moreover, there are maybe a few hundred people in the world (wild guess) who know how to write such proofs, since it requires a lot of arcane knowledge about the proof system.

If formal verification becomes vastly cheaper, then we can afford to verify much more software. But on top of that, AI also creates a need to formally verify more software: rather than having humans review AI-generated code, I’d much rather have the AI prove to me that the code it has generated is correct. If it can do that, I’ll take AI-generated code over handcrafted code (with all its artisanal bugs) any day!

I’ve long been interested in formal verification tools like TLA+ and Rocq (née Coq). I haven’t (yet) been able to justify to myself spending all that much time on them. I think that’s changing: the cost of writing code is coming down dramatically . The cost of reviewing and maintaining it is also coming down, but at a slower rate. I agree with Kleppmann that we need systematic tooling for dealing with this mismatch.

Wishcasting a future world, I would be excited to see something like:

  • One starts with a high-level system specification, in English.
  • This specification is spun out into multiple TLA+ models at various levels of component specificity.
  • These models would allow us to determine the components that are load-bearing for system correctness.
  • The most critical set of load-bearing components are implemented with a corresponding formal verification proof, in something like Rocq .
  • The rest of the system components are still audited by an LLM to ensure they correctly match the behavior of their associated component in the TLA+ spec.

The biggest concern to me related to formal verification is the following two excerpts, first from Kleppmann, and then from Hillel Wayne, a notable proponent of TLA+:

There are maybe a few hundred people in the world (wild guess) who know how to write such proofs, since it requires a lot of arcane knowledge about the proof system. – Martin Kleppmann

TLA+ is one of the more popular formal specification languages and you can probably fit every TLA+ expert in the world in a large schoolbus. – Hillel Wayne

For formal verification to be useful in practice, at least some of the arcane knowledge of its internals will need to be broadly disseminated. Reviewing an AI-generated formal spec of a problem won’t be useful if you don’t have enough knowledge of the proof system to poke holes in what the AI came up with.

I’d argue that undergraduate Computer Science programs should allocate some of their curriculum to formal verification. After all, students should have more time on their hands as they delegate implementation of their homework to AI agents.

The Paris Climate Treaty Changed the World. Here’s How

Portside
portside.org
2025-12-13 03:00:21
The Paris Climate Treaty Changed the World. Here’s How barry Fri, 12/12/2025 - 22:00 ...
Original Article

Today marks the 10th anniversary of the Paris climate treaty, one of the landmark days in climate-action history. Attending the conference as a journalist, I watched and listened and wondered whether 194 countries could ever agree on anything at all, and the night before they did, people who I thought were more sophisticated than me assured me they couldn’t. Then they did. There are a lot of ways to tell the story of what it means and where we are now, but any version of it needs respect for the complexities, because there are a lot of latitudes between the poles of total victory and total defeat.

I had been dreading the treaty anniversary as an occasion to note that we have not done nearly enough, but in July I thought we might be able to celebrate it. Because, on 23 July, the international court of justice handed down an epochal ruling that gives that treaty enforceable consequences it never had before. It declares that all nations have a legal obligation to act in response to the climate crisis, and, as Greenpeace International put it, “obligates states to regulate businesses on the harm caused by their emissions regardless of where the harm takes place. Significantly, the court found that the right to a clean, healthy and sustainable environment is fundamental for all other human rights, and that intergenerational equity should guide the interpretation of all climate obligations.” The Paris treaty was cited repeatedly as groundwork for this decision.

Ralph Regenvanu, Vanuatu’s special envoy for climate, said of the decision: “I choose my words carefully when I say that this may well be the most consequential case in the history of humanity.” Costa Rica’s Christiana Figueres, who presided over the negotiations that created that Paris climate treaty declared , with jubilation, on her podcast: “The reason why I am truly tearful is this is without a doubt, the most far-reaching, the most comprehensive and the most consequential legal opinion we’ve ever had.”

This case that ended in the world’s highest court began with 27 law students in the University of the South Pacific who in 2019, asked themselves what they could do about climate – and it’s not hard to imagine a “what can we do, we’re only students” or “what can we do, we’re from tiny remote nations” stance. Instead, they set out to take a case all the way to the international court of justice in The Hague, unimpeded by the conventional wisdom that they were nobody from nowhere. They needed a law firm, and they chose Blue Ocean Law firm, sticking with the Pacific island nations, with indigenous leadership, with the impacted global south. And they needed a country to be plaintiff and the island nation of Vanuatu stepped up. The unanimous court decision in favor of the litigants matters most of all in how it is implemented, either through direct cases or through its impact on nations that take notice and reduce their climate devastation before they’re brought to court.

It’s not widely known that most countries and negotiators went into the conference expecting to set a “reasonable” threshold of two degrees of global temperature rise that we should not cross. As my friend Renato Redentor Constantino, a climate organizer in the Philippines, wrote: “The powerful exerted tremendous effort to keep a tiny number, 1.5, out of United Nations documents. 1.5 degrees centigrade represents what science advises as the maximum allowable rise in average global temperature relative to preindustrial temperature levels. It was the representatives of the mostly global-south nations of the Climate Vulnerable Forum who fought to change the threshold from 2 degrees to 1.5.”

I remember them chanting “1.5 to stay alive”, because two degrees was a death sentence for too many places and people. The officially powerless swayed the officially powerful, and 1.5 degrees was written into the treaty and has become a familiar number in climate conversations ever since. Even though we’ve crashed into that 1.5 threshold, far better that it be set there than at 2 degrees, in which case we might well be complacent in the face of even more destructive temperature rise.

It takes far more than storytelling to get where we need to go, but how we tell the stories is crucial. I asked the climate policy expert Leah Stokes of UC Santa Barbara about the impact of Paris and she told me: “When small island nations pushed for 1.5 degrees as the target, they also requested the IPCC [intergovernmental panel on climate change] write a special report on what policy would be required to get there. That report came out in October 2018, and rocked around the world with headlines like ‘we have 12 years’. It changed the entire policy conversation to be focused on cutting pollution in half by 2030. Then, when it came time to design a climate package, Biden made it clear that his plan was to try to meet that target. You can draw a line between small islands’ fierce advocacy through to the passage of the largest climate law in American history.”

That’s how change often works, how an achievement ripples outward, how the indirect consequences matter as well as the direct ones. The Biden administration tried to meet the 1.5 degree target with the most ambitious US climate legislation ever, the Build Back Better Act that passed Congress after much pressure and conflict as the Inflation Reduction Act. Rumors of the Inflation Reduction Act’s death are exaggerated; some pieces of its funding and implementation are still in effect, and it prompted other nations to pursue more ambitious legislation. In the US, state and local climate efforts have not been stopped by the Trump administration. Globally, not nearly enough has been done to stop deforestation, slash fossil-fuel subsidies, and redesign how we live, move, and consume.

The renewables revolution is a bright spot. It’s often overlooked because it’s incremental, technical, economic, and dispersed, and even its major milestones don’t receive nearly the recognition they should. When the Paris treaty was signed, renewables were overall more expensive than fossil fuel, and were not widely implemented. But the drop in cost and spread of solar has outstripped virtually all predictions. The energy-policy group Ember reports : “Record solar power growth and stagnating fossil fuels in 2025 show how clean power has become the driving force in the power sector. Historically a growth segment, fossil power now appears to be entering a period of stagnation and managed decline.” The International Energy Agency notes another 2025 landmark: “The electricity sector is now the largest energy employer, surpassing fuel supply for the first time, as the age of electricity gathers pace.”

Anyone who in 2015 accurately prophesied what the energy landscape would look like in 2025 would have been thought to be ridiculous, delusional, or crazy (just like anyone who said in, say, 1995 that the UK would close its last coal-fired plant in 2024 would have been). 2025 is the year that renewables outstripped coal as an energy source. Ancillary developments like battery storage technology and design improvements and innovations have led to widespread renewables adoption from Denmark (which gets only 10% of its electricity from fossil fuels) to Texas to Pakistan (where small-scale solar panels from China have led something of an energy revolution). Solar power is now so cheap and abundant in Australia that electricity is going to be free for three hours in the middle of the day.

Problems that the enemies of climate action liked to cite, such as the intermittency of sun and wind, have been addressed with battery storage. California now often produces more than 100% of its electricity needs through renewables, led by solar, in the daytime. The excess goes into the batteries so that the state literally runs on sunshine at night. California uses 44% less natural gas to produce electricity than it did two years ago. China is reducing its emissions because it’s speedily transitioning to renewables; earlier this fall, in the United Nations, for the first time it made an actual commitment to reduction targets; and for the last eighteen months its CO2 emissions have been flat or falling.

Is this good enough? Far from it, but we are, as they say, “bending the curve”: before Paris the world was headed for 4 degrees of warming; it’s now headed for 2.5 degrees, which should only be acceptable as a sign that we have bent it and must bend more and faster. In the best-case scenario, the world’s leaders and powers would have taken the early warnings about climate change seriously and we’d be on the far side of a global energy transition, redesign of how we live, and protection of oceans, rainforests, and other crucial climate ecosystems. But thanks to valiant efforts by the climate movement and individual leaders and nations, we’re not in the worst-case scenario either. Landmarks like the Paris treaty and the Vanuatu victory matter, as do the energy milestones, and there’s plenty left to fight for. For decades and maybe centuries it has been too late to save everything, but it will never be too late to save anything.

Rebecca Solnit is a Guardian US columnist. She is the author of Orwell’s Roses and co-editor with Thelma Young Lutunatabua of the climate anthology Not Too Late: Changing the Climate Story from Despair to Possibility.

The Guardian is globally renowned for its coverage of politics, the environment, science, social justice, sport and culture. Scroll less and understand more about the subjects you care about with the Guardian's brilliant email newsletters , free to your inbox.

1300 Still Images from the Animated Films of Hayao Miyazaki's Studio Ghibli

Hacker News
www.ghibli.jp
2025-12-13 02:56:49
Comments...
Original Article

Starting today, we are providing 14 still images from 「君たちはどう生きるか」 (The Boy and the Heron), currently showing in theaters.

As with the previous batches, please feel free to use them within the bounds of common sense.

Mamdani’s Child Care Plan Is Audacious. Here’s How It Could Work.

Portside
portside.org
2025-12-13 02:40:38
Mamdani’s Child Care Plan Is Audacious. Here’s How It Could Work. barry Fri, 12/12/2025 - 21:40 ...
Original Article

"Children's Blocks" | by lobo235 (CC BY 2.0)

No major American city has ever built a universal child care system. That means that nearly three-quarters of American parents who are looking for a way to take care of their children are struggling to find it. At the same time, costs have exploded: Day care now runs more than twice what it did just before the pandemic.

Most politicians don’t even try to enact universal systems — the cost and complexity are daunting, and child care has long been seen as a private family problem, not a public responsibility. But Zohran Mamdani ran on such a plan — and New Yorkers made him their next mayor.

Many parts of Mr. Mamdani’s agenda have been dismissed as unrealistic, and his child care program often tops that list. He has promised free care for every child from 6 weeks to 5 years old and pledged to offer child care workers wages “at parity,” in the campaign’s words, with public school teachers . Critics say it will cost too much and prove impossible to build at scale. A poll from this fall captured the skepticism: 71 percent of likely New York City voters supported his pitch for universal child care, but only about 50 percent of those surveyed thought he could actually deliver it.

Having reported on child care policy around the country over the past 10 years, I think many people are looking at Mr. Mamdani’s plan all wrong. It will not be easy to implement, but if he learns from the mistakes that have derailed past efforts, he could pull off something remarkable. He has the opportunity to change the lives of hundreds of thousands of New Yorkers with young children, many of whom pay over $20,000 a year to send them to day care and preschool. More than that, he could offer a powerful example to leaders all over the country.

Universal child care need not be a pipe dream in America — something we envy the Danes and the Swedes for but never imagine having for ourselves. It can and should be as fundamental to a city’s infrastructure as transit or housing, as essential for attracting workers and residents as any investment a mayor can make. After all, for most parents of young children, child care isn’t optional — it’s what makes holding down a job possible.

Mr. Mamdani’s child care bid comes at a moment of unusual political openness to the idea. When the pandemic shut down child care options for millions of Americans, it stranded parents and employers alike. In the years since, a growing coalition of economists and business leaders has come to see child care as an integral part of economic growth — not a handout, but a way to keep workers in the labor force and families in cities .

That openness crosses party lines. Polling this year found that a majority of Republicans now say the federal government spends too little on programs that benefit children — a notable change for a party long skeptical of new social spending. Candidates for governor in Georgia and Wisconsin are running on universal child care plans, and affording child care is now a question that presidential candidates from both parties get asked about in debates.

The first thing Mr. Mamdani will need to get right: Any new system has to include the full range of child care options families rely on. That means not just day care centers and public school classrooms but also home-based businesses — small day care operations run out of private residences — and more informal arrangements with family and neighbors.

When people imagine universal child care, they often picture massive new facilities going up across a city. That’s not sufficient. “Child care infrastructure exists, and it exists in the neighborhoods that need it most,” said Jessica Sager, the co-founder of All Our Kin , a nonprofit that supports home-based child care providers. Most New York City home-based providers don’t receive city subsidies or support, she said — and bringing them into the system could unlock thousands of slots for families who need them.

New York has learned the importance of such investments the hard way. When Mayor Bill de Blasio started rolling out universal pre-K for 4-year-olds in 2014, his administration funneled nearly all the new money to child care centers . The result was that home-based providers, who relied on revenue from preschool-age children to stay afloat, lost a critical part of their enrollment. Many child care businesses collapsed, and providers quit the field . The United Federation of Teachers chapter that represents home-based providers shrank from 28,000 providers in 2007 to just 12,000 today.

Connecticut offers a better model. Its Early Start program , which launched this summer, includes infants and toddlers and guarantees home-based providers both a minimum amount of funding and even a voice in how the program is run.

It’s a hopeful sign that Mr. Mamdani’s pick for first deputy mayor was the budget director under Mr. de Blasio, who helped secure funding for universal pre-K, and that one of his transition co-chairs played a critical role in expanding that program; they should know what worked and what didn’t. Mr. Mamdani seems keen on a mixed-delivery system, having said during his campaign that he envisions subsidizing “families who prefer to have a trusted neighbor or relative take care of their child.”

Getting the design of such a program right is only half the battle. To deliver on all this, Mr. Mamdani will have to move boldly — but not too fast. In 1997, Quebec tried to implement universal child care in three years. The rush led the province to cut corners on quality , and the fallout has given skeptics ammunition ever since. Vice President JD Vance has cited Quebec’s rocky rollout as evidence that universal child care isn’t worth pursuing. Just last month, The Economist cited Quebec in a misleading piece on the “harm” of universal care. Early stumbles cast a long shadow.

There’s a danger in the other direction, too. In the few states that have made real investments in child care, leaders have been too quick to claim “mission accomplished” when plenty of families still don’t have good care options for their kids. In a country like the United States, where caregiving has long been devalued, no child care system can survive without sustained attention and investment, year after year.

You can see this problem play out in wages for child care workers. Child care is one of the country’s lowest-paid jobs , though Washington, D.C., has tried to change that locally. A few years ago, the city established what it called a pay equity fund to bring child care workers’ salaries closer to those of public school teachers, supplementing their wages via a new tax on the city’s highest earners. By many measures, D.C.’s program has been a success: Child care workers saw significant pay increases , funded by the city, that enabled day care centers to hire more staff members and care for more children . But when budget pressures hit, the supposed dedicated funding became a political football . The funding has not kept up with the program, which has created uncertainty about its future. For workers who had finally started to feel fairly compensated, the whiplash has been demoralizing and destabilizing.

New Mexico is perhaps the most instructive example of a premature victory lap. The state has earned glowing national praise for its governor’s commitment to make all families eligible for state child care subsidies. But eligibility is not care. Even in 2023, before this latest expansion, only a quarter of eligible children under 6 were receiving aid — and while enrollment had surged among middle-income families, it had fallen among families below the poverty line. Moreover, this spring, legislators quietly diverted some child care money to a behavioral health program, illustrating the competing budget pressures politicians face.

In New York, much media coverage has focused on how Mr. Mamdani will pay for his child care plan. And it’s an important question, since the plan will cost an estimated $6 billion or more, and federal Medicaid cuts are threatening to blow a hole in the state budget.

But there are reasons for cautious optimism. Mr. Mamdani can’t fund a program this size without Albany, and Gov. Kathy Hochul has signaled strong interest in expanding child care across New York. Mr. Mamdani has shown a willingness to negotiate with her in return — even if she’s not willing to raise taxes on the wealthy, as he’d prefer . And they’ve found common ground elsewhere: They both seem open to loosening certain regulations on child care providers .

For decades, the United States has told families that child care is their burden to navigate. Mr. Mamdani made a bet that New Yorkers were ready for a different answer. If he can deliver, he’ll give the rest of the country a much-needed blueprint to follow, too.

Rachel Cohen Booth is a senior policy correspondent for Vox. She is working on a book, forthcoming from Harmony, about individual agency and social change.

Get the best of the New York Times in your Inbox with a free newsletter. Gain unlimited access to all of The Times with a digital subscription .

Operation Condor: A Network of Transnational Repression 50 Years Later

Portside
portside.org
2025-12-13 02:30:58
Operation Condor: A Network of Transnational Repression 50 Years Later barry Fri, 12/12/2025 - 21:30 ...
Original Article

Washington, D.C., November 26, 2025 - On General Augusto Pinochet’s 60th birthday, November 25, 1975, four delegations of Southern Cone secret police chieftains gathered in Santiago, Chile, at the invitation of the Chilean intelligence service, DINA. Their meeting—held at the War College building on la Alameda, Santiago’s downtown thoroughfare—was called “to establish something similar to INTERPOL,” according to the confidential meeting agenda, “but dedicated to Subversion.” During the three-day meeting, the military officials from Argentina, Bolivia, Chile, Paraguay and Uruguay agreed to form “a system of collaboration” to identify, track, capture and eliminate leftist opponents of their regimes. As the conference concluded on November 28, a member of the Uruguayan delegation rose to toast the Chileans for convening the meeting and proposed naming the new organization after the host country’s national bird, the condor. According to secret minutes of the meeting, there was “unanimous approval.”

Chilean records refer to Condor as “Sistema Condor.” CIA intelligence reports called it Operation Condor. It was, as John Dinges writes in his comprehensive history, The Condor Years , an agency of cross-border repression whose teams went far beyond the frontiers of the member countries to launch assassination missions and other criminal operations in the United States, Mexico and Europe. His investigation documented 654 victims of kidnapping, torture and disappearance during Condor’s active operational period in the Southern Cone between 1976 and 1980. A subdivision of Condor codenamed “Teseo”—for Theseus, the heroic warrior king of Greek mythology—established an international death squad unit based in Buenos Aires that launched 21 operations in Europe and elsewhere against opponents of the military regimes.

On the 50th anniversary of the secret inauguration of Operation Condor, the National Security Archive is posting a selection of documents that record the dark history of transnational repression under the Condor system. The selected records include:

  • The only known DINA document on the inaugural meeting—the “Closing Statement of the First Inter-American Meeting of National Intelligence”—which summarized the agreement between the original five Condor nations.
  • The first declassified CIA document to name “CONDOR” as a “cooperative arrangement” against subversion. The heavily censored CIA document, dated June 25, 1976, provides initial intelligence on the 2nd Condor meeting held from May 31 to June 2 in Santiago. It was the first in a flurry of CIA intelligence cables in the summer of 1976 on Condor’s evolution from an intelligence sharing collaboration to a transnational system of disappearance and assassination. “The subjects covered at the meeting,” this CIA report noted, “were more sweeping than just the exchange of information on terrorism and subversion.”
  • A CIA translation of the “Teseo” agreement—an extraordinary document that bureaucratically records the procedures, budgets, working hours, and operational rules for selecting, organizing and dispatching death squads to eliminate targeted enemies of the Southern Cone regimes. The “Teseo” operations base would be located “at Condor 1 (Argentina).” Each member country was expected to donate $10,000 to offset operational costs, and dues of $200 would be paid “prior to the 30th of each month” for maintenance expenses of the operations center. Expenses for agents on assassination missions abroad were estimated at $3,500 per person for ten days “with an additional $1000 first time out for clothing allowance.”
  • A CIA report on how the Teseo unit will select targets “to liquidate” in Europe and who will know about these missions. The source of the CIA intelligence suggests that “in Chile, for instance, Juan Manuel Contreras Sepulveda, chief of the Directorate of National Intelligence (DINA) the man who originated the entire Condor concept and has been the catalyst in bringing it into being, will coordinate details and target lists with Chilean President Augusto Pinochet Ugarte.”
  • The first briefing paper for Secretary of State Henry Kissinger alerting him to the existence of Operation Condor and the political ramifications for the United States. In a lengthy August 3, 1976, report from his deputy Harry Shlaudeman, Kissinger is informed that the security forces of the Southern Cone “have established Operation Condor to find and kill terrorists…in their own countries and in Europe. Brazil is cooperating short of murder operations."
  • CIA memoranda, written by the chief of the Western Hemisphere division, Ray Warren, sounding the alarm on Condor’s planned missions in Europe, and expressing concern that the CIA will be blamed for Condor’s assassinations abroad. One memo indicates that the CIA has taken steps to preempt the missions by alerting French counterparts that Condor operatives planned to murder specific individuals living in Paris.
  • The completely unredacted FBI “Chilbom” report, written by FBI attaché Robert Scherrer one week after the car bomb assassination of former Chilean Ambassador Orlando Letelier and Ronni Moffitt in downtown Washington, D.C. It was this FBI report that resulted in the revelation of the existence of the Condor system in 1979, when its author, FBI attaché Robert Scherrer, testified at a trial of several Cuban exiles who assisted the Chilean secret police in assassinating Letelier and Moffitt.
  • The first Senate investigative report on Condor based on CIA documents and briefings written in early 1979 by Michael Glennon, a staff member of the Senate Foreign Relations Subcommittee on International Operations. The draft report was never officially published but was leaked to columnist Jack Anderson; a copy was eventually obtained by John Dinges and Saul Landau and used in their book, Assassination on Embassy Row . A declassified copy was released as part of the Obama-authorized Argentina Declassification Project in 2019.

“These documents record the dark history of multilateral repression and state-sponsored terrorism in the Southern Cone—a history that defined those violent regimes of the past,” notes Peter Kornbluh, author of The Pinochet File: A Declassified Dossier on Atrocity and Accountability . “Fifty years after Condor’s inauguration, these documents provide factual evidence of coordinated human rights atrocities that can never be denied, whitewashed or justified.”

After many years of investigations and resulting trials, it is now clear that Condor may have backfired on its perpetrators, according to John Dinges, whose updated and expanded edition of The Condor Years was published in Spanish in 2021 as Los Años del Condor: Operaciones Internacionales de asesinato en el Cono Sur . “It is a kind of historic irony,” Dinges notes, “that the international crimes of the dictatorships spawned investigations, including one resulting in Pinochet’s arrest in London, that would eventually bring hundreds of the military perpetrators to justice. Moreover, because Condor’s most notorious crime was in Washington, D.C., the United States government unleashed the FBI to prosecute DINA and the Chilean regime.”

Other documents on Condor discovered in the archives of member states such as Uruguay can be found on this special website— https://plancondor.org/ —established to record the history of Condor’s human rights atrocities and hold those who committed them accountable for their crimes.

Special thanks to Carlos Osorio whose years of work documenting Operation Condor made this posting possible.

Read the documents.

Founded in 1985 by journalists and scholars to check rising government secrecy, the National Security Archive combines a unique range of functions: investigative journalism center, research institute on international affairs, library and archive of declassified U.S. documents ("the world's largest nongovernmental collection" according to the Los Angeles Times), leading non-profit user of the U.S. Freedom of Information Act, public interest law firm defending and expanding public access to government information, global advocate of open government, and indexer and publisher of former secrets.

VACUUM Is a Lie: About Your Indexes

Lobsters
boringsql.com
2025-12-13 02:06:05
Comments...
Original Article

There is a common misconception that troubles most developers using PostgreSQL: tune VACUUM or run VACUUM, and your database will stay healthy. Dead tuples will get cleaned up. Transaction IDs recycled. Space reclaimed. Your database will live happily ever after.

But there are a couple of dirty "secrets" people are not aware of. The first of them: VACUUM is lying to you about your indexes.

The anatomy of storage 🔗

When you delete a row in PostgreSQL, it is just marked as a 'dead tuple': invisible to new transactions but still physically present. Only once no running transaction can still see the row can VACUUM come along and actually remove it, reclaiming the space in the heap (table).

To understand why this matters differently for tables versus indexes, you need to picture how PostgreSQL actually stores your data.

Your table data lives in the heap - a collection of 8 KB pages where rows are stored wherever they fit. There's no inherent order. When you INSERT a row, PostgreSQL finds a page with enough free space and slots the row in. Delete a row, and there's a gap. Insert another, and it might fill that gap - or not; it might fit somewhere else entirely.

This is why SELECT * FROM users without an ORDER BY can return rows in order initially, then in seemingly random order after some updates - and that order can change over time. The heap is like Tetris: rows drop into whatever space is available, leaving gaps when deleted.

Heap Page

When VACUUM runs, it removes those dead tuples and compacts the remaining rows within each page. If an entire page becomes empty, PostgreSQL can reclaim it entirely.

And while indexes are on surface the same collection of 8KB pages, they are different. A B-tree index must maintain sorted order - that's the whole point of their existence and the reason why WHERE id = 12345 is so fast. PostgreSQL can binary-search down the tree instead of scanning every possible row. You can learn more about the fundamentals of B-Tree Indexes and what makes them fast .

But the same design that makes indexes fast is also their biggest constraint. While PostgreSQL can fit heap rows into whatever space is available, it can't move index entries between pages to pack them as tightly as possible.

Leaf Page

VACUUM can remove dead index entries, but it doesn't restructure the B-tree. When VACUUM processes the heap, it can compact rows within a page and reclaim empty pages, because the heap has no ordering constraint - rows can be anywhere. B-tree pages, on the other hand, are locked into a sorted structure.

Many developers assume VACUUM treats all pages the same, whether they are heap or index pages. VACUUM is supposed to remove the dead entries, right?

Yes. But here's what it doesn't do - it doesn't restructure the B-tree .

What VACUUM actually does:

  • Removes dead tuple pointers from index pages
  • Marks completely empty pages as reusable
  • Updates the free space map

What VACUUM cannot do:

  • Merge sparse pages together (it can only reclaim completely empty pages)
  • Reduce tree depth
  • Deallocate empty-but-still-linked pages
  • Change the physical structure of the B-tree

Your heap is Tetris: gaps can get filled. Your B-tree is a sorted bookshelf: VACUUM can pull books out, but it can't slide the remaining ones together. You're left walking past empty slots every time you scan.

The experiment 🔗

Let's get hands-on and create a table, fill it, delete most of it and watch what happens.

CREATE EXTENSION IF NOT EXISTS pgstattuple;
CREATE TABLE demo (id integer PRIMARY KEY, data text);

-- insert 100,000 rows
INSERT INTO demo (id, data)
SELECT g, 'Row number ' || g || ' with some extra data'
FROM generate_series(1, 100000) g;

ANALYZE demo;

At this point, our index is healthy. Let's capture the baseline:

SELECT
    relname,
    pg_size_pretty(pg_relation_size(oid)) as file_size,
    pg_size_pretty((pgstattuple(oid)).tuple_len) as actual_data
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
relname  | file_size | actual_data
-----------+-----------+-------------
demo      | 7472 kB   | 6434 kB
demo_pkey | 2208 kB   | 1563 kB

Now remove some data, 80% to be precise - somewhere in the middle:

DELETE FROM demo WHERE id BETWEEN 10001 AND 90000;

The goal is to simulate a common real-world pattern: data retention policies, bulk cleanup operations, or the aftermath of a data migration gone wrong.

VACUUM demo;

SELECT
    relname,
    pg_size_pretty(pg_relation_size(oid)) as file_size,
    pg_size_pretty((pgstattuple(oid)).tuple_len) as actual_data
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
relname  | file_size | actual_data
-----------+-----------+-------------
demo      | 7472 kB   | 1278 kB
demo_pkey | 2208 kB   | 1563 kB

The table's live data shrank significantly, while the index remained unchanged. You now have 20,000 rows indexed by a structure built to handle 100,000. Also notice that file_size remained unchanged for both relations: VACUUM doesn't return space to the OS, it only marks pages as reusable within PostgreSQL.

This experiment is an extreme case, but it demonstrates the problem.

Understanding page states 🔗

Leaf pages have several states:

Full page (>80% density) : the page contains many index entries, using space efficiently. Each 8 KB page read returns substantial useful data. This is the optimal state.

Partial page (40-80% density) : some wasted space, but still reasonably efficient. Common at tree edges or after light churn. Nothing to worry about.

Sparse page (<40% density) : mostly empty. You're reading an 8 KB page to find a handful of entries. The I/O cost is the same as for a full page, but you get far less value.

Empty page (0% density) : zero live entries, but the page still exists in the tree structure. Pure overhead. You might read it during a range scan and find absolutely nothing useful.

A note on fillfactor 🔗

You might be wondering: can fillfactor help with this? It's a setting you can apply to both heap and index pages, and it controls how full PostgreSQL packs pages when writing data. The default value for B-tree indexes is 90% , which leaves 10% of free space on each leaf page for future insertions.

CREATE INDEX demo_index ON demo(id) WITH (fillfactor = 70);

A lower fillfactor (like 70%) leaves more room, which can reduce page splits when you're inserting into the middle of an index - useful for tables with random inserts into the indexed column (UUIDs, hashes) or with heavily updated index columns.

But if you followed the anatomy of storage section carefully, you know it doesn't help with the bloat problem. Quite the opposite: a lower fillfactor means you start with more pages, so deleting the majority of your rows gives you a bigger chance of ending up with sparse pages rather than partial ones.

Leaf page fillfactor is about optimizing for updates and inserts. It's not a solution for bloat caused by deletions or index-column updates.
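If you want to check what fillfactor an existing table or index uses, the per-relation settings are stored in pg_class.reloptions; NULL means the defaults apply. A quick check (demo_index is the index created just above):

SELECT relname, reloptions
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey', 'demo_index');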

Why the planner gets fooled 🔗

PostgreSQL's query planner estimates costs based on physical statistics, including the number of pages in an index.

EXPLAIN ANALYZE SELECT * FROM demo WHERE id BETWEEN 10001 AND 90000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
  Index Scan using demo_pkey on demo  (cost=0.29..29.29 rows=200 width=41) (actual time=0.111..0.112 rows=0 loops=1)
    Index Cond: ((id >= 10001) AND (id <= 90000))
  Planning Time: 1.701 ms
  Execution Time: 0.240 ms
(4 rows)

While the execution is almost instant, you need to look behind the scenes. The planner estimated 200 rows and got zero. It traversed the B-tree structure expecting data that doesn't exist. On a single query with a warm cache, this is trivial. Under production load, with thousands of queries and cold pages, you're paying I/O cost for nothing. Again and again.

If you dig further, you discover a much bigger problem.

SELECT relname, reltuples::bigint as row_estimate, relpages as page_estimate
FROM pg_class 
WHERE relname IN ('demo', 'demo_pkey');
relname  | row_estimate | page_estimate
-----------+--------------+---------------
demo      |        20000 |           934
demo_pkey |        20000 |           276

The relpages value comes from the physical file size divided by the 8 KB page size. PostgreSQL updates it during VACUUM and ANALYZE, but it reflects the actual file on disk - not how much useful data is inside. Our index file is still 2.2 MB (276 pages × 8 KB), even though most pages are empty.

The planner sees 276 pages for 20,000 rows and calculates a very low rows-per-page ratio. That's when the planner can come to a conclusion: this index is very sparse - let's do a sequential scan instead . Oops.

"But wait," you say, "doesn't ANALYZE fix statistics?"

Yes and no. ANALYZE updates the row count estimate: it will no longer think you have 100,000 rows, but 20,000. It does not shrink relpages, though, because that reflects the physical file size on disk - and ANALYZE can't change that.

The planner now has accurate row estimates but wildly inaccurate page estimates. The useful data is packed into just ~57 pages worth of entries, but the planner doesn't know that.

cost = random_page_cost × pages + cpu_index_tuple_cost × tuples

With a bloated index:

  • pages is oversized (276 instead of ~57)
  • The per-page cost is paid for empty pages as well
  • Total estimated cost is artificially high
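To put rough numbers on it, plug the values from our experiment into that simplified formula using PostgreSQL's default cost settings (random_page_cost = 4.0, cpu_index_tuple_cost = 0.005). This is back-of-the-envelope arithmetic, not the planner's exact math, which involves more factors:

-- bloated index:  4.0 * 276 pages + 0.005 * 20000 tuples = 1104 + 100 = 1204
-- rebuilt index:  4.0 *  57 pages + 0.005 * 20000 tuples =  228 + 100 =  328
SHOW random_page_cost;
SHOW cpu_index_tuple_cost;

Roughly a 3-4x difference in estimated cost, caused entirely by pages that contain nothing.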

The hollow index 🔗

We can dig even more into the index problem when we look at internal stats:

SELECT * FROM pgstatindex('demo_pkey');
-[ RECORD 1 ]------+--------
version            | 4
tree_level         | 1
index_size         | 2260992
root_block_no      | 3
internal_pages     | 1
leaf_pages         | 57
empty_pages        | 0
deleted_pages      | 217
avg_leaf_density   | 86.37
leaf_fragmentation | 0

Wait, what? The avg_leaf_density is 86%, which looks perfectly healthy. That's a trap. Because we hollowed out the index (removing 80% right in the middle), we're left with 57 well-packed leaf pages - but the index still contains 217 deleted pages.

This is why avg_leaf_density alone is misleading. The density of used pages looks great, but 79% of your index file is dead weight.

The simplest way to spot index bloat is comparing the actual size to an expected size - here approximated as roughly 40 bytes per live entry:

SELECT
    c.relname as index_name,
    pg_size_pretty(pg_relation_size(c.oid)) as actual_size,
    pg_size_pretty((c.reltuples * 40)::bigint) as expected_size,
    round((pg_relation_size(c.oid) / nullif(c.reltuples * 40, 0))::numeric, 1) as bloat_ratio
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relkind = 'i' 
  AND c.reltuples > 0
  AND c.relname NOT LIKE 'pg_%'
  AND pg_relation_size(c.oid) > 1024 * 1024  -- only indexes > 1 MB
ORDER BY bloat_ratio DESC NULLS LAST;
index_name | actual_size | expected_size | bloat_ratio
------------+-------------+---------------+-------------
demo_pkey  | 2208 kB     | 781 kB        |         2.8

A bloat_ratio of 2.8 means the index is nearly 3x larger than expected. Anything above 1.8 - 2.0 deserves investigation.

We filter to indexes over 1 MB - bloat on tiny indexes doesn't matter much. Adjust the threshold to your environment; on large databases, you might only care about indexes over 100 MB.

But here comes a BIG WARNING : pgstatindex(), which we used earlier, physically reads the entire index. On a 10 GB index, that's 10 GB of I/O. Don't run it against all indexes on a production server - unless you know what you are doing!

REINDEX 🔗

How do you actually fix the index bloat problem? REINDEX is a straightforward solution: it rebuilds the index from scratch.

REINDEX INDEX CONCURRENTLY demo_pkey;

After which we can check the index health:

SELECT * FROM pgstatindex('demo_pkey');
-[ RECORD 1 ]------+-------
version            | 4
tree_level         | 1
index_size         | 466944
root_block_no      | 3
internal_pages     | 1
leaf_pages         | 55
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 89.5
leaf_fragmentation | 0

And the relation sizes:

SELECT
    relname,
    pg_size_pretty(pg_relation_size(oid)) as file_size,
    pg_size_pretty((pgstattuple(oid)).tuple_len) as actual_data
FROM pg_class
WHERE relname IN ('demo', 'demo_pkey');
relname  | file_size | actual_data
-----------+-----------+-------------
demo      | 7472 kB   | 1278 kB
demo_pkey | 456 kB    | 313 kB

Our index shrank from 2.2 MB to 456 kB - a 79% reduction (not a big surprise, though).

As you might have noticed, we used CONCURRENTLY to avoid an ACCESS EXCLUSIVE lock. It has been available since PostgreSQL 12, and while you can omit it, pretty much the only reason to do so is during planned maintenance, to speed up the index rebuild.
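If several indexes on the same table are bloated, you don't have to rebuild them one by one. REINDEX TABLE CONCURRENTLY (same PostgreSQL 12+ requirement) rebuilds every index on the table in a single command:

REINDEX TABLE CONCURRENTLY demo;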

pg_squeeze 🔗

If you look at the file_size of our relations above, we managed to reclaim the disk space for the affected index (it was rebuilt from scratch, after all), but the table's space was not returned to the operating system.

That's where pg_squeeze shines. Unlike trigger-based alternatives, pg_squeeze uses logical decoding, resulting in lower impact on your running system. It rebuilds both the table and all its indexes online, with minimal locking:

CREATE EXTENSION pg_squeeze;

SELECT squeeze.squeeze_table('public', 'demo');

The exclusive lock is only needed during the final swap phase, and its duration can be configured. Even better, pg_squeeze is designed for regular automated processing - you can register tables and let it handle maintenance whenever bloat thresholds are met.
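Registration for that automated processing goes through a table in the squeeze schema. As a rough sketch only - the column set, and in particular the schedule format, varies between pg_squeeze versions, so verify against the README of the version you install:

-- Hypothetical registration: the schedule layout (minutes, hours, days of month,
-- months, days of week) is assumed here - check your pg_squeeze version's docs.
INSERT INTO squeeze.tables (tabschema, tabname, schedule)
VALUES ('public', 'demo', ('{30}', '{22}', NULL, NULL, NULL));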

pg_squeeze makes sense when both table and indexes are bloated, or when you want automated management. REINDEX CONCURRENTLY is simpler when only indexes need work.

There's also the older tool pg_repack - for a deeper comparison of bloat-busting tools, see The Bloat Busters: pg_repack vs pg_squeeze .

VACUUM FULL (The nuclear option) 🔗

VACUUM FULL rewrites the entire table and all of its indexes. While it fixes everything, it comes with a big caveat: it requires an ACCESS EXCLUSIVE lock, completely blocking all reads and writes for the entire duration. For a large table, this could mean hours of downtime.

Generally avoid this in production . Use pg_squeeze instead for the same result without the downtime.

When to act, and when to chill 🔗

Before you now go and REINDEX everything in sight, let's talk about when index bloat actually matters.

B-trees expand and contract with your data . With random insertions into index columns - UUIDs, hash keys, etc. - page splits happen constantly. Index density takes the occasional hit and tends to settle around 70 - 80% over the natural cycles of your system's usage. That's not bloat. That's the tree finding its natural shape for your data.

The bloat we demonstrated - 57 useful pages drowning in 217 deleted ones - is extreme. It came from deleting 80% of contiguous data. You won't see this from normal day to day operations.

When you need to act immediately:

  • after a massive DELETE (retention policy, GDPR purge, failed migration cleanup)
  • bloat_ratio exceeds 2.0 and keeps climbing
  • query plans suddenly prefer sequential scans on indexed columns
  • index size is wildly disproportionate to row count

But in most cases you don't have to panic. Monitor weekly, and when an index's bloat ratio keeps climbing above warning levels, schedule a REINDEX CONCURRENTLY during a low-traffic period.

Index bloat isn't an emergency until it is. Know the signs, have the tools ready, and don't let VACUUM's silence fool you into thinking everything's fine.

Conclusion 🔗

VACUUM is essential for PostgreSQL. Run it. Let autovacuum do its job. But understand its limitations: it cleans up dead tuples, not index structure.

The truth about PostgreSQL maintenance is that VACUUM handles heap bloat reasonably well, but index bloat requires explicit intervention. Know when your indexes are actually sick versus just breathing normally - and when to reach for REINDEX.

VACUUM handles heap bloat. Index bloat is your problem. Know the difference.

Junichi Uekawa: I was wondering if there was some debian thread and noticed maybe something is broken in my mail setup.

PlanetDebian
www.netfort.gr.jp
2025-12-13 01:41:02
I was wondering if there was some debian thread and noticed maybe something is broken in my mail setup. The amount of emails I am receiving seems to be very small. ...
Original Article

13 Dec 2025 (Sat)

10:41:02 # Life I was wondering if there was some debian thread and noticed maybe something is broken in my mail setup. The amount of emails I am receiving seems to be very small.

In Cover-Up, Laura Poitras Investigates Seymour Hersh

Portside
portside.org
2025-12-13 01:33:05
In Cover-Up, Laura Poitras Investigates Seymour Hersh barry Fri, 12/12/2025 - 20:33 ...
Original Article

Laura Poitras, the journalist and documentary filmmaker, has the rare distinction of having won a Pulitzer Prize (for her reporting on the National Security Agency and Edward Snowden), an Academy Award (for Citizenfour , about Snowden), and the Venice Film Festival’s Golden Lion (for All the Beauty and the Bloodshed , about the artist and activist Nan Goldin and the opioid crisis). Her new film, Cover-Up , examines the career of Seymour M. Hersh—known for his investigative journalism on My Lai and Abu Ghraib, as well as other, more contested scoops—using archival material and exclusive access to Hersh’s notes to probe how accountability journalism is made.

Cover-Up , which opens Friday in theaters in select cities ahead of its Netflix premiere on December 26, returns Poitras to the familiar intersection of government secrecy, source protection, and the journalistic imperative to reveal what powerful institutions want hidden. Codirected with Mark Obenhaus, the film is a political thriller that offers a candid and illuminating look at Hersh’s career and methods, what makes him tick, and why investigative journalism matters today. Our conversation, which took place at the New York office of Poitras’s production company Praxis Films, has been edited for length and clarity.

AJG: Let me start by observing that Cover-Up is a valuable addition to the small canon of journalism movies. Yet I see it as more closely in conversation with fiction films than with other documentaries.

LP: It’s interesting that you reference that, because we definitely were talking about fiction films when we began—we being myself, Amy Foote, and Peter Bowman, the editors—in terms of how we were going to approach its feel and aesthetics. And the films that we were referring to were the 1970s paranoia thrillers, brilliant films that were very skeptical and critical of the state and state power. Alan J. Pakula was at the top of the list. All the President’s Men and The Parallax View were the ones that we constantly referred to. I make nonfiction, but I’m often making films that deal with threats and dangers of the state. So it’s not like I’m stealing from the genre of fiction, but rather, I think fiction steals from real life.

This is a long-gestating project. You’ve said that you first thought about training your camera on Seymour Hersh after reading his Abu Ghraib coverage in 2004. Tell me more about that.

Twenty years ago, when I was preparing to go to Iraq, where I spent eight months documenting the US occupation and the war, I felt very strongly that we were living in a landscape where legacy journalism was failing the public in terms of its coverage of the lead-up to this war, its coverage of the Bush era, the “war on terror,” and how it was reporting on Guantánamo Bay prison and torture. All this nightmarish stuff was happening that we knew was happening, and by and large, legacy media was copying the government’s press releases, even to the point of respected news organizations having editorial guidelines not to use the word “torture” to describe the CIA’s torture. It was kind of staggering, and Sy was doing something very different. He was reporting at The New Yorker and asking: What is actually going on? Why are we going to Iraq? And he was saying there was no connection between the 9/11 attacks and Al Qaeda and Iraq, but he was drawing the connection between Dick Cheney and Halliburton and all the money that was there and using the sort of emergency laws that emerged after 9/11 to get in all these policies that the right had been dreaming of for years.

The Abu Ghraib story broke in April 2004, and I traveled to Iraq about a month later. I had already made the decision to go, but then when I saw those photographs, it was just a level of horror that I could not imagine existed. The film that I ended up making there, My Country, My Country, sort of began at Abu Ghraib because I managed to talk my way into the prison that summer when Iraqis were inspecting it because it was such an international scandal.

When I came back, I reached out to Sy and we met. I sort of laugh about entering his office. There should have been Twilight Zone music playing, because it really was like going back in time, with all the yellow notepads that you see in the film. It was like time had stopped in the 1970s in that office.

What was your idea for Cover-Up in 2005?

Back then, I was proposing to make a film that would follow him in real time, so more observational: Sy meeting sources or in editorial meetings at The New Yorker, throwing things at the editors and threatening to quit, which Amy Davidson Sorkin joked he would do on a regular basis.

He entertained the idea, but after I left, he called me, and he was like: No way. I can’t risk my sources. You know: My sources are too sensitive, and there’s no way a camera can be around. So it was a hard no, but a very gracious hard no. But we stayed in touch.

And nearly twenty years later, Sy tells you he’s ready for his close-up. Why do you think that was?

He certainly was aware of the reporting I did with Edward Snowden. I think he felt that I was doing something that he felt some kind of kinship to in terms of being a bit of an outsider, a bit of a thorn in the side of the government.

I know that he and his wife, Liz, saw All the Beauty and the Bloodshed . I think what maybe resonated with them—without speaking for them, which makes me a little bit nervous—is that it’s a portrait of Nan but also a larger critique of social structures and systems. And this is what I’m trying to do in all my films. I’m not interested in making biopics, but I do tell stories about individuals who are confronting power structures.

Did you go into the film with a clear sense of the story you wanted to tell?

From the very beginning, we were interested in Hersh’s reporting, but also in the patterns we could see across half a century, and particularly around atrocities, cover-ups, impunity, and the role of investigative journalism in circumventing that cycle.

I think one of the reasons why we made this film was because we felt there’s a crisis in investigative journalism, because it’s hard, it’s costly, often comes with legal threats, and often takes a lot of time, and that’s harder to do if you don’t have the architecture to support it.

Did Sy have veto power?

Sy didn’t ask for editorial control, but we did update him to make sure that we weren’t missing things. But he wasn’t difficult. Once he had finally relented to be part of this project, he was just all in.

How did you decide when to get personal?

I was guided by a few things. One was what informed his reporting and what was his motivation, and that’s what we definitely felt about his growing up in Chicago, with his parents being immigrants coming from Eastern Europe, the silence in the house, his dad dies young, and he’s asked to take over the family’s dry-cleaning shop, and nobody was giving him opportunities. And then he sort of stumbled into journalism and found his passion and his love for truth-telling. And so of course, that needed to be in the film. But as I mentioned, I’m definitely very anti-biopic, and I’m also very anti-experts. The people we talked to had to have direct knowledge of the reporting, like Amy Davidson Sorkin, who was his editor at The New Yorker on Abu Ghraib.

There’s a lot of talk these days about bias in the media and how journalists need to be neutral. Can you do investigative journalism from a position of neutrality?

I absolutely believe in certain principles in journalism, like that it should be fact-driven and it should be interrogating power, disclosing any conflicts of interest, if they exist. But I also think that we need to use words to describe what’s happening, and so going back to the Bush era, when news organizations didn’t use the word torture to describe torture, that is lying. It’s not a neutral position, it’s a position that is aligning with the nation-state and asking the press to capitulate to whatever that agenda is. And I think that that’s really dangerous, because you lose trust. So if we’re talking about what’s happening in Gaza, I think we have to use the word “genocide,” because if you look at the evidence of what’s there, that’s what we’re seeing. I don’t think that that’s biased. That is looking at two years of dropping American-taxpayer-funded bombs on a population.

What’s interesting about Sy’s body of work and his career is that his big stories are evidentiary. They present evidence that shows atrocities. We’re talking about the My Lai massacre or Abu Ghraib torture, CIA surveillance on protest movements or involvement in Chile and coups all over the world. In his best stories, he delivers the facts. But he’s never been quiet about his worldview and saying that he was against the Vietnam War and that it was a catastrophe.

I’m sure some people could watch Cover-Up and say, clearly, you’re not neutral. While not a hagiography, the film clearly celebrates Hersh’s achievements. I can imagine someone else doing a film about Sy where he’s this muckraker who’s out to make America look bad.

One of the things that speaks most highly about Sy’s body of work is that regardless of what administration has been in power, he’s gotten under their skin. He went after JFK, he went after Johnson, he went after Nixon, he went after Reagan, Carter, Obama, Biden, and now Trump. I believe in that kind of equal-opportunity adversarial journalism.

And about the hagiography thing you raised: it was important in the film to also include times when Sy got it wrong. Because we always felt like it was our job to talk about the times when he got it wrong or got played or got too close to power. And those mistakes happen in the field of journalism. Probably Mark and I had more of an obligation than most to ask about some of the stories where he made mistakes, because we knew him well and felt close to his body of work.

Was Sy reluctant to discuss his mistakes?

He didn’t exactly welcome it, but he was fine. I mean, ultimately, he would have had zero respect for us if we didn’t, which doesn’t mean that those were his favorite days.

Do you think it’s become more difficult to get the truth out in this age of extreme polarization? Part of me even wonders whether a My Lai–style report published today would have the sort of impact it did fifty-five years ago.

I refuse the notion that we’re in a post-fact world. I believe that people are very aware of if they can’t pay rent or afford healthcare or education for their kids. Those are facts that people understand. Yes, some trust has been eroded. And I think it’s been eroded by the public being lied to by our governments and by the press sometimes. But I’m not willing to concede that we shouldn’t care about what’s happening in the world, or that people don’t care. I mean, journalists in Gaza are dying every day to get out information about what’s happening. I think they are reaching the public. Whether or not they’re actually causing governments to change is the real problem.

Do you consider Cover-Up a hopeful film?

I don’t know if “hope” is the right word. You know, all of my films have protagonists that are really getting under the skin of power, whether that’s government or corporate power. And that offers the idea that it’s possible that an individual or small group of people can change how we understand the world. That’s a powerful message when people are feeling a lot of despair.


[$] The state of the kernel Rust experiment

Linux Weekly News
lwn.net
2025-12-13 01:19:08
The ability to write kernel code in Rust was explicitly added as an experiment — if things did not go well, Rust would be removed again. At the 2025 Maintainers Summit, a session was held to evaluate the state of that experiment, and to decide whether the time had come to declare the result to be a...

The Checkerboard

Hacker News
99percentinvisible.org
2025-12-13 00:50:47
Comments...
Original Article

In 2019, hunters Brad Cape and Phil Yeomans were scouting for elk in southeast Wyoming when they came across a rocky peak that seemed perfect for elk hunting, a suspicion only heightened by its name: Elk Mountain. But finding a way onto Elk Mountain would turn out to be extremely difficult, and whether Brad and Phil succeeded would have lasting consequences for the future of land use everywhere in the U.S. because the single largest obstacle preventing the hunters from making it onto the mountain wasn’t the elevation or the topography. It was that the mountain was on a special type of land known as “the checkerboard”.

The checkerboard is a pattern of land ownership, unique to the American West, found in huge areas from New Mexico all the way up to Washington. On a map, these particular areas resemble a checkerboard, but instead of alternating black and white squares, checkerboarded land alternates between single square-mile parcels of public land and square mile parcels of private land.

In Railroaded: The Transcontinentals and the Making of Modern America , the Stanford historian Richard White explains that  the checkerboard was created at the tail-end of the Civil War, when the U.S. government gave the railroad companies long corridors of land—up to eighty miles wide—on which to build new rail lines and encourage westward migration. But almost all of this land was given away in alternating, one-square-mile sections. This checkerboard pattern allowed the government to keep all the undeveloped sections in between and wait for them to go up in value before turning around and selling them to developers. Most checkerboarded land today, regardless of who owns the private squares now, is descended from these initial railroad grants.

But the checkerboard would pose a problem for Brad and Phil. You can’t pass through private property without the landowner’s permission, so the public squares in the checkerboard are often very difficult to access, and Elk Mountain was no different. The private half of the checkerboard belonged to a ranch, and the ranch’s owner, a billionaire pharmaceutical executive, wasn’t allowing strangers to cross his land. So when they came back to hunt in the area in 2020, Brad and Phil and some other hunting buddies decided to try something called corner crossing.

To understand corner crossing, think about a literal checkerboard. In a checkers game, a piece that starts on black needs to stay on black. So the pieces only ever make diagonal movements, crossing from the corner of one square to another. Moving through checkerboarded land works in the same way. To avoid the ranch’s property, all Brad and Phil and the others had to do was move around like a checkers piece. They’d start on public land and then make sure to stay on public land, by crossing into new squares diagonally, at the corners where all those public squares touch.

The hunters hiked from a public road towards the checkerboard’s nearest approachable corner, where they found two no-trespassing signs, along with a couple of posts with a chain strung between them, obstructing the one spot where they could legally cross. So they grabbed hold of the top of the posts and swung their feet around, making absolutely sure they didn’t touch private property. From that point on, they stayed entirely on public land inside the checkerboard, corner crossing from one public square to another as they hunted for elk on Elk Mountain.

But in the middle of their hunt, a manager for the ranch approached them and insisted that touching the ranch’s posts counted as trespassing. So when they came back to hunt Elk Mountain the next year, Brad brought a ladder that unfolded to a specific height, length and width, allowing the hunters to go right over the t-posts and across the corner, all without ever touching the ranch’s property.

But this didn’t placate the ranch’s owner. He had the ranch’s manager keep contacting the authorities until eventually the county attorney charged the hunters with criminal trespass. The chance of jail time was slim, so the hunters could have ended things there by paying a small fine and promising to stay away from Elk Mountain and go hunt elk somewhere else. But the hunters believed the public should have the right to access public land—including in the checkerboard. So instead of paying the fine, the hunters decided to fight the case.

The resulting five-year legal battle, which grew to include two criminal charges and a multimillion-dollar civil case, revolved around the central question of whether corner crossing is or should be legal, and with it, effectively who really controlled millions of acres of public land. Along the way, the stakes attracted private landowners, public land users, lobbying groups on both sides of the divide, and the national media. Eventually the case landed before the U.S. Tenth Circuit Court of Appeals. The court ruled in favor of the hunters , saying that the public was owed its half of the deal that the government had struck with the railroads a century and a half earlier.

The Tenth Circuit’s decision won’t bring total closure. Its decision only affects six western states, and the U.S. Supreme Court refused to take up the case, which means that, for now, the status of corner crossing and public land access in the other 44 states remains murky. It’s unlikely Brad and Phil will be involved in whatever comes next. One thing is for sure though: they’re eager to go back to hunt Elk Mountain.

OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI

Hacker News
simonwillison.net
2025-12-12 23:30:19
Comments...
Original Article

12th December 2025

One of the things that most excited me about Anthropic’s new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just a folder with a Markdown file and some optional extra resources and scripts, so any LLM tool with the ability to navigate and read from a filesystem should be capable of using them. It turns out OpenAI are doing exactly that, with skills support quietly showing up in both their Codex CLI tool and now also in ChatGPT itself.

Skills in ChatGPT

I learned about this from Elias Judin this morning. It turns out the Code Interpreter feature of ChatGPT now has a new /home/oai/skills folder which you can access simply by prompting:

Create a zip file of /home/oai/skills

I tried that myself and got back this zip file . Here’s a UI for exploring its content ( more about that tool ).

Screenshot of file explorer. Files skills/docs/render_docsx.py and skills/docs/skill.md and skills/pdfs/ and skills/pdfs/skill.md - that last one is expanded and reads: # PDF reading, creation, and review guidance  ## Reading PDFs - Use pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME to convert PDFs to PNGs. - Then open the PNGs and read the images. - pdfplumber is also installed and can be used to read PDFs. It can be used as a complementary tool to pdftoppm but not replacing it. - Only do python printing as a last resort because you will miss important details with text extraction (e.g. figures, tables, diagrams).  ## Primary tooling for creating PDFs - Generate PDFs programmatically with reportlab as the primary tool. In most cases, you should use reportlab to create PDFs. - If there are other packages you think are necessary for the task (eg. pypdf, pyMuPDF), you can use them but you may need topip install them first. - After each meaningful update—content additions, layout adjustments, or style changes—render the PDF to images to check layout fidelity:   - pdftoppm -png $INPUT_PDF $OUTPUT_PREFIX - Inspect every exported PNG before continuing work. If anything looks off, fix the source and re-run the render → inspect loop until the pages are clean.  ## Quality expectations - Maintain a polished, intentional visual design: consistent typography, spacing, margins, color palette, and clear section breaks across all pages. - Avoid major rendering issues—no clipped text, overlapping elements, black squares, broken tables, or unreadable glyphs. The rendered pages should look like a curated document, not raw template output. - Charts, tables, diagrams, and images must be sharp, well-aligned, and properly labeled in the PNGs. Legends and axes should be readable without excessive zoom. - Text must be readable at normal viewing size; avoid walls of filler text or dense, unstructured bullet lists. Use whitespace to separate ideas. - Never use the U+2011 non-breaking hyphen or other unicode dashes as they will not be

So far they cover spreadsheets, docx and PDFs. Interestingly their chosen approach for PDFs and documents is to convert them to rendered per-page PNGs and then pass those through their vision-enabled GPT models, presumably to maintain information from layout and graphics that would be lost if they just ran text extraction.

Elias shared copies in a GitHub repo . They look very similar to Anthropic’s implementation of the same kind of idea, currently published in their anthropics/skills repository.

I tried it out by prompting:

Create a PDF with a summary of the rimu tree situation right now and what it means for kakapo breeding season

Sure enough, GPT-5.2 Thinking started with:

Reading skill.md for PDF creation guidelines

Then:

Searching rimu mast and Kākāpō 2025 breeding status

It took just over eleven minutes to produce this PDF , which was long enough that I had Claude Code for web build me a custom PDF viewing tool while I waited.

Here’s ChatGPT’s PDF in that tool .

Screenshot of my tool. There is a URL at the top, a Load PDF button and pagination controls. Then the PDF itself is shown, which reads: Rimu mast status and what it means for the kākāpō breeding season Summary as of 12 December 2025 (Pacific/Auckland context) Kākāpō breeding is tightly linked to rimu (Dacrydium cupressinum) mast events: when rimu trees set and ripen large amounts of fruit, female kākāpō are much more likely to nest, and more chicks can be successfully raised. Current monitoring indicates an unusually strong rimu fruiting signal heading into the 2025/26 season, which sets the stage for a potentially large breeding year in 2026.^1,2 Key numbers at a glance Kākāpō population (official DOC count) 237 birds alive Breeding trigger (rimu fruiting)>10% of rimu branch tips bearing fruit Forecast rimu fruiting for 2026 (DOC monitoring) Around 50–60% fruiting across breeding islands¹Breeding-age females (DOC 2025 planning figure)About 87 females (potentially nearly all could nest)

(I am very excited about Kākāpō breeding season this year .)

The reason it took so long is that it was fastidious about looking at and tweaking its own work. I appreciated that at one point it tried rendering the PDF and noticed that the macrons in kākāpō were not supported by the chosen font, so it switched to something else:

ChatGPT screenshot. Analyzed image. There's an image of a page of PDF with obvious black blocks on some of the letters in the heading. It then says: Fixing font issues with macrons. The page is showing black squares for words like "kākāpō," probably because Helvetica can't handle macrons. I'll switch to a font that supports them, such as DejaVu Sans or Noto Sans. I'll register both regular and bold fonts, then apply them to the document. I'll update the footer to note the issue with Helvetica. Time to rebuild the PDF!

Skills in Codex CLI

Meanwhile, two weeks ago OpenAI’s open source Codex CLI tool landed a PR titled feat: experimental support for skills.md . The most recent docs for that are in docs/skills.md .

The documentation suggests that any folder in ~/.codex/skills will be treated as a skill.

I used Claude Opus 4.5’s skill authoring skill to create this skill for creating Datasette plugins , then installed it into my Codex CLI skills folder like this:

git clone https://github.com/datasette/skill \
  ~/.codex/skills/datasette-plugin

You have to run Codex with the --enable skills option. I ran this:

cd /tmp
mkdir datasette-cowsay
cd datasette-cowsay
codex --enable skills -m gpt-5.2

Then prompted:

list skills

And Codex replied:

- datasette-plugins — Writing Datasette plugins using Python + pluggy (file: /Users/simon/.codex/skills/datasette-plugin/SKILL.md)
- Discovery — How to find/identify available skills (no SKILL.md path provided in the list)

Then I said:

Write a Datasette plugin in this folder adding a /-/cowsay?text=hello page that displays a pre with cowsay from PyPI saying that text

It worked perfectly! Here’s the plugin code it wrote and here’s a copy of the full Codex CLI transcript , generated with my terminal-to-html tool .

You can try that out yourself if you have uvx installed like this:

uvx --with https://github.com/simonw/datasette-cowsay/archive/refs/heads/main.zip \
  datasette

Then visit:

http://127.0.0.1:8001/-/cowsay?text=This+is+pretty+fun

Screenshot of that URL in Firefox, an ASCII art cow says This is pretty fun.

Skills are a keeper

When I first wrote about skills in October I said Claude Skills are awesome, maybe a bigger deal than MCP . The fact that it’s just turned December and OpenAI have already leaned into them in a big way reinforces to me that I called that one correctly.

Skills are based on a very light specification, if you could even call it that, but I still think it would be good for these to be formally documented somewhere. This could be a good initiative for the new Agentic AI Foundation ( previously ) to take on.

Apple fixes two zero-day flaws exploited in 'sophisticated' attacks

Bleeping Computer
www.bleepingcomputer.com
2025-12-12 23:23:25
Apple has released emergency updates to patch two zero-day vulnerabilities that were exploited in an "extremely sophisticated attack" targeting specific individuals. [...]...
Original Article

Apple

Apple has released emergency updates to patch two zero-day vulnerabilities that were exploited in an “extremely sophisticated attack” targeting specific individuals.

The zero-days are tracked as CVE-2025-43529 and CVE-2025-14174 and were both issued in response to the same reported exploitation.

"Apple is aware of a report that this issue may have been exploited in an extremely sophisticated attack against specific targeted individuals on versions of iOS before iOS 26," reads Apple's security bulletin .

CVE-2025-43529 is a WebKit use-after-free remote code execution flaw that can be exploited by processing maliciously crafted web content. Apple says the flaw was discovered by Google’s Threat Analysis Group.

CVE-2025-14174 is a WebKit flaw that could lead to memory corruption. Apple says the flaw was discovered by both Apple and Google’s Threat Analysis Group.

Devices impacted by both flaws include:

  • iPhone 11 and later

  • iPad Pro 12.9-inch (3rd generation and later)

  • iPad Pro 11-inch (1st generation and later)

  • iPad Air (3rd generation and later)

  • iPad (8th generation and later)

  • iPad mini (5th generation and later)

On Wednesday, Google fixed a mysterious zero-day flaw in Google Chrome, initially labeling it as “[N/A][466192044] High: Under coordination.”

However, Google has now updated the advisory to identify the bug as “CVE-2025-14174: Out-of-bounds memory access in ANGLE,” which is the same CVE fixed by Apple, indicating coordinated disclosure between the two companies.

Apple has not disclosed technical details about the attacks beyond saying they targeted individuals running versions of iOS before iOS 26.

As both flaws affect WebKit, which Google Chrome uses on iOS, the activity is consistent with highly targeted spyware attacks.

While these flaws were only exploited in targeted attacks, users are strongly advised to install the latest security updates promptly to reduce the risk of ongoing exploitation.

With these fixes, Apple has now patched seven zero-day vulnerabilities that were exploited in the wild in 2025, beginning with CVE-2025-24085 in January , CVE-2025-24200 in February , CVE-2025-24201 in March , and two more in April (CVE-2025-31200 and CVE-2025-31201).

In September, Apple also backported a fix for a zero-day tracked as CVE-2025-43300 to older devices running iOS 15.8.5 / 16.7.12 and iPadOS 15.8.5 / 16.7.12.


Apple at the AWS re:Invent 2025 Keynote

Daring Fireball
www.youtube.com
2025-12-12 23:22:39
Six-minute clip from Amazon’s AWS re:Invent keynote last week: Payam Mirrashidi, VP, Cloud Systems & Platforms, Apple, explains how AWS Graviton helps improve developer velocity at scale. Hear Swift’s journey from the premier programming language for the Apple ecosystem to adoption by millions of...

The Raise the Age Law Is Not Actually Turning NYC Into a Wild Teen Hellhole, According to City Data

hellgate
hellgatenyc.com
2025-12-12 22:36:01
The next front in the war to roll back criminal justice reform was supposed to be juveniles. With a new mayor and a new study casting doubt on the premise, is it still?...
Original Article

This week, the Mayor's Office of Criminal Justice quietly dropped a new report indicating, once again , that the 2018 Raise the Age Law that moved 16- and 17-year-olds out of adult court and increased the age of criminal responsibility in New York state to 18 is not creating a "consequence-free" youth crime wave, as NYPD Commissioner Jessica Tisch has previously said .

According to the report , in 2024, the youth share of citywide felony and violent felony arrests was the same as it was in 2018, and recidivism was stable or decreasing in most categories. "In short, adults, not teens, have disproportionately contributed to the post‑2018 rise in felony arrests," the report says.

The exception is gun arrests for those under the age of 18, which increased by 136 percent since 2018, to 486 arrests in 2024. "While comparable figures for adults are not available, the increase in youth-specific gun incidents suggests a rising exposure to firearms for this group," the report adds.


Can I use HTTPS RRs?

Hacker News
www.netmeister.org
2025-12-12 22:34:13
Comments...
Original Article

December 12th, 2025

RFC9460 , defining SVCB and HTTPS Resource Records, was published in November of 2023. Two years later, however, support for these DNS records is still far from universal. Add to that the fact that the RFC defines a number of SvcParamKeys , which browsers support to different degrees and where developers disagree about the proper behavior, and you end up with no clear picture of which browsers support these records to which end.

Unfortunately, even the otherwise ever so useful https://caniuse.com/ does not provide that information, although there's a feature request . To be able to answer the question quickly for the core features, I ran a few tests to see to what extent the three most popular browsers support these records. (Jump to the table below if all you care about is that list.)

AliasMode / TargetName

Support for this mode is important for anybody looking to implement aliasing of apex domains. In its simplest form, it would like this:

$ host -t https https.dotwtf.wtf
https.dotwtf.wtf has HTTP service bindings 0 www.dotwtf.wtf.
$ host alias.https.dotwtf.wtf
alias.https.dotwtf.wtf has HTTP service bindings 0 www.dotwtf.wtf.
$ 

The first is an apex alias, the second a simple AliasMode non-apex record. Neither name has an A or AAAA record. The expected behavior here is that the browser follows the TargetName (www.dotwtf.wtf) and connects there, with an SNI of https.dotwtf.wtf or alias.https.dotwtf.wtf respectively.

ALPN

The Application-Layer Protocol Negotiation (ALPN) parameter allows clients to immediately connect to the destination server using the right protocol, avoiding additional round-trips. See this explanation for more details.

$ host alpn-h3.https.dotwtf.wtf
alpn-h3.https.dotwtf.wtf has address 166.84.7.99
alpn-h3.https.dotwtf.wtf has IPv6 address 2602:f977:800:0:e276:63ff:fe72:3900
alpn-h3.https.dotwtf.wtf has HTTP service bindings 1 . alpn="h3,h2"
$ 

The expected behavior here is that the client will immediately make an H3 (i.e., QUIC) connection.

ECH

This parameter is used for Encrypted Client Hello , providing the encryption public key and associated metadata needed by the client to construct the ClientHelloOuter .

$ dig +short https tls-ech.dev
1 . ech=AEn+DQBFKwAgACABWIHUGj4u+PIggYXcR5JF0gYk3dCRioBW8uJq9H4mKAAIAAEAAQABAANAEnB1YmxpYy50bHMtZWNoLmRldgAA
$ 

On the wire, this shows up as the ECH extension in the TLS ClientHello.

IP Hints

The RFC defines ipv4hint and ipv6hint parameters, but their usefulness remains opaque to me. A client MAY use the hints, but still has to perform the lookup and then enter the results in the cache. That is, in effect the only time hints are used is to cut down the time to first byte, but even that is a "MAY", not even a "SHOULD".

This also leads to some confusion amongst implementers and users when the service name has no A / AAAA records.

$ dig +short https iphints.https.dotwtf.wtf
1 . ipv4hint=166.84.7.99 ipv6hint=2602:f977:800:0:e276:63ff:fe72:3900
$ host iphints.https.dotwtf.wtf
$ 

The expectation here is that the client will use the hints to connect to the service name, although there appears to be disagreement on whether a service name has to have IP addresses outside of the hints.

There are a million other scenarios: the authority endpoint having a different set of IPs, cache expiration, CNAME handling, conflicts if the authority endpoint itself has a different HTTPS record with different IP hints, and so on and so on.

I didn't check all permutations here, but I did check which IPs the browsers will use if they get conflicting results back:

$ for r in A AAAA HTTPS; do
> dig +short $r wrong-iphints.https.dotwtf.wtf
> done
198.51.100.188
2001:db8::8c93:2c23:262f:6ffb
1 . ipv4hint=127.0.0.1 ipv6hint=2001:db8::1
$ 

Port

This is straightforward:

$ dig +short https port4343.https.dotwtf.wtf
1 . port=4343
$ 

The expectation here is that the client will make a TLS connection to port 4343. (Note: even if we specified port 80 here, the client should still make a TLS connection.)
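
Independent of any browser, the server side of this test can be checked with a plain TLS client; the sketch below assumes the name also has ordinary A/AAAA records to connect to:

$ openssl s_client -brief -connect port4343.https.dotwtf.wtf:4343 -servername port4343.https.dotwtf.wtf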

Browser Capabilities

Note: Chrome will not perform HTTPS lookups if alternate resolvers are configured.

Note: Firefox will not perform HTTPS lookups unless DoH is enabled.

Feature                     Chrome   Firefox   Safari   Notes
AliasMode                                                Firefox Bug, Chromium issue
ALPN                                                     previously discussed
ECH
IP Hints (no A / AAAA)                                   Firefox bug, Chromium issue
IP Hints (with A / AAAA)                                 Firefox / Safari race results
Port                                                     Chromium issue

Last updated: 2025-12-12

All tests were run on macOS Sequoia 15.7.2 using:

  • Google Chrome 143.0.7499.41
  • Mozilla Firefox 146.0
  • Safari 26.1 (20622.2.11.119.1)

December 12th, 2025


Links:

Wine 11.0 RC2 – Run Windows Applications on Linux, BSD, Solaris and macOS

Hacker News
gitlab.winehq.org
2025-12-12 22:05:04
Comments...

Show HN: Tiny VM sandbox in C with apps in Rust, C and Zig

Hacker News
github.com
2025-12-12 22:02:14
Comments...
Original Article

🌱 uvm32

uvm32 is a minimalist, dependency-free virtual machine sandbox designed for microcontrollers and other resource-constrained devices. Single C file, no dynamic memory allocations, asynchronous design, pure C99.

On an STM32L0 (ARM Cortex-M0+) the required footprint is under 4KB flash/1KB RAM.

uvm32 is a RISC-V emulator, wrapped in a management interface and provided with tools to build efficient code to run in it.

What is it for?

  • As a no-frills alternative to embedded script engines (Lua, Duktape, MicroPython, etc.)
  • As a sandbox to isolate untrusted or unreliable elements of a system
  • As a way to allow development in modern systems programming languages where a compiler for the target may not be available (rust-hello)
  • As a way to write once, run anywhere and avoid maintaining multiple software variants

Features

  • Bytecode example apps written in C, Zig, Rust and assembly
  • Non-blocking design, preventing misbehaving bytecode from stalling the host
  • No assumptions about host IO capabilities (no stdio)
  • Simple, opinionated execution model
  • Safe minimally typed FFI
  • Small enough for "if this then that" scripts/plugins, capable enough for much more
  • Aims for safety over speed: bad code running in the VM should never be able to crash the host

Although based on a fully fledged CPU emulator, uvm32 is intended for executing custom script-like logic, not for simulating hardware.

How does it compare to the alternatives?

Many scripting languages and virtual machines are available for embedding in small systems and they all make tradeoffs in different dimensions.

uvm32 aims for:

  • Small footprint (suitable for embedded devices, games and apps)
  • Support well-known programming languages for VM code (with high quality dev tools)
  • Ease of integration into existing software
  • Flexibility of paradigm (event driven, polling, multi-processor)
  • Robustness against misbehaving VM code

uvm32 does not aim for:

  • Frictionless FFI (no direct function calls between host and VM code)
  • Maximum possible efficiency
  • The simplest scripting experience for VM code (a develop-compile-run cycle is expected)
  • "Batteries included" libraries to do stdio, networking, etc

Understanding this repo

uvm32 is a tiny virtual machine, all of the code is in uvm32 .

A minimal example of a host to run code in is at host-mini .

Everything else is a more advanced host example, or a sample application which could be run in a host.

Example

A simple VM host from host-mini

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "uvm32.h"
#include "uvm32_common_custom.h"

uint8_t rom[] = { // mandel.bin
  0x23, 0x26, 0x11, 0x00, 0xef, 0x00, 0xc0, 0x00, 0xb7, 0x08, 0x00, 0x01,
  ...
  ...
};

int main(int argc, char *argv[]) {
    uvm32_state_t vmst;
    uvm32_evt_t evt;
    bool isrunning = true;

    uvm32_init(&vmst);
    uvm32_load(&vmst, rom, sizeof(rom));

    while(isrunning) {
        uvm32_run(&vmst, &evt, 100);   // num instructions before vm considered hung

        switch(evt.typ) {
            case UVM32_EVT_END:
                isrunning = false;
            break;
            case UVM32_EVT_SYSCALL:    // vm has paused to handle UVM32_SYSCALL
                switch(evt.data.syscall.code) {
                    case UVM32_SYSCALL_PUTC:
                        printf("%c", uvm32_arg_getval(&vmst, &evt, ARG0));
                    break;
                    case UVM32_SYSCALL_PRINTLN: {
                        const char *str = uvm32_arg_getcstr(&vmst, &evt, ARG0);
                        printf("%s\n", str);
                    } break;
                    case UVM32_SYSCALL_YIELD:
                    break;
                    default:
                        printf("Unhandled syscall 0x%08x\n", evt.data.syscall.code);
                    break;
                }
            break;
            case UVM32_EVT_ERR:
                printf("UVM32_EVT_ERR '%s' (%d)\n", evt.data.err.errstr, (int)evt.data.err.errcode);
            break;
            default:
            break;
        }
    }

    return 0;
}
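
As a variation on the above, here is a sketch of a host that loads the bytecode image from a file instead of embedding it. It reuses only the uvm32 calls and event constants shown in the host-mini example; the file loading, the minimal error handling, and the 100-instruction budget are my own choices, not part of the library.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include "uvm32.h"
#include "uvm32_common_custom.h"

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s program.bin\n", argv[0]);
        return 1;
    }

    /* Read the whole bytecode image into memory. */
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    uint8_t *rom = malloc((size_t)len);
    if (!rom || fread(rom, 1, (size_t)len, f) != (size_t)len) {
        fprintf(stderr, "failed to read %s\n", argv[1]);
        return 1;
    }
    fclose(f);

    uvm32_state_t vmst;
    uvm32_evt_t evt;
    uvm32_init(&vmst);
    uvm32_load(&vmst, rom, (size_t)len);

    for (;;) {
        uvm32_run(&vmst, &evt, 100);   /* run a bounded slice of instructions, then return */
        if (evt.typ == UVM32_EVT_END)
            break;
        if (evt.typ == UVM32_EVT_ERR) {
            fprintf(stderr, "UVM32_EVT_ERR '%s' (%d)\n",
                    evt.data.err.errstr, (int)evt.data.err.errcode);
            break;
        }
        if (evt.typ == UVM32_EVT_SYSCALL &&
            evt.data.syscall.code == UVM32_SYSCALL_PUTC)
            putchar((int)uvm32_arg_getval(&vmst, &evt, ARG0));
        /* other syscalls (PRINTLN, YIELD, ...) would be handled as in the example above */
    }

    free(rom);
    return 0;
}

Compared to the embedded-ROM example, only the source of the bytecode changes; the uvm32 calls themselves are identical.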

Samples

Quickstart (docker)

The code in uvm32 to build a VM host is very portable and requires only a C compiler. However, many of the examples provided show how to build target code with different languages and tools. A Dockerfile is provided to set up the required environment.

make dockerbuild
make dockershell

Then, from inside the docker shell

make

./hosts/host/host apps/helloworld/helloworld.bin

host is the command line test VM for running samples. Run host -h for a full list of options.

More information

The best source of information is the header file uvm32/uvm32.h and the tests.

Also see doc/README.md

License

This project is licensed under the MIT License. Feel free to use in research, products and embedded devices.

Friday Squid Blogging: Giant Squid Eating a Diamondback Squid

Schneier
www.schneier.com
2025-12-12 22:00:30
I have no context for this video—it’s from Reddit—but one of the commenters adds some context: Hey everyone, squid biologist here! Wanted to add some stuff you might find interesting. With so many people carrying around cameras, we’re getting more videos of giant squid at the...
Original Article

I have no context for this video —it’s from Reddit—but one of the commenters adds some context:

Hey everyone, squid biologist here! Wanted to add some stuff you might find interesting.

With so many people carrying around cameras, we’re getting more videos of giant squid at the surface than in previous decades. We’re also starting to notice a pattern, that around this time of year (peaking in January) we see a bunch of giant squid around Japan. We don’t know why this is happening. Maybe they gather around there to mate or something? who knows! but since so many people have cameras, those one-off monster-story encounters are now caught on video, like this one (which, btw, rips. This squid looks so healthy, it’s awesome).

When we see big (giant or colossal) healthy squid like this, it’s often because a fisher caught something else (either another squid or sometimes an antarctic toothfish). The squid is attracted to whatever was caught and they hop on the hook and go along for the ride when the target species is reeled in. There are a few colossal squid sightings similar to this from the southern ocean (but fewer people are down there, so fewer cameras, fewer videos). On the original instagram video, a bunch of people are like “Put it back! Release him!” etc, but he’s just enjoying dinner (obviously as the squid swims away at the end).

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Blog moderation policy.

Posted on December 12, 2025 at 5:00 PM

Motion (YC W20) Is Hiring Senior Staff Front End Engineers

Hacker News
jobs.ashbyhq.com
2025-12-12 21:00:54
Comments...

GNU Unifont

Hacker News
unifoundry.com
2025-12-12 20:57:34
Comments...
Original Article

GNU Unifont is part of the GNU Project. This page contains the latest release of GNU Unifont, with glyphs for every printable code point in the Unicode Basic Multilingual Plane (BMP). The BMP occupies the first 65,536 code points of the Unicode space, denoted as U+0000..U+FFFF. There is also growing coverage of the Supplementary Multilingual Plane (SMP), in the range U+010000..U+01FFFF, and of Michael Everson's ConScript Unicode Registry (CSUR) with Rebecca Bettencourt's Under-CSUR additions.

Commercial Use

A user has asked if GNU Unifont can be used with commercial (non-free) software. The answer is yes. The GNU Font Embedding Exception and the SIL OFL allow for that. See the next section for details. The main purpose of the licensing is to require derivative fonts that others create to be released to the public under the same licensing terms, not to prohibit the use of those fonts with certain software. Thus, preserving the license terms in derivative fonts provides a public benefit. The licenses also provide acknowledgement of previous Unifont contributors for their volunteer work.

Copyright, Derivative Works, and License

Thousands of Unifont glyphs are creations of individual Unifont contributors; those glyphs enjoy copyright protections of various degrees. Some of those contributions are letter forms of established alphabets while others are icon (symbol) designs such as the many animal icons which, as artistic designs, have even stronger international protections. See for example this memorandum of applicable laws of Berne Union member country Germany (where Unifont was created): Unifont Copyright Protections .

Derivative variants of Unifont are permitted under the terms of the dual license: GNU GPLv2+ with the GNU Font Embedding Exception and the SIL Open Font License version 1.1. These are free licenses. The remainder of this section provides details.

These font files are licensed under the GNU General Public License, either Version 2 or (at your option) a later version, with the exception that embedding the font in a document does not in itself constitute a violation of the GNU GPL. The full terms of the license are in LICENSE.txt .

As of Unifont version 13.0.04, the fonts are dual-licensed under the SIL Open Font License (OFL) version 1.1 and the GNU GPL 2+ with the GNU font embedding exception. The SIL OFL is available at OFL-1.1.txt .

Font Downloads

The standard font build — with and without the ConScript Unicode Registry (CSUR) / Under-CSUR Private Use Area (PUA) glyphs. Download in your favorite format:

Specialized versions — built by request:

  • PSF: A specialized PSF 1 console frame buffer font consisting of 512 glyphs for use with APL, A Programming Language, in console mode (single-user mode on GNU/Linux, etc.), mainly to support GNU APL: Unifont-APL8x16-17.0.03.psf.gz (4 kbytes)
  • HEX: All the Plane 0 glyphs in Roman's .hex format, for those who wish to experiment: unifont-17.0.03.hex.gz (1 Mbyte)
  • HEX: The above .hex file with combining circles added: unifont_sample-17.0.03.hex.gz (1 Mbyte)

On Windows or Mac OS X, unzip the .ttf.zip file or download the uncompressed .ttf file and copy the font to your Fonts folder. On Microsoft Windows, this folder is located under the Windows folder on your main disk. On a Mac, this is located under the Library folder on your main disk.

For best appearance on a Mac in a Terminal window, select Terminal from the menu, then Preferences. A Settings window will appear. Make sure that you're on the Text tab in that window. Then make sure that the "Antialias text" box is checked. The OpenType version of the font should then look fine at point sizes of 12pt and larger. The font won't look very legible in a Mac Terminal window unless you select this antialias option.

Note: BDF, PCF, and OpenType files contain dimension and spacing information for each glyph in a font. Some font rendering engines ignore this glyph information that the font file provides. This is especially true of rendering engines designed to handle monospace fonts. Unifont will not display all glyphs correctly with such software. The BDF font follows BDF version 2.1 (not version 2.2) because the X Window System standardized on version 2.1. The PSF 1 version of Unifont is a monospace font but is limited to 512 glyphs, and is only of use with font rendering engines that support more than 256 glyphs in a console frame buffer font.

All unifont.hex sources are in the full Unifont Utilities download page.
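
For reference, the .hex format mentioned above is plain text with one glyph per line: the code point in hexadecimal, a colon, then the glyph bitmap as hexadecimal digits row by row (32 digits for 8×16 glyphs, 64 digits for 16×16 glyphs). An illustrative line of the right shape follows; the bitmap digits here are placeholders, not an actual Unifont glyph:

0041:00000000182442427E42424242420000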

Unifont Limitations

Unifont only stores one glyph per printable Unicode code point. This means that complex scripts with special forms for letter combinations including consonant combinations and floating vowel marks such as with Indic scripts (Devanagari, Bengali, Tamil, etc.) or letters that change shape depending upon their position in a word (Indic and Arabic scripts) will not render well in Unifont. In those cases, Unifont is only suitable as a font of last resort. Users wishing to properly render such complex scripts should use full OpenType fonts that faithfully display such alternate forms.

Drawing New Glyphs

If you would like to contribute glyphs, please email unifoundry at gmail in advance (not spelled out because of spammers). Several contributors are working on new glyphs, and it would be unfortunate to have multiple persons drawing the same glyphs.

Special Note: New Plane 2 and Plane 3 CJK Glyphs

The People's Republic of China (PRC) has a set of 15-by-16 pixel Chinese glyphs for Unicode Plane 2 and Plane 3. However, those glyphs are copyrighted and licensed for sale by the Government of the PRC, and thus they cannot be used in a free font. If you happen to have any of those copyrighted 15-by-16 pixel glyphs, please do not send them for inclusion. Unifont includes many glyphs in this range, drawn by Chinese and Japanese volunteers. More are planned for the future.

Release Notes

This latest release is part of the GNU Project. You can view the GNU Project Unifont Page on Savannah.

The theoretical maximum number of printable glyphs in the Unicode Plane 0 range is 65,536 code points minus the 2,048 surrogate pair code points, minus the 6,400 Private Use Area code points, minus the two noncharacters (U+FFFE and U+FFFF). This amounts to 57,086 assignable code points apart from the Private Use Area.

The theoretical maximum number of printable glyphs in the higher Unicode planes is 65,534; the last two code points in each plane are reserved as noncharacters.

Unifont 17.0

  • 1 November 2025 (Unifont 17.0.03)
    • 晓晓Akatsuki, Boris Zhang, Kusanagi_Sans, and others updated over 100 Chinese ideographs in Planes 0, 2, and 3:
      • Modified ideographs containing the "馬" (Horse) and "鳥" (Bird) radicals to be more balanced
      • The first batch of simplified Chinese character lists (第一批簡體字表) (August 21, 1935~February 1936)
      • Chinese Character Simplification Scheme (1956–1986)
      • The simplified character list (1986–2013), including simplified radicals and Chinese characters used for descriptions and annotations in official documents
      • The second list of the Second round of simplified Chinese characters (not officially implemented)
      • Ideographs required by the Specification of Common Modern Chinese Character Components and Components Names (现代常用字部件及部件名称规范), published in mainland China in 2009
      • Other changes; see the ChangeLog file in the main package for details.
  • 18 October 2025 (Unifont 17.0.02)
    • Plane 0:
      • Paul Hardy modified U+1521, U+A93D, and U+FB30.
      • David Corbett modified U+2B96, U+A7CE, U+A7CF, and U+A7D2
      • 晓晓Akatsuki adjusted U+4748, U+6B25, and U+6F78 per the latest Unicode recommendations. Adjusted U+5100 to be 16 pixels tall.
    • Plane 1:
      • Paul Hardy modified U+1E912 per Unicode 17.0.0 errata.
      • David Corbett modified U+1CEDD, U+1E6DE, U+1F778, U+1CEF0, U+1F77A, U+11DCC, U+11DCD, U+11DD6. Adjusted base height in chess glyphs U+1FA54..U+1FA57 and eye height in U+1FA55 and U+1FA57 to match eye height of knights.
      • 晓晓Akatsuki drew smaller versions of U+16FF2 and U+16FF3.
    • Plane 2:
      • For complete coverage of jf7000 0.9, Boris Zhang added U+217DA and U+21A4B; 湖 远星 added U+24259, U+249DF, and U+270FD.
      • 晓晓Akatsuki redrew U+29B9A.
    • Plane 3:
      • 晓晓Akatsuki drew U+323B0..U+32401, U+32403..U+32406, U+32409..U+32452, U+32454..U+3246A, U+3246C..U+3247D, U+3247F..U+32484, U+32486..U+3248D, and U+3248F..U+324A4.
      • Boris Zhang redrew U+2B7A4, U+2B7E8, and U+2EC06.
      • Boris Zhang drew U+32402, U+32407, U+32408, U+32453, U+3246B, U+3247E, U+32485, U+3248E, U+324D1, U+324DD, U+324DE, U+324E0, U+3251F, U+32520, U+3261E, U+32623, U+32629, U+3262C, U+32631, U+32632, U+32635, U+3263B, U+3263C, U+3263F, U+32640, U+32641, U+32644, U+32646, U+32647, U+3264F, U+32650, U+32656, U+32657, U+3265A, U+3265D, U+3265E, U+3265F, U+32660, U+32662, U+32669, U+3266F, U+32672, U+32674, U+3267B, U+3267C, U+3267D, U+32688, U+3268A, U+3268F, U+32694, U+32695, U+3269A, U+326AD, U+326BD, U+326BE, U+326C0, U+326F5, U+326FA, U+32709, U+3270B, U+32714, U+32744, U+32748, U+3274B, U+3274C, U+3274E, U+32755, U+32765, U+32768, U+32769, U+327C0, U+327C3, U+327FB, U+32800, U+32859, U+3285A, U+3285F, U+3287C, U+32901, U+32940, U+3295B, U+32985, U+32996, U+32997, U+32A3A, U+32A3B, U+32A4C, U+32A5A, U+32A72, U+32A98, U+32ACE, U+32AF9, U+32BDD, U+32BE1, U+32BE8, U+32BFF, U+32C0A, U+32C0D, U+32C0E, U+32C13, U+32C36, U+32C37, U+32C39, U+32C3B, U+32C3C, U+32C82, U+32CCA, U+32CCC, U+32CCD, U+32CD3, U+32CD4, U+32CD6, U+32CDA, U+32CDC, U+32CDD, U+32CE4, U+32CE5, U+32CE6, U+32CF3, U+32CF4, U+32CF7, U+32CFD, U+32D09, U+32D46, U+32D49, U+32D4F, U+32D50, U+32D79, U+32D89, U+32DAF, U+32DB7, U+32E2F, U+32E31, U+32E34, U+32E8B, U+32EA3, U+32ED0, U+32EF0, U+32EF2, U+32F17, U+32F1D, U+32F1E, U+32F36, U+32F49, U+32F56, U+32F58, U+32F59, U+32F6C, U+32F8C, U+32F9D, U+32FF9, U+3301A, U+33074, U+33084, U+330AC, U+330D4, U+330F1, U+330F2, U+33113, U+33157, U+331B2, U+331E7, U+331E9, U+331EA, U+331EF, U+331F0, U+331F2, U+33211, U+33213, U+33214, U+33215, U+33216, U+33217, U+3321B, U+33220, U+3323A, U+33255, U+33257, U+33261, U+33279, U+3327A, U+3327C, U+33282, U+33283, U+33287, U+3328B, U+3328E, U+3328F, U+33291, U+33293, U+33294, U+332AD, U+332BF, U+332E1, U+3331A, U+3331E, U+33366, U+33397, U+33398, U+3339A, U+3339D, U+33400, U+3340A, and U+33410.
  • 9 September 2025 (Unifont 17.0.01)
    • Plane 0:
      • David Corbett contributed the new Arabic glyphs:
        • Arabic Extended-B: U+088F
        • Riyal currency symbol: U+20C1
        • Arabic Presentation Forms-A: U+FBC3..U+FBD2, U+FD90, U+FD91, and U+FDC8..U+FDCE.
      • Paul Hardy added:
        • Telugu: U+0C5C
        • Kannada: U+0CDC
        • Combining Diacritical Marks Extended: U+1ACF..U+1ADD, and U+1AE0..U+1AEB
        • Updated U+1BBF
        • Miscellaneous Symbols and Arrows: U+2B96
        • Latin Extended-D: U+A7CE, U+A7CF, U+A7D2, U+A7D4, and U+A7F1
        • Latin Extended-E: modified U+AB4B and U+AB4C per Unicode 17.0.0 recommendation.
    • Plane 1:
      • David Corbett contributed the new Arabic Extended-C glyphs: U+10EC5..U+10EC7, U+10ED0..U+10ED8, U+10EFA, and U+10EFB.
      • Johnnie Weaver contributed:
        • U+10940..U+1095F (Sidetic)*
        • U+11DB0..U+11DEF (Tolong Siki)*
        • U+16EA0..U+16EDF (Beria Erfe)*
        • U+1E900..U+1E95F (Adlam), modified per Unicode 17.0.0 changes.
      • Paul Hardy contributed:
        • U+11B60..U+11B67 (Sharada Supplement)*
        • U+16FF2..U+16FF6 (Ideographic Symbols and Punctuation)
        • U+1CCFA..U+1CCFC and U+1CEBA..U+1CEBF (Symbols for Legacy Computing Supplement).
        • U+1CEC0..U+1DEFF (Miscellaneous Symbols Supplement)*
        • U+1E6C0..U+1E6FF (Tai Yo)*
        • U+1F6D8 (Transport and Map Symbols)
        • U+1F8D0..U+1F8D8 (Supplemental Arrows-C)
        • U+1FBFA (Symbols for Legacy Computing).
    • Plane 2:
      • Yzy32767 made these contributions:
        • Improved these glyphs in the first list of the second round of simplified Chinese characters: U+0200D3 and U+0201A8
        • Added these glyphs in the first list of the second round of simplified Chinese characters: U+20B15, U+20BB5, U+20CAD, U+219F3, U+21C52, U+22342, U+22488, U+22A83, U+2418A, U+2462F, U+26678, U+26B01, U+2A9F7, U+2BA4F, U+2BA51, U+2BBDC, U+2BCB7, U+2BDC0, U+2BE6F, U+2D026, U+2D64F, U+2D70C, U+2DCFF, and U+2E0B9
        • Fixed U+2CAD2, which wfz2020 noticed appeared as the glyph for code point U+2CA02.
    • Plane 3:
      • Yzy32767 made these contributions:
        • Improved these glyphs in the first list of the second round of simplified Chinese characters: U+030008, U+030061, U+03006C, U+03011D, U+03014A, and U+0301E3
        • Added these glyphs: U+30180..U+301E2, U+301E4..U+301FF, U+30270, U+302D9, U+302DB, U+302DC, U+302DE, U+302F7, U+302FB, U+30335, U+3033B, U+3034E, U+30370, U+30371, U+30409, U+30414, U+3043A, U+3043F, U+3044A, U+3044C, U+30450, U+3045D, U+3045E, U+304CC, U+304E2, U+304E3, U+304E8, U+304ED, U+3057A, U+305D1, U+305DD, U+305F6, U+3067D, U+306D1, U+306D3, U+306EC, U+30708, U+30776, U+30831, U+30842, U+308F2, U+3094C, U+30955, U+30969, U+30993, U+309AA, U+309AB, U+30A1B, U+30A62, U+30AA9, U+30AB1, U+30AFE, U+30B04, U+30B0A, U+30B15, U+30B43, U+30B4B, U+30B5D, U+30BF6, U+30C21, U+30C2B, U+30CBD, U+30CC7, U+30D4B, U+30D55, U+30D5F, U+30DDF, U+30DE3, U+30E01, U+30E04, U+30E70, U+30E79, U+30F03, U+30F5F, U+30F64, U+30F82, U+310AA, U+3114C, and U+31151.
    *New in Unicode 17.0.0.

Unifont 16.0

  • 31 May 2025 (Unifont 16.0.04)
    • Plane 0:
      • Paul Hardy
        • Modified the archaic Greek digamma glyphs, U+03DC and U+03DD.
        • Modified the Korean Won currency symbol, U+20A9, to only have one bar.
        • Removed Variation Selector glyphs (U+FE00..U+FE0F) from default OpenType and TrueType font builds; they remain in the sample and SBIT font builds.
      • David Corbett
        • Modified Arabic glyphs U+0610, U+0616, U+061E, U+0620, and U+0626.
        • Redrew the yeh-based glyphs in the ranges U+FC31..U+FDC7 (Arabic Presentation Forms-A) and U+FE89..U+FE8C (Arabic Presentation Forms-B).
      • Johnnie Weaver modified some Georgian Supplement glyphs (U+2D00..U+2D2F).
      • 晓晓_Akatsuki (Xiao_Akatsuki) modified U+2EB2 per Unicode updates.
    • Plane 1:
      • Paul Hardy
        • Updated Old Turkic glyph U+10C47.
        • Updated Khitan Small Script glyph U+18CCA.
        • Reverted several changes in Musical Symbols (U+1D100..U+1D1FF) for better positioning with combining characters. Thanks go out to David Corbett for requesting the changes.
        • Modified mathematical bold digamma (U+1D7CA, U+1D7CB) to match the updated digamma glyphs in Plane 0.
      • Josh Hufford contributed modified emoji glyphs U+1F602, U+1F605, U+1F606, U+1F607, U+1F609, U+1F923, and U+1FAE0.
    • Plane 14:
      • Paul Hardy Removed Variation Selector glyphs (U+E0100..U+E01EF) from default OpenType and TrueType font builds; they remain in the sample and SBIT font builds.
    • Plane 15 (CSUR/UCSUR):
      • soweli Kape [sic] and NikZapp
        • Updated Sitelen Pona (U+F1900..U+F19FF)
        • Updated Sitelen Pona Radicals (U+F1C80..U+F1C9F).
      • Paul Hardy
        • Added Titi Pula (U+F1C40..U+F1C60)
        • Added Zbalermorna (U+F28A0..U+F28DF).
  • 19 April 2025 (Unifont 16.0.03)
    • Plane 0:
      • David Corbett redrew some Arabic glyphs for consistency. Most of these are minor changes to baseline, i‘jam positioning, or making a derived letter match its origin letter. Code points: U+0625, U+0634, U+0673, U+06B9, U+06BC, U+0753, U+0754, U+0757, U+075C, U+0762, U+0767, U+0769, U+076A, U+076D, U+0770, U+0775, U+0776, U+0777, U+077D, U+077E, U+08A1, U+08A2, U+08A3, U+08A6, U+08A8, U+08B1, U+08BB, and U+08BC.
      • 晓晓_Akatsuki (Xiao_Akatsuki) submitted several CJK refinements from the team of 湖 远星:
        • Improved 褝 (U+891D) and 肞 (U+809E).
        • Updated to reflect current Unicode rendering: 㳽 (U+3CFD), 㸿 (U+3E3F), 䑮 (U+446E), 䒳 (U+44B3), 䕈 (U+4548), and 䩶 (U+4A76).
        • Updated as per GB18030-2022 change: 垕 (U+5795).
        • Modified to comply with the GB18030-2022 standard pertaining to character composition:
          • 姉 (U+59C9): This character is a phono-semantic character. Therefore, the right side should be "市" (U+5E02) instead of "巿" (U+5DFF).
          • 濲 (U+6FF2): This character is a variant of "瀔" (U+7014), and the "穀" (U+7A40) on the right side of "瀔" (U+7014) is a phono-semantic character, and its "semantic" part is "禾" (U+79BE), not "木" (U+6728).
          • 膥 (U+81A5): This character is a Cantonese character for "egg". Not yet (未) Become (成) Meat (肉) → Egg, so the upper left corner should be "未" (U+672A), not "末" (U+672B).
      • David Corbett:
        • Redrew some Arabic Presentation Forms: U+FD42, U+FD43, U+FD44, U+FD45, U+FDF0, U+FDF1, U+FDF4, U+FDF6, U+FDF7, U+FDFA, U+FDFB, U+FDFC, U+FE87, U+FE88, U+FEB5, and U+FEB6.
        • Modified the top serifs of two Latin fullwidth letters, U+FF44 and U+FF4B.
    • Plane 1:
      • Paul Hardy added new glyphs in Egyptian Hieroglyph Format Controls (U+13430..U+1345F).
      • Paul Hardy and David Corbett made adjustments to glyphs in the Musical Symbols block (U+1D100..U+1D1FF).
    • Plane 2:
      • 晓晓_Akatsuki modified U+25ED7 from 16 columns wide to 15 columns.
      • Hayden Wong contributed U+29B00..U+29CFF.
      • Cod'dte sent a corrected left-hand side of U+2EE57.
    • Plane 3:
      • Luke036 has drawn a much-improved glyph for taito (U+3106C).
  • 1 December 2024 (Unifont 16.0.02)
    • Plane 0:
      • Johnnie Weaver modified the U+13C9 Cherokee and U+AB99 Cherokee Supplement glyphs.
      • 湖 远星 modified Chinese glyphs U+605C, U+6669, and U+6A37.
    • Plane 1:
      • Johnnie Weaver modified several glyphs in the ranges U+10880..U+108AF (Nabataean) and U+108E0..U+108FF (Hatran) so these scripts are now completely half-width.
      • Paul Hardy modified several Tulu-Tilagari glyphs (U+11380..U+113FF), and modified the Kawi glyph U+11F5A to resemble U+11F49 (per David Corbett's recommendations).
      • Xiao Akatsuki (晓晓 Akatsuki) fixed a missing vertical stroke in U+18B2D.
      • 湖 远星 added more space between the two halves of U+1F232.
    • Plane 2:
      • Hayden Wong made these changes:
        • Modified U+20083, U+20087, U+20089, and U+200B4 from 16 columns wide to 15 columns.
        • Added the missing glyphs in the range U+20000..U+299FF.
        • Completed U+29D00..U+29DFF.
        • Added U+2B64E, which is an incorrect variant of U+513A (儺).
      • 晓晓 Akatsuki contributed the missing glyphs in the range U+20700..U+207FF.
      • 湖 远星 modified U+28A0F, U+28B4E, U+2CB5B, and U+2CB73 from 16 columns wide to 15 columns.
      • Boris Zhang noticed that U+2C7EC was the glyph for U+2CE7C, so it was removed.
  • 10 September 2024 (Unifont 16.0.01)
    • Plane 0:
      • David Corbett added U+0897, ARABIC PEPET.
      • Paul Hardy added the new glyphs in Balinese (U+1B4E, U+1B4F, and U+1B7F), Cyrillic Extended-C (U+1C89, U+1C8A), and Latin Extended-D (U+A7CB..U+A7CD, U+A7DA..U+A7DC).
      • Johnnie Weaver :
        • Modified Cherokee glyphs U+13C9 and U+AB99.
        • Changed these glyphs to half-width: U+210E, U+210F, U+212E, U+212F, U+2300, U+2329, U+232A, U+2610, U+2611, U+2612, U+2713, U+2717, U+A728, U+A729, U+A732, U+A733, U+A734, U+A735, U+A736, U+A737, U+A738, U+A739, U+A73A, U+A73B, U+A73C, U+A73D, U+A74E, U+A74F, U+A758, U+A759, U+A771, U+A772, U+A773, U+A774, U+A775, U+A776, U+A777, U+A797, U+A7C2, and U+A7C3.
      • Boris Zhang drew the Suzhou Numerals six through nine (U+3026..U+3029).
      • Rebecca Bettencourt drew the new Control Pictures glyphs, U+2427..U+2429.
      • Yzy32767 redrew the Bopomofo glyphs (U+3105..U+312F) in a Kai (楷) style.
      • 湖 远星 contributed the new glyphs in CJK Strokes (U+31D2, U+31E4, and U+31E5).
      • Hayden Wong modified U+3862 per Unicode 15.1.0.
    • Plane 0 CSUR/UCSUR:
      • Rebecca Bettencourt contributed:
        • U+E400..U+E59F Herman Miller's previously missing scripts
        • U+E6D0..U+E6EF Amlin
        • Unifon glyphs U+E6FD and U+E73D, previously missing
        • U+EC70..U+ECEF Graflect.
      • Danae Dekker contributed:
        • U+EC00..U+EC2F Cylenian
        • U+EC30..U+EC6F Syrrin.
    • Plane 1:
      • Johnnie Weaver contributed:
        • U+105C0..U+105FF Todhri*
        • U+10D40..U+10D8F Garay*
        • U+11BC0..U+11BFF Sunuwar*
        • U+16D40..U+16D7F Kirat Rai*
        • Khitan Small Script (U+18BD2, U+18BFF)
        • U+1E5D0..U+1E5FF Ol Onal.*
      • David Corbett contributed:
        • Arabic Extended-C glyphs U+10EC2..U+10EC4, U+10EFC.
        • U+16100..U+1613F Gurung Khema*
      • Paul Hardy contributed:
        • U+11380..U+113FF Tulu-Tigalari*
        • U+116D0..U+116E3 Myanmar Extended-C*
        • Kawi glyph U+11F5A
        • Symbols and Pictographs Extended-A glyphs U+1FA89, U+1FA8F, U+1FABE, U+1FAC6, U+1FADC, U+1FADF, and U+1FAE9.
      • Rebecca Bettencourt contributed:
        • U+1CC00..U+1CCF9 Symbols for Legacy Computing Supplement*
        • Supplemental Arrows-C glyphs U+1F8B2..U+1F8BB, U+1F8C0, and U+1F8C1
        • Symbols for Legacy Computing glyphs U+1FBCB..U+1FBEF.
      • anonymous1 redrew Enclosed Ideographic Supplement glyph U+1F200.
    • Plane 2:
      • Hayden Wong contributed the new glyphs in CJK Unified Ideographs Extension B U+20020..U+2004F and U+29E00..U+2A0FF.
      • twuchiutann contributed the new glyphs in CJK Unified Ideographs Extension B U+20050..U+2073F.
      • Boris Zhang redrew CJK Unified Ideographs Extension D glyphs U+2B75F, U+2B76B, and Extension I glyphs U+2B7EF, U+2EC1F, U+2EC20, U+2EC21, U+2EC2F, U+2EC6F, U+2ECBF, U+2ECEC, and U+2ED42.
      • 湖 远星 contributed the following glyphs, which are common in Cantonese, Hokkien, Hakka, etc., from a list provided with the Ichiten font.
        • CJK Unified Ideographs Extension B glyphs:
          U+203B7 𠎷 U+20546 𠕆 U+20584 𠖄 U+205FB 𠗻 U+207A9 𠞩
          U+207AD 𠞭 U+20803 𠠃 U+2081D 𠠝 U+20895 𠢕 U+20BD7 𠯗
          U+20C41 𠱁 U+20CBF 𠲿 U+20CD4 𠳔 U+20D5D 𠵝 U+20D71 𠵱
          U+20DA7 𠶧 U+20E76 𠹶 U+20E98 𠺘 U+20ED8 𠻘 U+20F3B 𠼻
          U+20F7E 𠽾 U+21014 𡀔 U+210AB 𡂫 U+210F6 𡃶 U+21145 𡅅
          U+2176D 𡝭 U+217D3 𡟓 U+2180D 𡠍 U+21883 𡢃 U+2197C 𡥼
          U+21C2A 𡰪 U+21CA2 𡲢 U+21CDE 𡳞 U+21DD1 𡷑 U+21F0F 𡼏
          U+221A1 𢆡 U+22399 𢎙 U+224DC 𢓜 U+2251B 𢔛 U+22775 𢝵
          U+22AB1 𢪱 U+22AE6 𢫦 U+22BED 𢯭 U+22BFE 𢯾 U+22C4B 𢱋
          U+22C62 𢱢 U+22C64 𢱤 U+22CB4 𢲴 U+22CB8 𢲸 U+22CC6 𢳆
          U+22CEA 𢳪 U+22D80 𢶀 U+22F0C 𢼌 U+22F1B 𢼛 U+23073 𣁳
          U+23074 𣁴 U+23350 𣍐 U+236BA 𣚺 U+236EE 𣛮 U+23B88 𣮈
          U+23CA9 𣲩 U+23EF8 𣻸 U+23F0E 𣼎 U+240D2 𤃒 U+241AC 𤆬
          U+24259 𤉙 U+242B6 𤊶 U+2430D 𤌍 U+24352 𤍒 U+24364 𤍤
          U+24419 𤐙 U+24430 𤐰 U+24605 𤘅 U+2479A 𤞚 U+24C8D 𤲍
          U+24D80 𤶀 U+24D83 𤶃 U+24E01 𤸁 U+24E31 𤸱 U+24E85 𤺅
          U+24EA7 𤺧 U+24EAA 𤺪 U+25148 𥅈 U+2517E 𥅾 U+2531A 𥌚
          U+25349 𥍉 U+25435 𥐵 U+2546E 𥑮 U+257C7 𥟇 U+25BDF 𥯟
          U+25BE5 𥯥 U+25C14 𥰔 U+25D0A 𥴊 U+25E86 𥺆 U+2624E 𦉎
          U+26293 𦊓 U+26706 𦜆 U+267EA 𦟪 U+2688A 𦢊 U+2690E 𦤎
          U+26E05 𦸅 U+2725F 𧉟 U+27304 𧌄 U+27371 𧍱 U+27486 𧒆
          U+277F0 𧟰 U+279A0 𧦠 U+27A63 𧩣 U+27B2A 𧬪 U+27B99 𧮙
          U+27EF4 𧻴 U+27FC1 𧿁 U+27FEC 𧿬 U+27FF3 𧿳 U+280BE 𨂾
          U+280BF 𨂿 U+280E9 𨃩 U+280F0 𨃰 U+28154 𨅔 U+282CD 𨋍
          U+2837D 𨍽 U+2838A 𨎊 U+28487 𨒇 U+28595 𨖕 U+28891 𨢑
          U+28D99 𨶙 U+28E39 𨸹 U+2945D 𩑝 U+2947E 𩑾 U+294E5 𩓥
          U+296A8 𩚨 U+296E9 𩛩 U+29704 𩜄 U+29730 𩜰 U+29D71 𩵱
          U+29DD3 𩷓 U+29E19 𩸙 U+29E36 𩸶 U+29EAC 𩺬 U+29F27 𩼧
          U+29F30 𩼰 U+29F48 𩽈 U+29F70 𩽰 U+2A04E 𪁎 U+2A0BA 𪂺
          U+2A1E1 𪇡 U+2A41E 𪐞 U+2A590 𪖐 U+2A612 𪘒 U+2A64A 𪙊
        • CJK Unified Ideographs Extension C glyphs: U+2A736 𪜶, U+2AE5A 𪹚, and U+2B4A2 𫒢
        • CJK Unified Ideographs Extension E glyphs: U+2B8C6 𫣆, U+2C816 𬠖, and U+2C9B0 𬦰.
    • Plane 3:
      • twuchiutann modified U+30EDD and U+30EDE (biang), originally drawn by Ming Fan, to differentiate between traditional and simplified Chinese versions.
      • 湖 远星 contributed the following glyphs, which are common in Cantonese, Hokkien, Hakka, etc., from a list given in the Ichiten font.
        • CJK Unified Ideographs Extension G glyphs: U+301DB 𰇛, U+308FB 𰣻, and U+30E6C 𰹬
        • CJK Unified Ideographs Extension H glyph: U+31C7F 𱱿.
    • Plane 15 CSUR/UCSUR:
      • Rebecca Bettencourt contributed:
        • U+F16B0..U+F16DF Derani
        • U+F2000..U+F267F Sadalian.
      • Paul Hardy contributed U+F1C80..U+F1C9C Sitelen Pona Radicals.
    *New in Unicode 16.0.0.

Unifont 15.1

  • 24 February 2024 (Unifont 15.1.05)
    • Plane 0:
      • Ho-seok Ee redrew all Hangul glyphs not in the Hangul Syllables range, so their style more closely resembles the style of the Hangul Syllables range: U+1100..U+11FF Hangul Jamo, U+3131..U+318E Hangul Compatibility Jamo, U+A960..U+A97C Hangul Jamo Extended-A, U+D7B0..U+D7FB Hangul Jamo Extended-B.
      • Hayden Wong improved several glyphs in the range U+2100..U+214F Letterlike Symbols.
      • Johnnie Weaver redrew U+013D LATIN CAPITAL LETTER L WITH CARON for better compatibility with other glyphs in the Czech and Slovak alphabets.
    • Planes 2 and 3: almost 600 new ideographs, including:
      • Boris Zhang and Yzy32767 contributed U+20000..U+2001F.
      • Boris Zhang and Yzy32767 contributed the entire CJK Unified Ideographs Extension D range, U+2B740..U+2B81D.
      • 湖 远星 contributed 335 glyphs across Plane 2 and Plane 3 with common Cantonese ideographs.
      • Other new ideographs in CJK Unified Ideographs Extension I.
    • Plane F: Paul Hardy modified the Sitelen Pona script, adding combining character indicators and several new glyphs since the last release. This completes the most current version of the Sitelen Pona encoding.
  • 29 October 2023 (Unifont 15.1.04)
    • Default and Japanese versions have larger supersets of Plane 2 and Plane 3 glyphs.
    • Johnnie Weaver contributed updates for U+266D..U+266F and U+26BC.
  • 21 October 2023 (Unifont 15.1.03)
    • Boris Zhang and Yzy32767 contributed CJK Unified Ideographs Extension I glyphs (U+2EBF0..U+2EE5D).
    • 湖 远星 contributed 14 glyphs to CJK Unified Ideographs Extensions B and C and updated U+5C81 and U+6708.
  • 21 September 2023 (Unifont 15.1.02)
    • 湖 远星:
      • Adjusted 46 glyphs in the Plane 0 Wen Quan Yi range, U+2F00..U+9FFF.
      • Contributed Plane 3 CJK Unified Ideographs Extension G glyphs in the range U+30000..U+3017F.
  • 12 September 2023 (Unifont 15.1.01)
    • As mentioned during the year leading up to this release, TrueType fonts are no longer produced by the default build; OpenType fonts have taken their place. This change has been driven by the diminishing support for TrueType fonts in the Pango font rendering engine. TrueType fonts can still be built from the distribution tarball using the command "make truetype" in the font directory.
    • Johab 6/3/1 Hangul Jamo: Ho-Seok Ee proposed a new Johab encoding for algorithmic Hangul Syllables generation. The resulting scheme uses 6 variations of initial consonants (choseong), 3 of medial vowels and diphthongs (jungseong), and 1 of final consonants (jongseong). The image on the left is partial output from a new supporting Unifont utility, unijohab2html, which gives an overview of how the three components of a Hangul syllable combine with each other and outputs any overlaps for a font designer's analysis. A full discussion of this new Johab 6/3/1 encoding appears on the Unifont Hangul Syllables Generation web page. Minseo Lee (이민서) provided feedback on the glyphs prior to their release.
    • Following a suggestion by Ho-Seok Ee, the hangul-base.hex file that contains the Johab 6/3/1 glyphs for Hangul syllable formation now begins at code point U+E000. This allows building a Unifont variant with that entire Hangul johab glyph set in the Unicode Plane 0 Private Use Area (PUA) using the command "make PUA=plane00/hangul/hangul-base.hex" in the font directory. Unifont builds have traditionally left the PUA available for CSUR/UCSUR glyphs, which is still the default; see below for a discussion of the CSUR/UCSUR glyphs.
    • Johnnie Weaver modified "IJ" ligature glyphs U+0132 and U+0133. He also modified U+1E9E LATIN CAPITAL LETTER SHARP S.
    • Paul Hardy:
      • Modified U+2CC2 COPTIC CAPITAL LETTER CROSSED SHEI and U+2CC3 COPTIC SMALL LETTER CROSSED SHEI for consistency with the redrawn U+03E2 COPTIC CAPITAL LETTER SHEI and U+03E3 COPTIC SMALL LETTER SHEI.
      • Redrew Ideographic Description Characters (U+2FF0..U+2FFB) for consistency and added new glyphs (U+2FFC..U+2FFF). Also added CJK Strokes glyph U+31EF IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION.
      • Modified star glyphs U+2605, U+2606, and U+2BE8 for consistency.
      • Modified several Chinese ideographs and Korean ideographs in CJK Unified Ideographs Extension A (U+3400..U+4DBF) per the Unicode Standard version 15.1.0.
      • Wen Quan Yi Glyphs: Made modifications to Korean ideographs in CJK Unified Ideographs Extension A (U+3400..U+4DBF) per Unicode 15.1.0 changes. Modified CJK Unified Ideographs Extension A U+3B9D, U+454E, U+49C8 (from 湖 远星) and U+56B8. Modified CJK Unified Ideographs Extension U+809E and U+891D.
      • Modified Alchemical Symbols (U+1F700..U+1F77F) per Unicode 15.1.0 changes.
      • Added three hexadecimal digit notations to the Plane 0 UCSUR:
        • U+EBE0..U+EBEF: Boby Lapointe's "bibi-binary" notation.
        • U+EBF0..U+EBFF: Bruce Alan Martin's bit location notation.
        • U+ECF0..U+ECFF: Ronald O. Whitaker's triangular notation.
    • Implemented other glyph changes per the Unicode Standard version 15.1.0.
    • Several other minor changes; see the ChangeLog file in the main tarball for details.

Earlier Releases

See the Archive link at the top of this page for information on earlier Unifont releases.

Unifont Glyph Tables

Unifont font files contain glyphs in several Unicode planes. The following table provides an overview of this coverage.

GNU Unifont Font File Plane Coverage
Font Filename      Plane 0   Plane 1   Plane 2   Plane 3   Plane 14   Plane 15
unifont-*             X                  X 1,2     X 1,2
unifont_jp-*          X                  X 1,2     X 1,2
unifont_upper-*                  X       X 3       X 3         X
unifont_csur-*        X                                                    X

Notes:

1 PCF fonts can only include glyphs in Plane 0.

2 Only a subset of Plane 2 and Plane 3 CJK glyphs plus the Plane 1 Copyleft glyph (U+1F12F) are included, to stay within the OpenType limit of 65,536 glyphs.

3 unifont_upper fonts will contain a superset of Chinese Plane 2 and Plane 3 glyphs plus JIS X 0213 glyphs until the OpenType font nears its limit of 65,536 code points.

Click on each link in the tables below to show its corresponding 256-code point range within the respective Unicode planes.

Plane 0 Glyphs

The table below links to the glyphs in the Plane 0 (Basic Multilingual Plane) unifont font files.

GNU Unifont Glyphs
Unicode Basic Multilingual Plane
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F
70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F
A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF
B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF
D0 D1 D2 D3 D4 D5 D6 D7 Surrogate Pairs
Private Use Area
Private Use Area F9 FA FB FC FD FE FF

This next table links to the glyphs in the Plane 0 (Basic Multilingual Plane) unifont_jp Japanese variant font files. See also the Plane 2 glyphs further down, which are only included in the unifont_jp OpenType and TrueType font files.

GNU Unifont Glyphs — Japanese Version
with Page Coverage for Plane 0
(Green=100%, Red=0%)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F
70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F
A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF
B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF
D0 D1 D2 D3 D4 D5 D6 D7 Surrogate Pairs
Private Use Area
Private Use Area F9 FA FB FC FD FE FF

Plane 1 Glyphs

The next table links to glyphs in Plane 1 (Supplementary Multilingual Plane) OpenType and TrueType unifont_upper font files.

GNU Unifont Glyphs
with Page Coverage for Plane 1
(Green=100%, Red=0%)
0100 0101 0102 0103 0104 0105 0106 0107 0108 0109 010A 010B 010C 010D 010E 010F
0110 0111 0112 0113 0114 0115 0116 0117 0118 0119 011A 011B 011C 011D 011E 011F
0120 0121 0122 0123 0124 0125 0126 0127 0128 0129 012A 012B 012C 012D 012E 012F
0130 0131 0132 0133 0134 0135 0136 0137 0138 0139 013A 013B 013C 013D 013E 013F
0140 0141 0142 0143 0144 0145 0146 0147 0148 0149 014A 014B 014C 014D 014E 014F
0150 0151 0152 0153 0154 0155 0156 0157 0158 0159 015A 015B 015C 015D 015E 015F
0160 0161 0162 0163 0164 0165 0166 0167 0168 0169 016A 016B 016C 016D 016E 016F
0170 0171 0172 0173 0174 0175 0176 0177 0178 0179 017A 017B 017C 017D 017E 017F
0180 0181 0182 0183 0184 0185 0186 0187 0188 0189 018A 018B 018C 018D 018E 018F
0190 0191 0192 0193 0194 0195 0196 0197 0198 0199 019A 019B 019C 019D 019E 019F
01A0 01A1 01A2 01A3 01A4 01A5 01A6 01A7 01A8 01A9 01AA 01AB 01AC 01AD 01AE 01AF
01B0 01B1 01B2 01B3 01B4 01B5 01B6 01B7 01B8 01B9 01BA 01BB 01BC 01BD 01BE 01BF
01C0 01C1 01C2 01C3 01C4 01C5 01C6 01C7 01C8 01C9 01CA 01CB 01CC 01CD 01CE 01CF
01D0 01D1 01D2 01D3 01D4 01D5 01D6 01D7 01D8 01D9 01DA 01DB 01DC 01DD 01DE 01DF
01E0 01E1 01E2 01E3 01E4 01E5 01E6 01E7 01E8 01E9 01EA 01EB 01EC 01ED 01EE 01EF
01F0 01F1 01F2 01F3 01F4 01F5 01F6 01F7 01F8 01F9 01FA 01FB 01FC 01FD 01FE 01FF

Plane 2 Glyphs

The table below links to the Japanese glyphs in Plane 2 (Supplementary Ideographic Plane) contained in the unifont_jp OpenType and TrueType font files. Note: These Plane 2 glyphs along with the Plane 0 glyphs in unifont_jp font files provide complete coverage of the JIS X 0213 standard. Only 303 glyphs appear in the files below. Files with no glyphs appear with a gray background.

GNU Unifont Glyphs — Japanese Version
with Page Coverage for Plane 2
(Gray=0%)
0200 0201 0202 0203 0204 0205 0206 0207 0208 0209 020A 020B 020C 020D 020E 020F
0210 0211 0212 0213 0214 0215 0216 0217 0218 0219 021A 021B 021C 021D 021E 021F
0220 0221 0222 0223 0224 0225 0226 0227 0228 0229 022A 022B 022C 022D 022E 022F
0230 0231 0232 0233 0234 0235 0236 0237 0238 0239 023A 023B 023C 023D 023E 023F
0240 0241 0242 0243 0244 0245 0246 0247 0248 0249 024A 024B 024C 024D 024E 024F
0250 0251 0252 0253 0254 0255 0256 0257 0258 0259 025A 025B 025C 025D 025E 025F
0260 0261 0262 0263 0264 0265 0266 0267 0268 0269 026A 026B 026C 026D 026E 026F
0270 0271 0272 0273 0274 0275 0276 0277 0278 0279 027A 027B 027C 027D 027E 027F
0280 0281 0282 0283 0284 0285 0286 0287 0288 0289 028A 028B 028C 028D 028E 028F
0290 0291 0292 0293 0294 0295 0296 0297 0298 0299 029A 029B 029C 029D 029E 029F
02A0 02A1 02A2 02A3 02A4 02A5 02A6 02A7 02A8 02A9 02AA 02AB 02AC 02AD 02AE 02AF
02B0 02B1 02B2 02B3 02B4 02B5 02B6 02B7 02B8 02B9 02BA 02BB 02BC 02BD 02BE 02BF
02C0 02C1 02C2 02C3 02C4 02C5 02C6 02C7 02C8 02C9 02CA 02CB 02CC 02CD 02CE 02CF
02D0 02D1 02D2 02D3 02D4 02D5 02D6 02D7 02D8 02D9 02DA 02DB 02DC 02DD 02DE 02DF
02E0 02E1 02E2 02E3 02E4 02E5 02E6 02E7 02E8 02E9 02EA 02EB 02EC 02ED 02EE 02EF
02F0 02F1 02F2 02F3 02F4 02F5 02F6 02F7 02F8 02F9 02FA 02FB 02FC 02FD 02FE 02FF

This next table links to the Chinese glyphs in Plane 2 (Supplementary Ideographic Plane) contained in unifont OpenType and TrueType font files. Note: These Plane 2 glyphs along with the default Plane 0 glyphs in Unifont provide complete coverage of the Table of General Standard Chinese Characters (通用规范汉字表). Only 232 glyphs appear in the files below. Files with no glyphs appear with a gray background.

GNU Unifont Glyphs — Chinese Version
with Page Coverage for Plane 2
(Gray=0%)
0200 0201 0202 0203 0204 0205 0206 0207 0208 0209 020A 020B 020C 020D 020E 020F
0210 0211 0212 0213 0214 0215 0216 0217 0218 0219 021A 021B 021C 021D 021E 021F
0220 0221 0222 0223 0224 0225 0226 0227 0228 0229 022A 022B 022C 022D 022E 022F
0230 0231 0232 0233 0234 0235 0236 0237 0238 0239 023A 023B 023C 023D 023E 023F
0240 0241 0242 0243 0244 0245 0246 0247 0248 0249 024A 024B 024C 024D 024E 024F
0250 0251 0252 0253 0254 0255 0256 0257 0258 0259 025A 025B 025C 025D 025E 025F
0260 0261 0262 0263 0264 0265 0266 0267 0268 0269 026A 026B 026C 026D 026E 026F
0270 0271 0272 0273 0274 0275 0276 0277 0278 0279 027A 027B 027C 027D 027E 027F
0280 0281 0282 0283 0284 0285 0286 0287 0288 0289 028A 028B 028C 028D 028E 028F
0290 0291 0292 0293 0294 0295 0296 0297 0298 0299 029A 029B 029C 029D 029E 029F
02A0 02A1 02A2 02A3 02A4 02A5 02A6 02A7 02A8 02A9 02AA 02AB 02AC 02AD 02AE 02AF
02B0 02B1 02B2 02B3 02B4 02B5 02B6 02B7 02B8 02B9 02BA 02BB 02BC 02BD 02BE 02BF
02C0 02C1 02C2 02C3 02C4 02C5 02C6 02C7 02C8 02C9 02CA 02CB 02CC 02CD 02CE 02CF
02D0 02D1 02D2 02D3 02D4 02D5 02D6 02D7 02D8 02D9 02DA 02DB 02DC 02DD 02DE 02DF
02E0 02E1 02E2 02E3 02E4 02E5 02E6 02E7 02E8 02E9 02EA 02EB 02EC 02ED 02EE 02EF
02F0 02F1 02F2 02F3 02F4 02F5 02F6 02F7 02F8 02F9 02FA 02FB 02FC 02FD 02FE 02FF

Plane 3 Glyphs

Plane 3 begins with the CJK Unified Ideographs Extension G block, from U+30000 through U+3134A. This includes the highly complex biang Chinese ideograph and taito Japanese ideograph:

GNU Unifont Glyphs
with Page Coverage for Plane 3
(Gray=0%)
0300 0301 0302 0303 0304 0305 0306 0307 0308 0309 030A 030B 030C 030D 030E 030F
0310 0311 0312 0313 0314 0315 0316 0317 0318 0319 031A 031B 031C 031D 031E 031F
0320 0321 0322 0323 0324 0325 0326 0327 0328 0329 032A 032B 032C 032D 032E 032F
0330 0331 0332 0333 0334 0335 0336 0337 0338 0339 033A 033B 033C 033D 033E 033F
0340 0341 0342 0343 0344 0345 0346 0347 0348 0349 034A 034B 034C 034D 034E 034F
0350 0351 0352 0353 0354 0355 0356 0357 0358 0359 035A 035B 035C 035D 035E 035F
0360 0361 0362 0363 0364 0365 0366 0367 0368 0369 036A 036B 036C 036D 036E 036F
0370 0371 0372 0373 0374 0375 0376 0377 0378 0379 037A 037B 037C 037D 037E 037F
0380 0381 0382 0383 0384 0385 0386 0387 0388 0389 038A 038B 038C 038D 038E 038F
0390 0391 0392 0393 0394 0395 0396 0397 0398 0399 039A 039B 039C 039D 039E 039F
03A0 03A1 03A2 03A3 03A4 03A5 03A6 03A7 03A8 03A9 03AA 03AB 03AC 03AD 03AE 03AF
03B0 03B1 03B2 03B3 03B4 03B5 03B6 03B7 03B8 03B9 03BA 03BB 03BC 03BD 03BE 03BF
03C0 03C1 03C2 03C3 03C4 03C5 03C6 03C7 03C8 03C9 03CA 03CB 03CC 03CD 03CE 03CF
03D0 03D1 03D2 03D3 03D4 03D5 03D6 03D7 03D8 03D9 03DA 03DB 03DC 03DD 03DE 03DF
03E0 03E1 03E2 03E3 03E4 03E5 03E6 03E7 03E8 03E9 03EA 03EB 03EC 03ED 03EE 03EF
03F0 03F1 03F2 03F3 03F4 03F5 03F6 03F7 03F8 03F9 03FA 03FB 03FC 03FD 03FE 03FF

Plane 14 Glyphs

This table links to the two ranges of 256 assigned code points in Plane 14 (Tags and Variation Selector Supplement) that appear in the unifont_upper OpenType and TrueType font files.

GNU Unifont Glyphs
Unicode Plane 14
0E00 0E01

Plane 0 and Plane 15 Private Use Area Glyphs

Finally, this last glyph table shows ConScript Unicode Registry (CSUR) and Under CSUR glyphs that appear in the unifont_csur OpenType and TrueType font files. Not all of the Plane 0 CSUR and UCSUR scripts have been drawn, but given the esoteric nature of some CSUR and UCSUR scripts (including the unavailability of glyph samples for many of the more obscure constructed scripts), the boxes in the table all have a green background color even if not at 100% coverage.

GNU Unifont Glyphs
Private Use Area, Planes 0 and 15 — ConScript Unicode Registry
E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF
F0 F1 F2 F3 F4 F5 F6 F7 F8 Unicode Assigned Code Points
0F00 0F01 0F02 0F03 0F04 0F05 0F06 0F07 0F08 0F09 0F0A 0F0B 0F0C 0F0D 0F0E 0F0F
0F10 0F11 0F12 0F13 0F14 0F15 0F16 0F17 0F18 0F19 0F1A 0F1B 0F1C 0F1D 0F1E 0F1F
0F20 0F21 0F22 0F23 0F24 0F25 0F26 0F27 0F28 0F29 0F2A 0F2B 0F2C 0F2D 0F2E 0F2F

Contributing Glyphs

16 by 16 pixel sample grid

If you would like to contribute glyphs to the GNU Unifont effort, you can download the associated PNG file from the tables above (SMP and CSUR need additions). Then draw new glyphs in the 16-by-16 pixel area that is inside the inner box you see in the image on the left.

When done, erase the surrounding inner box and ruler lines around the inner box. You can then save the file as a monochrome bitmap image. Then convert the .png file into a .hex file with the unipng2hex utility in the source tarball. Or you can just email the .png file to me as a contribution to this effort and I will do the conversion.

Q: Why is the outer grid so much larger than the 16-by-16 pixel inner box?

A: Because in a future version, unipng2hex, unihex2png, and other utilities should be able to handle larger glyphs.

The table below shows the current state of completion of the Supplementary Multilingual Plane (Plane 1). Any range in the table that doesn't have a green background has missing glyphs. To see which scripts are in a particular range, consult the "Supplementary Multilingual Plane" list in the Current Coverage section below. The more red a range appears in the table below, the more glyphs are missing from that range.

Current Coverage

Links in this section reference the first block of 256 glyphs where a script begins.

The list below shows the scripts that are in the Unicode Basic Multilingual Plane, with coverage in this release of Unifont.

      Covered      Range       Script
      -------      -----       ------
       100.0%  U+0000..U+007F  C0 Controls and Basic Latin
       100.0%  U+0080..U+00FF  C1 Controls and Latin-1 Supplement
       100.0%  U+0100..U+017F  Latin Extended-A
       100.0%  U+0180..U+024F  Latin Extended-B
       100.0%  U+0250..U+02AF  IPA Extensions
       100.0%  U+02B0..U+02FF  Spacing Modifier Letters
       100.0%  U+0300..U+036F  Combining Diacritical Marks
       100.0%  U+0370..U+03FF  Greek and Coptic
       100.0%  U+0400..U+04FF  Cyrillic
       100.0%  U+0500..U+052F  Cyrillic Supplement
       100.0%  U+0530..U+058F  Armenian
       100.0%  U+0590..U+05FF  Hebrew
       100.0%  U+0600..U+06FF  Arabic
       100.0%  U+0700..U+074F  Syriac
       100.0%  U+0750..U+077F  Arabic Supplement
       100.0%  U+0780..U+07BF  Thaana
       100.0%  U+07C0..U+07FF  N'Ko
       100.0%  U+0800..U+083F  Samaritan
       100.0%  U+0840..U+085F  Mandaic
       100.0%  U+0860..U+086F  Syriac Supplement
       100.0%  U+0870..U+089F  Arabic Extended-B
       100.0%  U+08A0..U+08FF  Arabic Extended-A
       100.0%  U+0900..U+097F  Devanagari
       100.0%  U+0980..U+09FF  Bengali
       100.0%  U+0A00..U+0A7F  Gurmukhi
       100.0%  U+0A80..U+0AFF  Gujarati
       100.0%  U+0B00..U+0B7F  Oriya
       100.0%  U+0B80..U+0BFF  Tamil
       100.0%  U+0C00..U+0C7F  Telugu
       100.0%  U+0C80..U+0CFF  Kannada
       100.0%  U+0D00..U+0D7F  Malayalam
       100.0%  U+0D80..U+0DFF  Sinhala
       100.0%  U+0E00..U+0E7F  Thai
       100.0%  U+0E80..U+0EFF  Lao
       100.0%  U+0F00..U+0FFF  Tibetan
       100.0%  U+1000..U+109F  Myanmar
       100.0%  U+10A0..U+10FF  Georgian
       100.0%  U+1100..U+11FF  Hangul Jamo
       100.0%  U+1200..U+137F  Ethiopic
       100.0%  U+1380..U+139F  Ethiopic Supplement
       100.0%  U+13A0..U+13FF  Cherokee
       100.0%  U+1400..U+167F  Unified Canadian Aboriginal Syllabics
       100.0%  U+1680..U+169F  Ogham
       100.0%  U+16A0..U+16FF  Runic
       100.0%  U+1700..U+171F  Tagalog
       100.0%  U+1720..U+173F  Hanunoo
       100.0%  U+1740..U+175F  Buhid
       100.0%  U+1760..U+177F  Tagbanwa
       100.0%  U+1780..U+17FF  Khmer
       100.0%  U+1800..U+18AF  Mongolian
       100.0%  U+18B0..U+18FF  Unified Canadian Aboriginal Syllabics Extended
       100.0%  U+1900..U+194F  Limbu
       100.0%  U+1950..U+197F  Tai Le
       100.0%  U+1980..U+19DF  New Tai Lue
       100.0%  U+19E0..U+19FF  Khmer Symbols
       100.0%  U+1A00..U+1A1F  Buginese
       100.0%  U+1A20..U+1AAF  Tai Tham
       100.0%  U+1AB0..U+1AFF  Combining Diacritical Marks Extended
       100.0%  U+1B00..U+1B7F  Balinese
       100.0%  U+1B80..U+1BBF  Sundanese
       100.0%  U+1BC0..U+1BFF  Batak
       100.0%  U+1C00..U+1C4F  Lepcha
       100.0%  U+1C50..U+1C7F  Ol Chiki
       100.0%  U+1C80..U+1C8F  Cyrillic Extended-C
       100.0%  U+1C90..U+1CBF  Georgian Extended
       100.0%  U+1CC0..U+1CCF  Sundanese Supplement
       100.0%  U+1CD0..U+1CFF  Vedic Extensions
       100.0%  U+1D00..U+1D7F  Phonetic Extensions
       100.0%  U+1D80..U+1DBF  Phonetic Extensions Supplement
       100.0%  U+1DC0..U+1DFF  Combining Diacritical Marks Supplement
       100.0%  U+1E00..U+1EFF  Latin Extended Additional
       100.0%  U+1F00..U+1FFF  Greek Extended
       100.0%  U+2000..U+206F  General Punctuation
       100.0%  U+2070..U+209F  Superscripts and Subscripts
       100.0%  U+20A0..U+20CF  Currency Symbols
       100.0%  U+20D0..U+20FF  Combining Diacritical Marks for Symbols
       100.0%  U+2100..U+214F  Letterlike Symbols
       100.0%  U+2150..U+218F  Number Forms
       100.0%  U+2190..U+21FF  Arrows
       100.0%  U+2200..U+22FF  Mathematical Operators
       100.0%  U+2300..U+23FF  Miscellaneous Technical
       100.0%  U+2400..U+243F  Control Pictures
       100.0%  U+2440..U+245F  Optical Character Recognition
       100.0%  U+2460..U+24FF  Enclosed Alphanumerics
       100.0%  U+2500..U+257F  Box Drawing
       100.0%  U+2580..U+259F  Block Elements
       100.0%  U+25A0..U+25FF  Geometric Shapes
       100.0%  U+2600..U+26FF  Miscellaneous Symbols
       100.0%  U+2700..U+27BF  Dingbats
       100.0%  U+27C0..U+27EF  Miscellaneous Mathematical Symbols-A
       100.0%  U+27F0..U+27FF  Supplemental Arrows-A
       100.0%  U+2800..U+28FF  Braille Patterns
       100.0%  U+2900..U+297F  Supplemental Arrows-B
       100.0%  U+2980..U+29FF  Miscellaneous Mathematical Symbols-B
       100.0%  U+2A00..U+2AFF  Supplemental Mathematical Operators
       100.0%  U+2B00..U+2BFF  Miscellaneous Symbols and Arrows
       100.0%  U+2C00..U+2C5F  Glagolitic
       100.0%  U+2C60..U+2C7F  Latin Extended-C
       100.0%  U+2C80..U+2CFF  Coptic
       100.0%  U+2D00..U+2D2F  Georgian Supplement
       100.0%  U+2D30..U+2D7F  Tifinagh
       100.0%  U+2D80..U+2DDF  Ethiopic Extended
       100.0%  U+2DE0..U+2DFF  Cyrillic Extended-A
       100.0%  U+2E00..U+2E7F  Supplemental Punctuation
       100.0%  U+2E80..U+2EFF  CJK Radicals Supplement
       100.0%  U+2F00..U+2FDF  Kangxi Radicals
       100.0%  U+2FE0..U+2FEF  Unassigned
       100.0%  U+2FF0..U+2FFF  Ideographic Description Characters
       100.0%  U+3000..U+303F  CJK Symbols and Punctuation
       100.0%  U+3040..U+309F  Hiragana
       100.0%  U+30A0..U+30FF  Katakana
       100.0%  U+3100..U+312F  Bopomofo
       100.0%  U+3130..U+318F  Hangul Compatibility Jamo
       100.0%  U+3190..U+319F  Kanbun
       100.0%  U+31A0..U+31BF  Bopomofo Extended
       100.0%  U+31C0..U+31EF  CJK Strokes
       100.0%  U+31F0..U+31FF  Katakana Phonetic Extensions
       100.0%  U+3200..U+32FF  Enclosed CJK Letters and Months
       100.0%  U+3300..U+33FF  CJK Compatibility
       100.0%  U+3400..U+4DBF  CJK Unified Ideographs Extension A
       100.0%  U+4DC0..U+4DFF  Yijing Hexagram Symbols
       100.0%  U+4E00..U+9FFF  CJK Unified Ideographs
       100.0%  U+A000..U+A48F  Yi Syllables
       100.0%  U+A490..U+A4CF  Yi Radicals
       100.0%  U+A4D0..U+A4FF  Lisu
       100.0%  U+A500..U+A63F  Vai
       100.0%  U+A640..U+A69F  Cyrillic Extended-B
       100.0%  U+A6A0..U+A6FF  Bamum
       100.0%  U+A700..U+A71F  Modifier Tone Letters
       100.0%  U+A720..U+A7FF  Latin Extended-D
       100.0%  U+A800..U+A82F  Syloti Nagri
       100.0%  U+A830..U+A83F  Common Indic Number Forms
       100.0%  U+A840..U+A87F  Phags-pa
       100.0%  U+A880..U+A8DF  Saurashtra
       100.0%  U+A8E0..U+A8FF  Devanagari Extended
       100.0%  U+A900..U+A92F  Kayah Li
       100.0%  U+A930..U+A95F  Rejang
       100.0%  U+A960..U+A97F  Hangul Jamo Extended-A
       100.0%  U+A980..U+A9DF  Javanese
       100.0%  U+A9E0..U+A9FF  Myanmar Extended-B
       100.0%  U+AA00..U+AA5F  Cham
       100.0%  U+AA60..U+AA7F  Myanmar Extended-A
       100.0%  U+AA80..U+AADF  Tai Viet
       100.0%  U+AAE0..U+AAFF  Meetei Mayek Extensions
       100.0%  U+AB00..U+AB2F  Ethiopic Extended-A
       100.0%  U+AB30..U+AB6F  Latin Extended-E
       100.0%  U+AB70..U+ABBF  Cherokee Supplement
       100.0%  U+ABC0..U+ABFF  Meetei Mayek
       100.0%  U+AC00..U+D7AF  Hangul Syllables
       100.0%  U+D7B0..U+D7FF  Hangul Jamo Extended-B
         0.0%  U+D800..U+DFFF  Surrogate Pairs - Not Used
         0.0%  U+E000..U+F8FF  Private Use Area - drawn but not included
       100.0%  U+F900..U+FAFF  CJK Compatibility Ideographs
       100.0%  U+FB00..U+FB4F  Alphabetic Presentation Forms
       100.0%  U+FB50..U+FDFF  Arabic Presentation Forms-A
       100.0%  U+FE00..U+FE0F  Variation Selectors
       100.0%  U+FE10..U+FE1F  Vertical Forms
       100.0%  U+FE20..U+FE2F  Combining Half Marks
       100.0%  U+FE30..U+FE4F  CJK Compatibility Forms
       100.0%  U+FE50..U+FE6F  Small Form Variants
       100.0%  U+FE70..U+FEFF  Arabic Presentation Forms-B
       100.0%  U+FF00..U+FFEF  Halfwidth and Fullwidth Forms
       100.0%  U+FFF0..U+FFFF  Specials
    

The list below shows the scripts that are in the Unicode Supplementary Multilingual Plane, with coverage in this release of Unifont. Scripts labeled "(Pending)" are being drawn currently.

      Covered        Range         Script
      -------        -----         ------
       100.0%  U+010000..U+01007F  Linear B Syllabary
       100.0%  U+010080..U+0100FF  Linear B Ideograms
       100.0%  U+010100..U+01013F  Aegean Numbers
       100.0%  U+010140..U+01018F  Ancient Greek Numbers
       100.0%  U+010190..U+0101CF  Ancient Symbols
       100.0%  U+0101D0..U+0101FF  Phaistos Disc
       100.0%  U+010280..U+01029F  Lycian
       100.0%  U+0102A0..U+0102DF  Carian
       100.0%  U+0102E0..U+0102FF  Coptic Epact Numbers
       100.0%  U+010300..U+01032F  Old Italic
       100.0%  U+010330..U+01034F  Gothic
       100.0%  U+010350..U+01037F  Old Permic
       100.0%  U+010380..U+01039F  Ugaritic
       100.0%  U+0103A0..U+0103DF  Old Persian
       100.0%  U+010400..U+01044F  Deseret
       100.0%  U+010450..U+01047F  Shavian
       100.0%  U+010480..U+0104AF  Osmanya
       100.0%  U+0104B0..U+0104FF  Osage
       100.0%  U+010500..U+01052F  Elbasan
       100.0%  U+010530..U+01056F  Caucasian Albanian
       100.0%  U+010570..U+0105BF  Vithkuqi
       100.0%  U+0105C0..U+0105FF  Todhri
       100.0%  U+010600..U+01077F  Linear A
       100.0%  U+010780..U+0107BF  Latin Extended-F
       100.0%  U+010800..U+01083F  Cypriot Syllabary
       100.0%  U+010840..U+01085F  Imperial Aramaic
       100.0%  U+010860..U+01087F  Palmyrene
       100.0%  U+010880..U+0108AF  Nabataean
       100.0%  U+0108E0..U+0108FF  Hatran
        100.0%  U+010900..U+01091F  Phoenician
       100.0%  U+010920..U+01093F  Lydian
       100.0%  U+010940..U+01095F  Sidetic
       100.0%  U+010980..U+01099F  Meroitic Hieroglyphs
       100.0%  U+0109A0..U+0109FF  Meroitic Cursive
       100.0%  U+010A00..U+010A5F  Kharoshthi
       100.0%  U+010A60..U+010A7F  Old South Arabian
       100.0%  U+010A80..U+010A9F  Old North Arabian
       100.0%  U+010AC0..U+010AFF  Manichaean
       100.0%  U+010B00..U+010B3F  Avestan
       100.0%  U+010B40..U+010B5F  Inscriptional Parthian
       100.0%  U+010B60..U+010B7F  Inscriptional Pahlavi
       100.0%  U+010B80..U+010BAF  Psalter Pahlavi
       100.0%  U+010C00..U+010C4F  Old Turkic
       100.0%  U+010C80..U+010CFF  Old Hungarian
       100.0%  U+010D00..U+010D3F  Hanifi Rohingya
       100.0%  U+010D40..U+010D8F  Garay
       100.0%  U+010E60..U+010E7F  Rumi Numeral Symbols
       100.0%  U+010E80..U+010EBF  Yezidi
       100.0%  U+010EC0..U+010EFF  Arabic Extended-C
       100.0%  U+010F00..U+010F2F  Old Sogdian
       100.0%  U+010F30..U+010F6F  Sogdian
       100.0%  U+010F70..U+010FAF  Old Uyghur
       100.0%  U+010FB0..U+010FDF  Chorasmian
       100.0%  U+010FE0..U+010FFF  Elymaic
       100.0%  U+011000..U+01107F  Brahmi
       100.0%  U+011080..U+0110CF  Kaithi
       100.0%  U+0110D0..U+0110FF  Sora Sompeng
       100.0%  U+011100..U+01114F  Chakma
       100.0%  U+011150..U+01117F  Mahajani
       100.0%  U+011180..U+0111DF  Sharada
       100.0%  U+0111E0..U+0111FF  Sinhala Archaic Numbers
       100.0%  U+011200..U+01124F  Khojki
       100.0%  U+011280..U+0112AF  Multani
       100.0%  U+0112B0..U+0112FF  Khudawadi
       100.0%  U+011300..U+01137F  Grantha
       100.0%  U+011380..U+0113FF  Tulu-Tigalari
       100.0%  U+011400..U+01147F  Newa
       100.0%  U+011480..U+0114DF  Tirhuta
       100.0%  U+011580..U+0115FF  Siddham
       100.0%  U+011600..U+01165F  Modi
       100.0%  U+011660..U+01167F  Mongolian Supplement
       100.0%  U+011680..U+0116CF  Takri
       100.0%  U+0116D0..U+0116FF  Myanmar Extended-C
       100.0%  U+011700..U+01174F  Ahom
       100.0%  U+011800..U+01184F  Dogra
       100.0%  U+0118A0..U+0118FF  Warang Citi
       100.0%  U+011900..U+01195F  Dives Akuru
       100.0%  U+0119A0..U+0119FF  Nandinagari
       100.0%  U+011A00..U+011A4F  Zanabazar Square
       100.0%  U+011A50..U+011AAF  Soyombo
       100.0%  U+011AB0..U+011ABF  Unified Canadian Aboriginal Syllabics Extended-A
       100.0%  U+011AC0..U+011AFF  Pau Cin Hau
       100.0%  U+011B60..U+011B7F  Sharada Supplement
       100.0%  U+011BC0..U+011BFF  Sunuwar
       100.0%  U+011C00..U+011C6F  Bhaiksuki
       100.0%  U+011C70..U+011CBF  Marchen
       100.0%  U+011D00..U+011D5F  Masaram Gondi
       100.0%  U+011D60..U+011DAF  Gunjala Gondi
       100.0%  U+011DB0..U+011DEF  Tolong Siki
       100.0%  U+011EE0..U+011EFF  Makasar
       100.0%  U+011F00..U+011F5F  Kawi
       100.0%  U+011FC0..U+011FFF  Tamil Supplement
         0.0%  U+012000..U+0123FF  Cuneiform*
         0.0%  U+012400..U+01247F  Cuneiform Numbers and Punctuation*
         0.0%  U+012480..U+01254F  Early Dynastic Cuneiform*
       100.0%  U+012F90..U+012FFF  Cypro-Minoan
         0.0%  U+013000..U+01342F  Egyptian Hieroglyphs*
       100.0%  U+013430..U+01345F  Egyptian Hieroglyph Format Controls
          0.0%  U+013460..U+0143FF  Egyptian Hieroglyphs Extended-A
         0.0%  U+014400..U+01467F  Anatolian Hieroglyphs*
       100.0%  U+016100..U+01613F  Gurung Khema
         0.0%  U+016800..U+0168BF  Bamum Supplement*
       100.0%  U+016A40..U+016A6F  Mro
       100.0%  U+016A70..U+016ACF  Tangsa
       100.0%  U+016AD0..U+016AFF  Bassa Vah
        100.0%  U+016B00..U+016B8F  Pahawh Hmong
        100.0%  U+016D40..U+016D7F  Kirat Rai
       100.0%  U+016E40..U+016E9F  Medefaidrin
       100.0%  U+016EA0..U+016EDF  Beria Erfe
       100.0%  U+016F00..U+016F9F  Miao
       100.0%  U+016FE0..U+016FFF  Ideographic Symbols and Punctuation
         0.0%  U+017000..U+0187FF  Tangut
         0.0%  U+018800..U+018AFF  Tangut Components
       100.0%  U+018B00..U+018CFF  Khitan Small Script
         0.0%  U+018D00..U+018D7F  Tangut Supplement
         0.0%  U+018D80..U+018DFF  Tangut Components Supplement
       100.0%  U+01AFF0..U+01AFFF  Kana Extended-B
       100.0%  U+01B000..U+01B0FF  Kana Supplement
       100.0%  U+01B100..U+01B12F  Kana Extended-A
       100.0%  U+01B130..U+01B16F  Small Kana Extension
       100.0%  U+01B170..U+01B2FF  Nushu
       100.0%  U+01BC00..U+01BC9F  Duployan
       100.0%  U+01BCA0..U+01BCAF  Shorthand Format Controls
        100.0%  U+01CC00..U+01CEBF  Symbols for Legacy Computing Supplement
       100.0%  U+01CEC0..U+01CEFF  Miscellaneous Symbols Supplement
       100.0%  U+01CF00..U+01CFCF  Znamenny Musical Notation
       100.0%  U+01D000..U+01D0FF  Byzantine Musical Symbols
       100.0%  U+01D100..U+01D1FF  Musical Symbols
       100.0%  U+01D200..U+01D24F  Ancient Greek Musical Notation
       100.0%  U+01D2E0..U+01D2FF  Mayan Numerals
       100.0%  U+01D300..U+01D35F  Tai Xuan Jing Symbols
       100.0%  U+01D360..U+01D37F  Counting Rod Numerals
       100.0%  U+01D400..U+01D7FF  Mathematical Alphanumeric Symbols
       100.0%  U+01D800..U+01DAAF  Sutton SignWriting
       100.0%  U+01DF00..U+01DFFF  Latin Extended-G
       100.0%  U+01E000..U+01E02F  Glagolitic Supplement
       100.0%  U+01E100..U+01E14F  Nyiakeng Puachue Hmong
       100.0%  U+01E290..U+01E2BF  Toto
       100.0%  U+01E2C0..U+01E2FF  Wancho
       100.0%  U+01E5D0..U+01E5FF  Ol Onal
       100.0%  U+01E6C0..U+01E6FF  Tai Yo
       100.0%  U+01E7E0..U+01E7FF  Ethiopic Extended-B
       100.0%  U+01E800..U+01E8DF  Mende Kikakui
       100.0%  U+01E900..U+01E95F  Adlam
       100.0%  U+01EC70..U+01ECBF  Indic Siyaq Numbers
       100.0%  U+01ED00..U+01ED4F  Ottoman Siyaq Numbers
       100.0%  U+01EE00..U+01EEFF  Arabic Mathematical Alphabetic Symbols
       100.0%  U+01F000..U+01F02F  Mahjong Tiles
       100.0%  U+01F030..U+01F09F  Domino Tiles
       100.0%  U+01F0A0..U+01F0FF  Playing Cards
       100.0%  U+01F100..U+01F1FF  Enclosed Alphanumeric Supplement
       100.0%  U+01F200..U+01F2FF  Enclosed Ideographic Supplement
       100.0%  U+01F300..U+01F5FF  Miscellaneous Symbols and Pictographs
       100.0%  U+01F600..U+01F64F  Emoticons
       100.0%  U+01F650..U+01F67F  Ornamental Dingbats
       100.0%  U+01F680..U+01F6FF  Transport and Map Symbols
       100.0%  U+01F700..U+01F77F  Alchemical Symbols
       100.0%  U+01F780..U+01F7FF  Geometric Shapes Extended
       100.0%  U+01F800..U+01F8FF  Supplemental Arrows-C
       100.0%  U+01F900..U+01F9FF  Supplemental Symbols and Pictographs
       100.0%  U+01FA00..U+01FA6F  Chess Symbols
       100.0%  U+01FA70..U+01FAFF  Symbols and Pictographs Extended-A
       100.0%  U+01FB00..U+01FBFF  Symbols for Legacy Computing
    

*Note: Scripts such as Cuneiform, Egyptian Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16 pixel grid. There are plans to draw these scripts on a 32-by-32 pixel grid in the future.

Plane 14 has two scripts, both of which Unifont covers:

GNU Unifont Glyphs
Plane 14
Range Script
U+0E0000..U+0E007F Tags
U+0E0100..U+0E01EF Variation Selectors Supplement

The list below shows the scripts that are in Michael Everson's ConScript Unicode Registry (CSUR) and Rebecca Bettencourt's Under-CSUR that have coverage in this release of Unifont:

GNU Unifont Glyphs
Private Use Area, Planes 0 and 15 — ConScript Unicode Registry
Range Script
U+E000..U+E07F Tengwar
U+E080..U+E0FF Cirth
U+E100..U+E14F Engsvanyáli
U+E150..U+E1AF Kinya
U+E1B0..U+E1CF Ilianóre
U+E1D0..U+E1FF Syai
U+E200..U+E26F Verdurian
U+E280..U+E29F aUI
U+E2A0..U+E2CF Amman-iar
U+E2D0..U+E2FF Xaîni
U+E300..U+E33F Mizarian
U+E340..U+E35F Zíirí:nka
U+E3B0..U+E3FF Olaetyan
U+E400..U+E42F Nísklôz
U+E430..U+E44F Kazat ʔAkkorou
U+E450..U+E46F Kazvarad
U+E470..U+E48F Zarkhánd
U+E490..U+E4BF Røzhxh
U+E4C0..U+E4EF Serivelna [Not Drawn]
U+E4F0..U+E4FF Kelwathi
U+E500..U+E51F Saklor
U+E520..U+E54F Rynnan
U+E550..U+E57F Alzetjan
U+E580..U+E59F Telarasso
U+E5A0..U+E5BF Ssûraki [Not Drawn]
U+E5C0..U+E5DF Gargoyle
U+E5E0..U+E5FF Ophidian
U+E630..U+E64F Seussian Latin Extensions
U+E650..U+E67F Sylabica
U+E680..U+E6CF Ewellic
U+E6D0..U+E6EF Amlin
U+E6F0..U+E6FF Unifon Extended
U+E740..U+E76F Unifon
U+E770..U+E77F Solresol
U+E780..U+E7FF Visible Speech
U+E800..U+E82F Monofon
U+E830..U+E88F D'ni
U+E890..U+E8DF Aurebesh
U+E8E0..U+E8FF Tonal
U+E900..U+E97F Glaitha-A
U+E980..U+E9FF Glaitha-B
U+EAA0..U+EAFF Wanya
U+EB00..U+EB3F Orokin
U+EB40..U+EB5F Standard Galactic
U+EB60..U+EB9F Braille Extended
U+EBA0..U+EBDF Cistercian Numerals
U+EBE0..U+EBEF Boby Lapointe's "bibi-binary" hexadecimal notation
U+EBF0..U+EBFF Bruce Alan Martin's hexadecimal bit location notation
U+EC00..U+EC2F Cylenian
U+EC30..U+EC6F Syrrin
U+EC70..U+ECEF Graflect
U+ECF0..U+ECFF Ronald O. Whitaker's triangular hexadecimal notation
U+ED00..U+ED3F Deini
U+ED40..U+ED5F Niji
U+F4C0..U+F4EF Ath
U+F8A0..U+F8CF Aiha
U+F8D0..U+F8FF Klingon
U+F0000..U+F0E6F Kinya Syllables
U+F0E70..U+F11E7 Pikto
U+F16B0..U+F16DF Derani
U+F1900..U+F19FF Sitelen Pona
U+F1B00..U+F1C3F Shidinn
U+F1C40..U+F1C7F Titi Pula
U+F1C80..U+F1C9F Sitelen Pona Radicals
U+F2000..U+F267F Sadalian
U+F28A0..U+F28DF Zbalermorna

Initially I just posted my additions to Roman Czyborra's original unifont.hex file. Then in mid-January 2008, his website went down. So I started posting font updates here. Roman has encouraged me to continue with my additions.

Roman's website is now back online, and you can read his Unifont description and motivation for its creation on his website, along with his archive of Unifont's changes: http://czyborra.com/unifont .

TrueType Font Generation

Luis Alejandro González Miranda wrote a cool combination of scripts to convert GNU Unifont from .hex format into FontForge .sfd format, then to have FontForge convert this to a TrueType outline font (see the Unicode Utilities web page on this site for more information). Pixels are drawn as outlined squares, so they scale to all point sizes. This works well with GNOME; I haven't tried it with any other Unix windowing environment. I've removed the OpenType SBIT font link from this page because the outline font is much more flexible.

Luis has given me permission to modify his scripts to convert the latest GNU Unifont versions to TrueType. I've modified his original scripts to handle Unicode combining characters.
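For readers unfamiliar with the .hex source format that those scripts consume, the minimal sketch below decodes one line of a unifont.hex file into a pixel grid. It assumes the standard GNU Unifont convention of "CODEPOINT:BITMAP" lines, with 32 hex digits for an 8x16 glyph and 64 for a 16x16 glyph; the sample glyph data is illustrative and may not match the current release byte for byte, and this is not part of the actual FontForge pipeline.

# Minimal sketch: decode one unifont.hex line into a 16-row pixel grid.
# Assumes the standard "CODEPOINT:BITMAP" convention (32 hex digits for an
# 8x16 glyph, 64 for a 16x16 glyph); not part of the real conversion scripts.
def render_hex_glyph(line: str) -> str:
    codepoint, bitmap = line.strip().split(":")
    digits_per_row = len(bitmap) // 16      # 2 for 8-wide glyphs, 4 for 16-wide
    width = digits_per_row * 4              # each hex digit encodes four pixels
    rows = []
    for i in range(16):
        bits = int(bitmap[i * digits_per_row:(i + 1) * digits_per_row], 16)
        rows.append(format(bits, f"0{width}b").replace("0", ".").replace("1", "#"))
    return f"U+{codepoint}\n" + "\n".join(rows)

# An 'A'-like 8x16 glyph in .hex form (illustrative sample data).
print(render_hex_glyph("0041:0000000018242442427E424242420000"))

The TrueType conversion walks the same grid and emits an outlined square for every set bit, which is why the glyphs stay faithful at any point size.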

JIS X 0213 Kanji

Jiskan16

Unifont 12.1.02 added Japanese BDF and TrueType versions, unifont_jp . This replaced over 10,000 ideographs in the default Unifont font with Japanese kanji from the 16 × 16 pixel Jiskan 16 font. The font is available in two files, corresponding to the two planes in JIS X 0213. Both files are in the public domain.

The comments in the BDF source font files (downloadable from the Japanese Fonts page) credit the following contributors (in order): Toshiyuki Imamura, HANATAKA Shinya, Taichi Kawabata, Koichi Yasuoka, TOYOSHIMA Masayuki, Kazuo Koike, and SATO Yasunao.

For the Unifont release, the glyphs from the two JIS X 0213 planes were converted into Unifont .hex files and mapped to code points in Unicode's Plane 0 and Plane 2 for Unifont. The result provides complete representation of the kanji in JIS X 0213 in a free Unicode font.

Izumi16

Unifont 12.1.03 replaced the Jiskan16 glyphs with the public domain Izumi16 glyphs. These provide improvements on the earlier Jiskan16 glyphs.

Wen Quan Yi: Spring of Letters (文泉驛 / 文泉驿)

The original Unifont CJK glyphs were replaced by new CJK glyphs from version 1.1 of Qianqian Fang's Unibit font. The Unibit font began as a combination of the original GNU Unifont glyphs and a basic CJK bitmap font placed in the public domain by the People's Republic of China. It adopted GNU Unifont's scheme of 8x16 and 16x16 glyphs. Qianqian Fang and many others then added about 10,000 more glyphs.

Qianqian states in the Unibit distribution: "The entire CJK Unified Ideographics (U4E00-U9FA5) and CJK Unified Ideographics Extension A(U3400-U4DB5) blocks were replaced by high-quality glyphs from China National Standard GB19966-2005 (public domain)." Wen Quan Yi volunteers then edited thousands of these characters. Qianqian also drew the 22 new CJK ideographs in the range U+9FA6..U+9FBB that appear in GNU Unifont.

Wen Quan Yi (WQY) means "spring of letters," as in a spring of water. This is an interesting choice of words, as the British spelling of "font" is "fount" (but still pronounced "font"). See his website for more details: http://wqy.sourceforge.net/cgi-bin/enindex.cgi , or in Chinese at http://wenq.org/wqy2/index.cgi .

The following code points in the latest unifont.hex file are taken from the WQY Unibit font (with my additions to complete the U+3000..U+33FF range, particularly the missing Hiragana, Katakana, and Kanji), including glyphs updated by the Wen Quan Yi volunteers and other modifications as part of the Unifont font:

  • U+2E80..U+2EFF: CJK Radicals Supplement
  • U+2F00..U+2FDF: Kangxi Radicals
  • U+2FF0..U+2FFF: Ideographic Description Characters
  • U+3000..U+303F: CJK Symbols and Punctuation
  • U+31C0..U+31EF: CJK Strokes
  • U+3200..U+32FF: Enclosed CJK Letters and Months
  • U+3300..U+33FF: CJK Compatibility
  • U+3400..U+4DBF: CJK Unified Ideographs Extension A
  • U+4E00..U+9FBF: CJK Unified Ideographs
  • U+F900..U+FAFF: CJK Compatibility Ideographs
  • U+FF00..U+FF60: Fullwidth Forms of Roman Letters

Qianqian has given his okay to add these CJK glyphs from the Wen Quan Yi project into GNU Unifont. Likewise, I've told him to incorporate any glyphs he wants from my contributions to GNU Unifont into his Unibit font. In October 2020, Qianqian Fang also granted permission to apply the SIL Open Font License version 1.1 to Wen Quan Yi glyphs in Unifont as a dual license.

What's Next?

All of the glyphs in the Supplementary Multilingual Plane that could easily be drawn in a 16-by-16 pixel grid have been drawn as of the Unifont 9.0.01 release. There are no plans to draw Tangut. A number of ConScript Unicode Registry (CSUR) scripts remain to be drawn. If you are interested in contributing glyphs to this effort, please contact me. All new contributions must be licensed under the same license as the rest of Unifont (in a nutshell, GPL 2+ with the GNU font embedding exception and the SIL OFL 1.1).

Thanks to the great work done by contributors in providing ConScript Unicode Registry (CSUR) glyphs, those glyphs are available in font files that have "_csur" in their name.

macOS 26.2 enables fast AI clusters with RDMA over Thunderbolt

Hacker News
developer.apple.com
2025-12-12 20:41:38
Comments...

Pure vs. impure iterators in Go

Lobsters
jub0bs.com
2025-12-12 20:41:16
Comments...

Post-Quantum Cryptography on CHERIoT

Lobsters
cheriot.org
2025-12-12 20:36:11
Comments...
Original Article

When you tell everyone you’re building a secure platform, the first thing that they ask about is encryption. And, in 2025, the hot topic in encryption is algorithms that are safe from hypothetical quantum computers that, unlike real ones, can factorise numbers bigger than 31. These algorithms are referred to as post-quantum cryptography (PQC). Since NIST standardised a few such algorithms, there’s been a lot more interest in seeing them in production, so I spent some time getting the implementations from the Linux Foundation’s PQ Code Package to run on CHERIoT. A lot of companies are building hardware to accelerate these operations, so it seemed useful to have a performance baseline on the CHERIoT Ibex, as well as something that can be used in future CHERIoT-based products.

What are ML-KEM and ML-DSA for?

I am not a mathematician and so I’m not going to try to explain how these algorithms work, but I am going to explain what they’re for .

Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM) is, as the name suggests, an algorithm for key encapsulation. One side holds a public key and uses it (plus some entropy source) to generate a secret in both plain and encapsulated forms. The encapsulated secret can be sent to a remote party who holds the corresponding private key. The receiver can then recover the unencrypted version of the secret (and detect tampering). Now, both parties have the same secret and can use it with some key-derivation function to produce something like an AES key for future communication.

Note that this is somewhat more restrictive than traditional key-exchange protocols. You don’t get to exchange an arbitrary value; the generation step is part of encapsulation. This also means that the secret is a fixed size, defined by the algorithm, which is why you typically feed it into a key-derivation function rather than using it directly.

Module-Lattice Digital Signature Algorithm (ML-DSA) has a similarly informative name. It is intended for providing and validating digital signatures. It takes a private key, an arbitrary-sized document and context, and produces a signature. A holder of the associated public key can then validate that the document matches the version signed with the private key and context.

These are both quite low-level building blocks for higher-level protocols. For example, TLS can use ML-KEM for key exchange and ML-DSA for certificate validation, but also incorporates traditional algorithms in case the PQC algorithms have unexpected weaknesses against classical computers.
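To make those two roles concrete, here is a minimal sketch of both flows in Python. It assumes the liboqs Python bindings (the oqs package) and its mechanism names, which is an assumption on my part; it is not the C code from the PQ Code Package that the CHERIoT port actually builds.

# Illustrative only: assumes the liboqs Python bindings ("oqs" package) and the
# mechanism names "ML-KEM-768" / "ML-DSA-44". The CHERIoT port described here
# uses the C implementations from the PQ Code Package, not this.
import oqs

# ML-KEM: the receiver publishes a public key; the sender encapsulates a fresh
# secret against it; the receiver decapsulates the same secret.
with oqs.KeyEncapsulation("ML-KEM-768") as receiver, \
     oqs.KeyEncapsulation("ML-KEM-768") as sender:
    public_key = receiver.generate_keypair()
    ciphertext, sender_secret = sender.encap_secret(public_key)
    receiver_secret = receiver.decap_secret(ciphertext)
    assert sender_secret == receiver_secret   # feed this into a KDF, not used raw

# ML-DSA: the signer holds the private key; anyone with the public key verifies.
message = b"firmware update 1.2"
with oqs.Signature("ML-DSA-44") as signer, oqs.Signature("ML-DSA-44") as verifier:
    signer_public_key = signer.generate_keypair()
    signature = signer.sign(message)
    assert verifier.verify(message, signature, signer_public_key)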

Initial porting

As is usually the case for CHERIoT, porting the C implementations of ML-KEM and ML-DSA required no code changes. I worked with upstream to slightly simplify the platform-integration layer, so we just provide a single header describing the port. For example, the port header for ML-DSA configures the build to produce ML-DSA44 support, defines custom functions for zeroing memory and getting entropy, and adds the __cheriot_libcall attribute to all exported APIs (so we can build them as shared libraries, rather than embedded in a single compartment). The file for ML-KEM is almost identical.

With these defined, it is possible to build both libraries as CHERIoT shared libraries. This motivated a bit of cleanup. We have a device interface for entropy sources, but it wasn’t implemented on the Sail model (which doesn’t have an entropy source). The interface has a way of exposing the fact that the entropy is insecure, so that wasn’t a problem; it just needed doing, so I refactored all of the insecure entropy-source drivers to use a common base. Most encryption algorithms want an API that fills a buffer with entropy. It’s nice if these don’t all need to touch the driver directly, so I created a compartment that provides this API and exposes it. Now, both libraries are simply consumers of this API. This also makes it easier to add stateful whitening for hardware entropy sources that don’t do the whitening in hardware.

Most CHERIoT stacks are on the order of 1-2 KiB. The PQC algorithms use much more space; more, in fact, than we permitted.

The previous limitation was based on the precision of bounds rounding. A CHERI capability compresses the bounds representation by taking advantage of the fact that, for a pointer to an allocation, there is a lot of redundancy between the address of the pointer, the address of the end of the allocation (the top), and the address of the start of the allocation (the base). The distances from the address to the base and to the top are stored as floating-point values with a shared exponent. In practical terms, this means that the larger an allocation is, the more strongly aligned its start and end addresses must be. The same restrictions apply for any capability that grants access to less than an entire object.

When you call a function in another compartment, the switcher will truncate the stack capability so that the callee sees only the bit of the stack that you weren’t using. The top and base of the stack must be 16-byte aligned (as an ABI requirement), but a very large stack may have hardware requirements for greater alignment and so may require a gap between the bottom of the caller’s stack and the top of the callee’s.

Fortunately, we’d added an instruction precisely for this kind of use case: CSetBoundsRoundDown . This takes a capability and a length and truncates it to at most that length. It was a fairly small tweak to the switcher to make it do this, and a much larger amount of time with SMT solvers to convince ourselves that this was a safe thing to do.

This also exposed a bug in our linker’s handling of the CAPALIGN directive, which rounds a section’s base and size up to the alignment required for it to be representable. This was not working for sections that followed an explicit alignment directive. Our stacks must be both at least 16-byte aligned and representable as capabilities. This is now fixed.

So now we support stacks up to almost 64 KiB, a limitation imposed by the current loader metadata format rather than anything intrinsic to how the system operates after booting. We could easily increase this limit but 64 KiB ought to be enough for anyone.

Performance on CHERIoT Ibex

The repository contains a simple benchmark example that tries each of the operations and reports both the cycle time and stack usage. The output on the CHERIoT Ibex verilator simulation is:

PQC benchmark: Starting: stack used: 224 bytes, cycles elapsed: 41
PQC benchmark: Generated ML-KEM key pair: stack used: 14304 bytes, cycles elapsed: 5143987
PQC benchmark: Encrypted secret pair with ML-KEM: stack used: 17440 bytes, cycles elapsed: 1773235
PQC benchmark: Decrypted secret pair with ML-KEM: stack used: 18464 bytes, cycles elapsed: 2176226
PQC benchmark: Compared results successfully for ML-KEM: stack used: 224 bytes, cycles elapsed: 414
PQC benchmark: Generated ML-DSA key pair: stack used: 46912 bytes, cycles elapsed: 3622132
PQC benchmark: Signed message with ML-DSA: stack used: 60544 bytes, cycles elapsed: 5391177
PQC benchmark: Verified message signature with ML-DSA: stack used: 44672 bytes, cycles elapsed: 3674071
PQC benchmark: Correctly failed to verify message signature with ML-DSA after tampering: stack used: 44672 bytes, cycles elapsed: 3673706

The ML-KEM encrypt (generate shared secret and encrypted version) and decrypt (recover shared secret from encrypted version) operations each use around 18 KiB of stack and run in around two million cycles. CHERIoT Ibex should scale up to 200-300 MHz (though it may be clocked lower for power reasons in some deployments), but even at 100 MHz that’s around 50 encryption or decryption operations per second. Remember that this is an operation that typically happens only when you establish a connection; after that you use a symmetric cypher such as AES with the exchanged key.

The ML-DSA operations are slower and use a lot more stack space (almost 60 KiB for signing!). But, even there, the performance is reasonable, under 4 M cycles. This means that you can do 20 signature-verification operations per second at 100 MHz.
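As a sanity check on those figures, the throughput is just the clock rate divided by the cycle counts from the benchmark output above (assuming the 100 MHz clock used in the text):

# Back-of-the-envelope throughput from the benchmark cycle counts at 100 MHz.
CLOCK_HZ = 100_000_000
cycles = {
    "ML-KEM encrypt": 1_773_235,
    "ML-KEM decrypt": 2_176_226,
    "ML-DSA sign":    5_391_177,
    "ML-DSA verify":  3_674_071,
}
for name, c in cycles.items():
    print(f"{name}: {c / CLOCK_HZ * 1000:.1f} ms, {CLOCK_HZ / c:.0f} ops/sec")
# Roughly 46-56 ops/sec for ML-KEM, about 19/sec for ML-DSA signing and
# 27/sec for verification, in line with the rough 50 and 20 figures quoted here.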

Even using ML-KEM for key exchange and ML-DSA for certificate validation in a TLS flow is unlikely to add more than a few tens of milliseconds to the handshake time, which is perfectly acceptable for the common use case for embedded devices.

In terms of code size, both are small. The ML-KEM implementation is around 12 KiB, the ML-DSA implementation 18 KiB. These both include a SHA3 (FIPS 202) implementation, so there’s scope for code-size reduction on systems that need both, but 30 KiB of code isn’t too bad.

Future plans

The stack usage is very high. Upstream has some plans to allow pluggable allocators, which will allow us to move a lot of this to the heap. This is precisely the kind of use case that CHERIoT’s memory-safe heap is great for: something needs 60 KiB of RAM for 4,000,000 cycles, but then doesn’t need that RAM again for a long time. That memory can then be used for something else, even in a mutually distrusting compartment.

Currently, the library builds are very thin wrappers around the upstream projects. This is great as a building block, but we should make more use of CHERIoT features in the longer term.

Both ML-KEM and ML-DSA depend on SHA3 (FIPS 202). Ideally, we’d factor that out as some common code, rather than carrying a copy in each library. Similarly, the libraries provide an option to plug in your own SHA3 implementation. This is likely to be a common hardware operation even for chips that don’t have full PQC implementations, so we should expose this option in the build system.

Is it secure?

Security always depends on the threat model.

For signature validation, you don’t have any secret data, just a public key, a document, and a signature. The only concerns are whether there are weaknesses in the algorithm, or bugs, that would allow an attacker to substitute a different document for the same signature. CHERIoT prevents memory-safety bugs, so this is concerned solely with logic errors. The upstream code is checked against a set of test vectors that aim to trigger corner cases in the logic of the underlying implementation, so it is hopefully secure in this respect.

For signing or key exchange, you need to worry about the key leaking. On a CHERI system, it’s unlikely to leak explicitly, but may leak via side channels. The security section of the upstream projects discusses a number of techniques that they use to mitigate this kind of attack.

That’s typically sufficient. It’s been recommended practice for embedded devices to have per-device secrets for a long time. This means that leaking a key from one device doesn’t compromise the device class, only that specific device.

For some very high-assurance use cases, that secret may matter and need to be robust against an adversary with physical access to the device. Hardware encryption engines typically care about confidentiality breaches via power side channels and integrity breaches via glitch injection. Power side channels are difficult to mitigate in software: the power requirements of multiplying two numbers together may depend on the number of carry bits set, for example. They’re much easier to mitigate in hardware, by simply doing the same calculation twice in parallel, once with the original inputs and once with the inputs permuted to have the opposite power characteristics.

Glitch injection takes the chip out of its specified power or frequency (or thermal) envelope and attempts to introduce bit flips, which can corrupt state in a way that tampers with signing or leaks a key. These are also effectively impossible to mitigate in software because the software that’s attempting the mitigation is vulnerable to the same glitches. There are some compiler techniques that can make these harder, but they come with a high performance cost.

If power analysis and glitch injection are part of your threat model, the software implementations are not sufficient. In this case you may also need to worry about someone removing the top of the chip and using a scanning-tunnelling electron microscope to read bits from non-volatile memory. This used to require tens of thousands of dollars but is now much cheaper. Devices that need to worry about this often have tiny explosive charges in the package to destroy the chip in cases of tampering. If that’s your threat model, hardware PQC implementations may not be sufficient, at least alone.

But if you care about attackers on the network being unable to compromise the security of the class of devices, even if they have a magical and imaginary quantum computer, then these should be sufficient.

Security issues with electronic invoices

Hacker News
invoice.secvuln.info
2025-12-12 20:28:41
Comments...
Original Article

This page provides supplementary material for a presentation given at the German OWASP Day 2025 ( Presentation Slides ).

Intro

With the eInvoicing Directive (2014/55/EU) , the European Union introduced “standardized” electronic invoices in XML format. Increasingly, institutions and businesses in EU member states will be required to support these electronic invoices.

While machine-readable invoices are, in general, a good idea, there are various issues with the EU’s approach, including needless complexity, a lack of true standardization (multiple syntaxes and various sub-formats), and a tendency to use technologies with inherent security problems.

Due to a combination of unfortunate design decisions, implementing software for electronic invoices is likely to be affected by security flaws if no countermeasures are implemented.

XML Insecurity and XXE

The XML format is known to have inherent security flaws, the most dangerous ones being XXE vulnerabilities (XML eXternal Entity injection).

XXE vulnerabilities often allow the exfiltration of files. While some XML implementations have implemented secure defaults or were never vulnerable to begin with (e.g., Python , libxml2 , .NET , Expat ), others remain insecure by default.

Two notable examples of implementations with insecure defaults are the Java standard library and the Saxon library. Both are commonly used within the electronic invoicing ecosystem.
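To make the attack class concrete, the sketch below shows the shape of an XXE payload against a hypothetical invoice document and what one of the safe-by-default parsers does with it. The markup is illustrative, not real EN16931 syntax.

# Illustrative XXE payload against a made-up invoice document (not EN16931
# markup). A parser with insecure defaults would expand the external entity
# and pull the referenced file into the parsed invoice.
import xml.etree.ElementTree as ET

payload = """<?xml version="1.0"?>
<!DOCTYPE invoice [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<invoice><note>&xxe;</note></invoice>"""

# Python's standard library is one of the safe-by-default implementations
# mentioned above: it refuses to resolve the external entity instead of
# reading the file, so parsing fails rather than leaking data.
try:
    ET.fromstring(payload)
except ET.ParseError as err:
    print("rejected:", err)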

The problem with XSLT 2.0

XSLT is a document transformation language. Only XSLT version 1.0 is widely supported. For XSLT 2.0 and above, only one freely available implementation exists: Saxon.

To check compliance with the EN16931 standards, the EU provides validation artifacts based on Schematron . Those validation artifacts require XSLT 2.0.

Thus, anyone using these validation artifacts will likely use Saxon to implement invoice parsing. Saxon, as mentioned, is vulnerable to XXE by default.

Despite its poor implementation status and the fact that its primary implementation has insecure defaults, XSLT 2.0 (and its successor 3.0) is a W3C recommendation. I raised these concerns with the W3C.

Security test suite

A security test suite for electronic invoices is provided here .

Getting the EN16931 standards

The EU requirements for electronic invoices are standardized by the European Committee for Standardization (CEN) in a set of standards named EN16931. The first two parts are available free of charge. Subsequent parts cost money.

Accessing these standards is surprisingly difficult. A link on the EU web page to CEN is currently broken. CEN does not provide direct downloads of these documents and refers to national standardization organizations. Those often require account registrations even to access the free-of-charge parts of the standard.

The Estonian standardization organization (EVS) provides downloads of parts one and two without registration.

For the parts of EN16931 that are not available free of charge, prices at EVS are cheaper than those at most other national standardization organizations.

XXE vulnerabilities

List of security vulnerabilities discovered in electronic invoicing software during this research:

Product Vuln type Info
kivitendo XXE Reported 2025-03-25, Fixed in 3.9.2beta (2025-03-28) / 3.9.2 (2025-05-05) , Software Stack: Perl/XML::LibXML, CVE-2025-66370
peppol-py Blind XXE Reported: 2025-11-13, fixed in 1.1.1 (2025-11-13) , Software Stack: Python/Saxon, CVE-2025-66371
ZUV * Blind XXE Reported: 2025-11-17, no longer developed according to README, Software Stack: Java/Saxon
papierkram.de E-Rechnung-Viewer XXE Reported: 2025-03-30, fixed: 2025-03-31
EPO E-Invoice Viewer XXE Reported: 2025-10-13, fixed: 2025-10-14
portinvoice XXE Reported: 2025-10-29, fixed: 2025-10-29
xrechnung-erstellen.com E-Rechnung Viewer XXE Reported: 2025-10-14, fixed: 2025-10-16
Belegmeister ZUGFERD VIEWER Blind XXE Reported: 2025-11-15 (only supports PDF upload), fixed: 2025-11-25
E-Rechnungs-Validator by winball.de Blind XXE Reported: 2025-11-17, fixed: 2025-11-19, confirmation
ZUGFeRD Community ZF/FX Invoiceportal Blind XXE Reported: 2025-11-17, no reply, re-tested on 2025-11-25, validation functionality was removed (relied on ZUV)
REDACTED1 XXE Reported: 2025-10-29, no reply, re-tested on 2025-11-18, fix incomplete (see next line)
REDACTED1 Blind XXE Reported: 2025-11-18, no reply, unfixed
REDACTED2 Blind XXE Reported: 2025-11-17, no reply, unfixed

* ZUV is no longer developed, and it is recommended to use Mustang instead. Mustang was also vulnerable to XXE in versions before 2.16.3 ( CVE-2025-66372 ).

More

Questions?

Get in touch!

Text and logo are licensed as CC0 . The logo is a mix of three icons from svgrepo.com, all CC0. The web page uses Pico CSS (MIT license) and Hugo .

Created by Hanno Böck (created: , last update: )

Imprint

New Kindle feature uses AI to answer questions about books

Hacker News
reactormag.com
2025-12-12 20:24:04
Comments...
Original Article

At present, there are multiple cases in which authors are suing AI companies for scraping their works without payment or permission. While these legal battles have been going on, Amazon has quietly added a new AI feature to its Kindle iOS app—a feature that “lets you ask questions about the book you’re reading and receive spoiler-free answers,” according to an Amazon announcement .

The company says the feature, which is called Ask this Book, serves as “your expert reading assistant, instantly answering questions about plot details, character relationships, and thematic elements without disrupting your reading flow.”

Publishing industry resource Publishers Lunch noticed Ask this Book earlier this week, and asked Amazon about it. Amazon spokesperson Ale Iraheta told PubLunch, “The feature uses technology, including AI, to provide instant, spoiler-free answers to customers’ questions about what they’re reading. Ask this Book provides short answers based on factual information about the book which are accessible only to readers who have purchased or borrowed the book and are non-shareable and non-copyable.”

As PubLunch summed up: “In other words, speaking plainly, it’s an in-book chatbot.”

Amazon did not answer PubLunch’s question about what rights the company was relying upon to execute the new feature, “nor did they elaborate on the technical details of the service and any protections involved (whether to prevent against hallucinations, or to protect the text from AI training).”

Perhaps most alarmingly, the Amazon spokesperson said, “To ensure a consistent reading experience, the feature is always on, and there is no option for authors or publishers to opt titles out.”

It also sounds as though authors and publishers were, for the most part, not notified of this feature’s existence.

Amazon is already in the news this week for its flawed AI recaps of television shows. After a Fallout recap was “garbage filled with mistakes,” as io9 called it, the company paused the feature . A similar thing happened earlier this year with Amazon’s AI dubs for anime series .

As PubLunch says of Ask this Book, “Many rightsholders and creators are likely not to want an in-book chatbot without their specific review and approval (or at all), and we expect that message will be getting delivered to publishers and Amazon loud and clear in the ensuing days. And many people would deem the outputs of generative AI analyzing a particular copyrighted work as the very embodiment of a derivative work (or simply a direct infringement).”

Ask this Book is currently only available in the Kindle iOS app in the US, but Amazon says it “will come to Kindle devices and Android OS next year.”

LLM 0.28

Simon Willison
simonwillison.net
2025-12-12 20:20:14
LLM 0.28 I released a new version of my LLM Python library and CLI tool for interacting with Large Language Models. Highlights from the release notes: New OpenAI models: gpt-5.1, gpt-5.1-chat-latest, gpt-5.2 and gpt-5.2-chat-latest. #1300, #1317 When fetching URLs as fragments using llm -f URL, th...
Original Article

LLM 0.28 . I released a new version of my LLM Python library and CLI tool for interacting with Large Language Models. Highlights from the release notes:

  • New OpenAI models: gpt-5.1 , gpt-5.1-chat-latest , gpt-5.2 and gpt-5.2-chat-latest . #1300 , #1317
  • When fetching URLs as fragments using llm -f URL , the request now includes a custom user-agent header: llm/VERSION (https://llm.datasette.io/) . #1309
  • Fixed a bug where fragments were not correctly registered with their source when using llm chat . Thanks, Giuseppe Rota . #1316
  • Fixed some file descriptor leak warnings. Thanks, Eric Bloch . #1313
  • Type annotations for the OpenAI Chat, AsyncChat and Completion execute() methods. Thanks, Arjan Mossel . #1315
  • The project now uses uv and dependency groups for development. See the updated contributing documentation . #1318

That last bullet point about uv relates to the dependency groups pattern I wrote about in a recent TIL . I'm currently working through applying it to my other projects - the net result is that running the test suite is as simple as doing:

git clone https://github.com/simonw/llm
cd llm
uv run pytest

The new dev dependency group defined in pyproject.toml is automatically installed by uv run into a new virtual environment, which means everything needed to run pytest is available without any extra commands.

Lawmakers Pave the Way to Billions in Handouts for Weapons Makers That the Pentagon Itself Opposed

Intercept
theintercept.com
2025-12-12 20:19:44
The pilot program, added to the military budget behind closed doors, upends an 80-year precedent against covering contractors’ interest payments. The post Lawmakers Pave the Way to Billions in Handouts for Weapons Makers That the Pentagon Itself Opposed appeared first on The Intercept....
Original Article

For the better part of a century, there was one thing even the U.S. government would not do to pad the profits of defense contractors.

Now, more than 80 years of precedent may be coming to an end.

On Thursday, lawmakers in the House approved a “pilot program” in the pending Pentagon budget bill that could eventually open the door to sending billions to big contractors, while providing what critics say would be little benefit to the military.

The provision, which appeared in the budget bill after a closed-door session overseen by top lawmakers, would allow contractors to claim reimbursement for the interest they pay on debt they take on to build weapons and other gadgets for the armed services.

“The fact that we are even exploring this question is a little crazy in terms of financial risk.”

The technical-sounding change has such serious implications for the budget that the Pentagon itself warned against it two years ago.

One big defense contractor alone, Lockheed Martin, reported having more than $17.8 billion in outstanding interest payments last year, said Julia Gledhill, an analyst at the nonprofit Stimson Center.

“The fact that we are even exploring this question is a little crazy in terms of financial risk for the government,” Gledhill said.

Gledhill said even some Capitol Hill staffers were “scandalized” to see the provision in the final bill, which will likely be approved by the Senate next week.

Pilot to Where?

For most companies, paying interest on a loan they take out from the bank is a cost of doing business. The pilot program buried in the budget bill, however, is one of many ways in which the federal government would give defense contractors special treatment.

Contractors can already receive reimbursements from the Defense Department for the cost of research and development . Under the terms of the legislation, they would also be allowed to receive reimbursements for “financing costs incurred for a covered activity.”

The legislation leaves it up to the Pentagon to design the program. While it’s billed as a pilot, there is no hard spending cap in the pending legislation. The total amount dedicated to the program would be determined by the House and Senate appropriations committees.

The bill tasks the Defense Department with releasing a report in February 2028 on how well the pilot program worked. As approved by Congress, however, the bill does not explain what metrics, if any, the Pentagon is supposed to use to evaluate the program.

“I don’t see any clear parameters for what success looks like,” Gledhill said. “Are there new entrants? Are we building weapons production capacity? Or are new entrants on the way?”

The chairs and ranking members of the House and Senate armed services committees who oversaw the closed-door conference process that produced the final draft of the National Defense Authorization Act did not respond to requests for comment.

In a document posted online, the committee leaders said that similar provisions were included in House and Senate drafts of the bill.

Big Spending at Stake

The switch to covering financing costs seems to be in line with a larger push this year to shake up the defense industry in light of lessons learned from Russia’s brutal war on Ukraine and fears of competition with China.

“The generous view of this provision is: Look, we have industrial capacity constraints and perhaps if we make borrowing essentially free, then maybe — big maybe — contractors will invest in capacity,” Gledhill said.

She is skeptical that will happen, and the Pentagon itself was dubious in a 2023 study conducted by the Office of the Under Secretary of Defense for Acquisition and Sustainment. The Pentagon found that the policy change might even supercharge the phenomenon of big defense contractors using taxpayer dollars for stock buybacks instead of research and development.

“Higher interest rates or increased borrowing only increase Revenue and Profits further,” the report found. “This creates the real risk of a ‘moral hazard’ as it pertains to interest.”

The sums at stake are enormous. The “five primes” — the big defense contractors who claim the lion’s share of Pentagon contracts — each reported spending massive amounts of money on interest payments last year. The companies all disclose their debt loads in slightly different ways in their annual reports, but the scale is nonetheless massive in each case.

Lockheed Martin said it had $17.8 billion in outstanding interest payments.

RTX, formerly known as Raytheon, said it had $23.3 billion in future interest on long-term debt.

“I don’t think a single dollar should go toward interest payments for contractors.”

Northrop Grumman paid $475 million on interest payments in 2024, and General Dynamics, for its part, paid $385 million.

Meanwhile, Boeing said that it had $38.3 billion in long-term interest on debt. The company did not break down specifically how much of that debt related to its defense business, which accounted for 36.5 percent of its revenue in 2024.

Along with the “five primes,” Silicon Valley firms such as Anduril and Palantir are increasingly moving into defense contracting.

It’s unlikely that the contractors’ interest payments would ever be fully reimbursed by the Defense Department, Gledhill said, but even getting a fraction covered would amount to a huge giveaway.

She said, “I don’t think a single dollar should go toward interest payments for contractors.”

Rats Play Doom

Hacker News
ratsplaydoom.com
2025-12-12 20:15:58
Comments...
Original Article

Intro

We built a complete VR setup from scratch to let rats play DOOM. The system includes a motion-tracked treadmill ball, a panoramic headset, an input trigger, and a reward circuit. All hardware and software components are open sourced, including 3D-printable designs, circuit diagrams, firmware, and control software.

The first version (v1) was built in New York by Viktor , who trained rats to walk through a corridor in DOOM using a simpler rig. That version was featured on Vice and PC Gamer. After moving back home, the project was paused. Public interest reignited development, leading to v2, a more advanced and modular version built in collaboration with electrical engineer Sándor Makra . Akos Blaschek later assisted significantly in documenting the project for open-sourcing, aiming to enable others to replicate and build upon this work. Key metallic components were designed and sourced in collaboration with SZURWIN KFT .

V1

  • Basic ball setup
  • Rats trained to run forward
  • Minimal sensors and mechanics
  • No panoramic screen
Rat VR Setup Version 1
Rat VR Setup Version 1

V2

  • New ball driver mechanism for smoother movement
  • Foldable AMOLED screen with 180° horizontal and 80° vertical FOV, Full HD resolution
  • Upgraded sensors for movement tracking
  • Reinforced feeder system with mixing motor
  • Modular 3D-printable components
  • Improved electronics reliability and safety
Rat VR Setup Version 2
Rat VR Setup Version 2
Full setup from side showing rat on ball, screen around, trigger, and water tube.
Full setup from side showing rat on ball, screen around, trigger, and water tube.

Limitations

We reached the point of rat habituation but didn’t start training. Our rats (Todd, Kojima, Gabe) aged out before full testing. The setup works, but behavioral validation is pending.

Hardware

The hardware is a comprehensive VR rig designed for rodents. It consists of a motion-tracked sphere that captures the rat's movements, a custom-built trigger for in-game actions, a curved panoramic screen for visual immersion, and an automated reward system that dispenses sugar water to reinforce behavior. All these components are mounted on a modular aluminum frame, creating a complete, self-contained environment for the rat to interact with the game.

View the Hardware Assembly Guide

Visual Interface

The headset wraps around the rat’s head with a foldable AMOLED screen. It maximizes immersion without obstructing whisker space. The screen supports Full HD resolution.

The headset frame also integrates several sensory components: two small air nozzles are positioned near the left and right whiskers, capable of delivering targeted air puffs on command (e.g., signaling wall collisions in-game). The frame provides a secure mounting point for the reward system's dispenser tube, placing it near the rat's mouth. Additionally, the design includes placeholders for miniature speakers near each ear, intended for future implementation of stereo audio cues.

Headset close-up from above.
Headset close-up.

3D Model: Headset

Locomotion

Movement is captured via a free-spinning ball under the rat. Rotary sensors track displacement and convert it into game motion. The ball can also be driven by motors.

These motors are used during training to roll the ball and simulate movement paths before a reward. This guides the rat on where to go, helping form movement-action associations. Like the trigger, this allows for programmatic training sequences with minimal initial input from the animal.
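As a rough illustration of how displacement becomes game motion, the sketch below maps readings from two ball sensors onto forward and turning movement. The sensor arrangement (two sensors 90 degrees apart at the ball's equator) and the sign conventions are assumptions for illustration; the project's actual mapping lives in its Python sources.

# Hypothetical mapping from ball-surface displacement to game motion, assuming
# two sensors mounted 90 degrees apart at the ball's equator. Axes and signs
# are illustrative; the real conversion is in the project's repository.
def ball_to_game_motion(front_dx: float, front_dy: float,
                        side_dx: float, side_dy: float) -> tuple[float, float, float]:
    """Return (forward, strafe, turn) increments for the game."""
    forward = -front_dy                  # ball pitching under the rat
    strafe = -side_dy                    # ball rolling sideways
    turn = (front_dx + side_dx) / 2.0    # ball spinning about the vertical axis
    return forward, strafe, turn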

Ball mount showing driven/undriven modes and sensor placement.
Ball mount showing driven/undriven modes and sensor placement.

3D Model: Stand/Ball

Trigger Input

The shooting input is a custom-built hand-operated lever. Rats pull it with their paws to fire. The lever is held in place by small springs, encased in a 3D-printed housing. It includes a rotary encoder to detect motion and a stepper motor to actuate it.

The motor allows programmatic control—pulling the lever to demonstrate shooting. This enables training by pairing visual cues with mechanical motion, reinforcing the association before the rat initiates the action on its own.

Close-up of trigger lever with encoder and motor.
Close-up of trigger lever with encoder and motor.

3D Model: Trigger

Reward System

Positive in-game actions trigger a liquid reward: sugar water delivered through a precise dispensing mechanism. The system consists of:

  • Mixer: Continuously stirs the sugar solution to maintain even concentration
  • Pump + Pressure Sensor: Keeps the line under constant pressure
  • Solenoid Valve: Magnetic valve that opens to release exact 10 µL doses
  • Dispenser: Positioned near the mouth for easy access

This setup ensures accurate, repeatable reward delivery with minimal delay. The reward is synchronized with game events to reinforce desired behaviors.

Reward circuit with labeled mixer, pump, valve, and dispenser.
The messy but functional reward circuit from behind.

Limitations

The current system assumes basic rat mobility and grooming behavior. Fine-tuning might be needed for rats of different sizes or temperaments. Trigger placement and reward tube flow may need calibration per subject.

Software

The setup is controlled through a modular Python system. The main entry point is arena_scenario.py , which runs the full control loop.

The system includes:

  • Motion capture: Reads movement from optical flow sensors mounted around the treadmill ball.
  • Locomotion control: Drives the ball motors to guide the rat during training.
  • Trigger input: Reads lever pulls, detects voluntary shooting actions.
  • Reward delivery: Dispenses precise 10 μL sugar water rewards via a controlled solenoid valve and maintains constant line pressure.
  • DOOM integration: Interfaces with a modified ViZDoom environment for real-time closed-loop behavior.
  • Training logic: Enforces demonstrations and delivers rewards based on game state and rat behavior.

View the Project on GitHub

The software runs on a PC and communicates with a Raspberry Pi via TCP sockets. The Pi handles real-time sensor reading, ball actuation, and reward control; the PC processes the sensor data, runs the game, and sends high-level commands to the Pi.

All major components—movement tracking, ball driving, trigger detection, and reward control—can be operated manually or in closed-loop mode. All control parameters (e.g., motor speeds, reward volumes) are set in Python code.
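To illustrate that split, here is a minimal sketch of what such a PC-to-Pi command link can look like. The host name, port, command names, and JSON framing are hypothetical; the project's actual protocol is defined in its repository.

# Hypothetical sketch of the PC -> Raspberry Pi link described above: the PC
# runs the game and sends high-level commands, the Pi actuates the hardware.
# Host, port, command names, and JSON framing are illustrative assumptions.
import json
import socket

PI_ADDRESS = ("raspberrypi.local", 5555)    # hypothetical host and port

def send_command(command: str, **params) -> dict:
    """Send one JSON command to the Pi and return its JSON reply."""
    with socket.create_connection(PI_ADDRESS, timeout=1.0) as conn:
        conn.sendall(json.dumps({"cmd": command, **params}).encode() + b"\n")
        return json.loads(conn.makefile().readline())

# The kinds of high-level commands the closed-loop code on the PC would issue:
# send_command("drive_ball", direction="forward", speed=0.3)   # guide a training path
# send_command("dispense_reward", volume_ul=10)                # one 10 uL dose
# motion = send_command("read_motion")                         # sensor displacements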

Limitations

There’s no in-built calibration suite. Users must validate sensor alignment and reward timing manually. Some microcontroller firmwares might require tuning based on hardware tolerances.

Results

The rats successfully learned to navigate the virtual environment and trigger the shooting mechanism. Habituation took approximately two weeks per rat. While advanced training wasn't completed due to time constraints, initial data showed promising engagement with the system.

Rat engaging with the VR setup during a session.
Rat interacting with the VR setup.

Limitations

Full behavioral validation requires longer training periods. Cross-subject variability wasn't extensively studied. The impact of prolonged VR exposure on rat well-being needs further research.

What Now?

Interested in building your own animal VR setup? Feel free to reach out for guidance. We're also compiling a comprehensive Rat VR Build Guide .

At YoloRun.Capital , we invest in ambitious, boundary-pushing projects like this, even the beautifully impractical ones. Have a wild idea? Let's talk.

A rat wearing a tiny Santa hat.

Team

Viktor Tóth

Viktor Tóth

Gamer Rat Coach

Sándor Makra

Sándor Makra

Electrical Engineer

Ákos Blaschek

Ákos Blaschek

Documentation Lead

Three new stable kernels

Linux Weekly News
lwn.net
2025-12-12 19:45:30
Greg Kroah-Hartman has released the 6.18.1, 6.17.12, and 6.12.62 stable kernels. Each contains important fixes; users of those kernels are advised to upgrade. ...
Original Article

[Posted December 12, 2025 by jzb]

Greg Kroah-Hartman has released the 6.18.1 , 6.17.12 , and 6.12.62 stable kernels. Each contains important fixes; users of those kernels are advised to upgrade.


The Average Founder Ages 6 Months Each Year

Hacker News
tomtunguz.com
2025-12-12 19:41:46
Comments...
Original Article

No, founders are not adopting Bryan Johnson’s regimen to reverse aging. Quite the opposite: the average founder raising capital ages six months every year. 1

Median founder age trend over time

I suspect founder age has been increasing steadily for three reasons. First, venture capital has shifted toward AI, which grew from roughly 10% to 60% of investment in just three years. 2 AI founders skew older. Many AI labs are started by PhDs who spent extended periods in school & often come from industry, commercializing initiatives from major labs or hyperscalers.

Second, the shift toward B2B rewards experience. B2B founders benefit from established relationships with potential team members, design partners & expertise selling to enterprises. These networks take years to build.

B2B vs B2C Series A capital allocation trend

Third, press coverage distorts perception. Media tends to spotlight younger founders pursuing product-led growth or consumer strategies. The Cursor team , fresh from MIT, captures the zeitgeist. But there are many founders who grow up within an industry & then go out to upend it.

Perhaps venture capitalists should start funding reverse aging programs. If this trend holds, the typical founder will be a decade older in 20 years.

Subscribe

The 1-minute read that turns tech data into strategic advantage. Read by 150k+ founders & operators.

GP at Theory Ventures. Former Google PM. Sharing data-driven insights on AI, web3, & venture capital.

Bloomberg WSJ Economist

Daring Fireball Weekly Sponsorships, End of Year and Q1 2026

Daring Fireball
daringfireball.net
2025-12-12 19:25:23
Weekly sponsorships have been the top source of revenue for Daring Fireball ever since I started selling them back in 2007. They’ve succeeded, I think, because they make everyone happy. They generate good money. There’s only one sponsor per week and the sponsors are always relevant to at least some ...
Original Article
Schedule:
Dec 1 – Dec 7 Sold
Dec 8 – Dec 14 Sold
Dec 15 – Dec 21 Year-end discount: $9,500
Dec 22 – Dec 28 Year-end discount: $9,500
Dec 29 – Jan 4 Year-end discount: $9,500
Jan 5 – Jan 11 Available
Jan 12 – Jan 18 Sold
Jan 19 – Jan 25 Available
Jan 26 – Feb 1 Sold
Feb 2 – Feb 8 Available
Feb 9 – Feb 15 Sold
Feb 16 – Feb 22 Available
Feb 23 – Mar 1 Available
Mar 2 – Mar 8 Sold
Mar 9 – Mar 15 Sold
Mar 16 – Mar 22 Available
Mar 23 – Mar 29 Sold
Mar 30 – Apr 5 Available

Price: $11,000

To schedule a sponsorship or for additional information, email John Gruber.


Week-long sponsorships are available for Daring Fireball. This is the only way to promote your product or service specifically to Daring Fireball’s audience of Mac nerds, designers, nitpickers, perfectionists, and connoisseurs of fine sarcasm.

What sponsors get:

  • A display ad in the sidebar on every page of the site, all week long.

  • A post from the sponsor will appear in the RSS feed at the start of the week. You, the sponsor, get to address Daring Fireball’s most dedicated readers directly.

  • At the end of the week, I’ll also post an item thanking and linking to the feed sponsor.

  • Sponsorship is exclusive. Only one sponsor per week.

  • An archive of all previous DF RSS feed sponsors is available.

About Daring Fireball’s audience:

  • Typical weekday unique visitors: 150,000.

  • Estimated monthly unique visitors: 2.5 million.

  • Estimated Daring Fireball RSS feed subscribers: Over 200,000.

Windows 3.1 'Hot Dog Stand' color scheme true story

Hacker News
www.pcgamer.com
2025-12-12 19:13:35
Comments...
Original Article
Windows 3.1 color schemes
(Image credit: Microsoft)

Every so often, a wonderful thing happens: someone young enough to have missed out on using computers in the early 1990s is introduced to the Windows 3.1 "Hot Dog Stand" color scheme. Back in the day Windows was pretty plain looking out of the box, with grey windows and blue highlights as the default. A number of optional color palettes gave it a bit more pep, like the wine-tinged Bordeaux or the more sophisticated teal of Designer.

And then there was Hot Dog Stand, which more or less turned Windows into a carnival.

"The truly funny thing about this color scheme is that all the other Windows 3.1 color schemes are surprisingly rational, totally reasonable color schemes," tech blogger Jeff Atwood wrote back in 2005 . "And then you get to 'Hot Dog Stand. Which is utterly insane . … I have to think it was included as a joke."

(Image credit: Microsoft)

Did Windows 3.1 really ship with a garish color scheme that was dared into being? That was a story I needed to hear, so I went digging for the credits of the Microsoft employees who worked on the user interface back then and found my way to Virginia Howlett , who joined Microsoft in 1985 as the company's first interface designer, and worked there up through the launch of Windows 95.

Howlett also co-created the font Verdana , which is partially named after her daughter Ana and is up there with Helvetica as one of the most-used fonts of the last 30 years. But enough about her world-changing contributions to modern technology: we're here to talk Hot Dog Stand.

"I confess that I'm surprised anyone cares about Windows 3.1 in late 2025! It was such a long time ago and the world has changed so much," Howlett told me when I reached out over email. She confirmed that she and a "small team of designers" created Windows 3.1's themes, which were a "radically new" feature at the time—prior to its release, you couldn't customize different parts of the OS, like the backgrounds and title bars of windows, with different colors.


Publicity photo from her early years at Microsoft (Image credit: Virginia Howlett)

I asked if the designers at Microsoft really had included Hot Dog Stand as a joke, or if it was inspired by a particular stand they frequented near the corporate campus (hey, it was a longshot, but you never know). I'll let Virginia tell the rest of the story:

As I recall there were 16 colors: white, black, gray, RGB, CMY, and the dark versions of those colors—so dark red, dark green, dark blue, dark cyan, dark magenta, dark yellow, dark gray. (Normal people might call some of these colors teal, navy, burgundy, etc.) Much of the user interface was black lines on a white background and used 2 shades of gray to create 3-D buttons: 'affordances.'

We designed a long list of themes using those 16 colors. No one today seems interested in 'Bordeaux' or 'Tweed' or 'Arizona.' We were covering all the bases, hoping to come up with color schemes that would appeal to a broad range of people. 'Hot Dog Stand' used bright yellow and red.

I have been mystified about why that particular theme causes so much comment in the media. Maybe it's partly the catchy name. (Never underestimate the power of a good brand name!)

I do remember some discussion about whether we should include it, and some snarky laughter. But it was not intended as a joke. It was not inspired by any hot dog stands, and it was not included as an example of a bad interface—although it was one. It was just a garish choice, in case somebody out there liked ugly bright red and yellow.

The 'Fluorescent' theme was also pretty ugly, but it didn't have a catchy name, so I've never heard anything about it.

I'm really glad that 'Hot Dog Stand' has entertained so many people for so many years.

With regards to design historians everywhere,

Virginia Howlett

As delightfully garish as Hot Dog Stand is, Howlett is right that it's far from the only eye searing theme in the Windows 3.1 collection. Check out Fluorescent and Plasma Power Saver:

Windows 3.1 color schemes
(Image credit: Microsoft)

You can play around with Windows 3.1 in your browser thanks to the emulator PCjs Machines ; if you get really into it, you can even customize every color yourself instead of relying on one of the preset themes.

So that's that: Hot Dog Stand may have inadvertently served as a warning to aspiring theme customizers that madness was just a few overzealous color choices away, but that wasn't its original intent. It wasn't included on the floppy disks as a dare, or a joke—it just happened to end up one of the funniest and most memorable relics of Windows history.

Virginia Howlett recently guested on an episode of the design podcast Complementary , so give it a listen if you'd like to hear her talk about interface design and her time at Microsoft.



White House Refuses to Rule Out Summary Executions of People on Its Secret Domestic Terrorist List

Intercept
theintercept.com
2025-12-12 19:02:03
The Trump administration ignored questions about whether it would order the killings of those on its NSPM-7 list — even while answering our other queries. The post White House Refuses to Rule Out Summary Executions of People on Its Secret Domestic Terrorist List appeared first on The Intercept....
Original Article

President Donald Trump has shattered the limits of executive authority by ordering the summary executions of individuals he deems members of designated terrorist organizations. He has also tested the bounds of his presidential powers by creating a secret list of domestic terrorist organizations, established under National Security Presidential Memorandum 7, or NSPM-7 .

Are Americans that the federal government deems to be members of domestic terrorist organizations subject to extrajudicial killings like those it claims are members of designated terrorist organizations? The White House, Justice Department, and Department of War have, for more than a month, failed to answer this question.

Lawmakers and other government officials tell The Intercept that the pregnant silence by the Trump administration has become especially worrisome as the death toll mounts from attacks on alleged members of “designated terrorist organizations” in the Caribbean Sea and Pacific Ocean, and as Trump himself makes ever more unhinged threats to imprison or execute his political adversaries.

In early September, The Intercept revealed that elite Special Operators killed the shipwrecked victims of a September 2 attack on a suspected drug smuggling boat. They have since struck more than 20 other vessels. The administration insists the attacks are permitted because the U.S. is engaged in “non-international armed conflict” with “designated terrorist organizations” it refuses to name. Experts and lawmakers say these killings are outright murders — and that Trump could conceivably use similar lethal force inside the United States.

“The Trump Administration is trying to justify blowing small boats out of the water by arbitrarily calling them ‘designated terrorist organizations’ — a label not grounded in U.S. statute nor international law, but in solely what Trump says,” Sen. Tammy Duckworth, D-Ill., told The Intercept. “If Trump is using this justification to use military force on any individuals he chooses — without verified evidence or legal authorization — what’s stopping him from designating anyone within our own borders in a similar fashion and conducting lethal, militarized attacks against them? This illegal and dangerous misuse of lethal force should worry all Americans, and it can’t be accepted as normal.”

For almost a quarter century, the United States has been killing people — including American citizens, on occasion — around the world with drone strikes. Beginning as post-9/11 counterterrorism operations, these targeted killings in Afghanistan, Iraq, Somalia, Yemen, and other nations relied on a flimsy legal rationale that consistently eroded respect for international law. Details of these operations were kept secret from the American people, and civilian casualties were ignored, denied, and covered up. The recent attacks on alleged drug boats lack even the rickety legal rationale of the drone wars, sparking fear that there is little to stop the U.S. government from taking the unprecedented step of military action against those it deems terrorists within the nation’s borders.

The military has carried out 22 known attacks in the Caribbean Sea and eastern Pacific Ocean since September, killing at least 87 civilians. Last week, footage of the September 2 double-tap strike shown to select members of Congress ignited a firestorm. Trump announced, on camera, that he had “no problem” with releasing the video of the attack. This week, he denied ever saying it, in another example of his increasingly unbalanced behavior.

“The public deserves to know how our government is justifying the cold-blooded murder of civilians as lawful and why it believes it can hand out get-out-of-jail-free cards to people committing these crimes,” said Jeffrey Stein, staff attorney with the American Civil Liberties Union’s National Security Project, on Tuesday, as the ACLU, the Center for Constitutional Rights, and the New York Civil Liberties Union filed a federal lawsuit for the immediate release of a classified Justice Department opinion and other documents related to the attacks on boats. “The Trump administration must stop these illegal and immoral strikes, and officials who have carried them out must be held accountable.”

Since October, The Intercept has been asking if the White House would rule out conducting summary executions of members of the list “of any such groups or entities” designated as “domestic terrorist organization[s]” under NSPM-7, without a response. Similar questions posed to the Justice and War departments have also been repeatedly ignored, despite both departments offering replies to myriad other queries. The Justice Department responded with a statement that did not answer the question. “Political violence has no place in this country, and this Department of Justice will investigate, identify, and root out any individual or violent extremist group attempting to commit or promote this heinous activity,” a spokesperson told The Intercept.

“The Trump administration should answer all questions about the terrorist lists,” Rep. Ro Khanna, D-Calif., told The Intercept. “The American people have a right to answers about who is on them and what that means for all of us.”

Rebecca Ingber, a former State Department lawyer, notes that while the designated terrorist organization label as a targeting authority is “entirely manufactured,” the administration is relying on it to summarily execute people in the boat strikes, making their application of the terrorist label on the domestic front especially concerning. “Many of us have warned that there seems to be no legal limiting principle to the Administration’s claims of authority to use force and to kill people,” Ingber, now a law professor at Cardozo Law School in New York, told The Intercept. “This is one of the many reasons it is so important that Congress push back on the President’s claim that he can simply label transporting drugs an armed attack on the United States and then claim the authority to summarily execute people on that basis.”

Last month, members of Congress spoke up against Trump’s increasingly authoritarian measures when a group of Democratic lawmakers posted a video on social media in which they reminded military personnel that they are required to disobey illegal orders. This led to a Trump tirade that made the White House’s failure to dismiss the possibility of summary executions of Americans even more worrisome.

“This is really bad,” the president wrote on Truth Social, “and Dangerous to our Country. Their words cannot be allowed to stand. SEDITIOUS BEHAVIOR FROM TRAITORS!!! LOCK THEM UP???” A follow-up post read: “SEDITIOUS BEHAVIOR, punishable by DEATH!” Trump also reposted a comment that said: “HANG THEM GEORGE WASHINGTON WOULD !!”

“What’s most telling is that the President considers it punishable by death for us to restate the law,” the six lawmakers — Sens. Elissa Slotkin, Mark Kelly, and Reps. Jason Crow, Chris Deluzio, Maggie Goodlander, and Chrissy Houlahan — all of them former members of the armed forces or the intelligence community — replied in a joint statement . “Every American must unite and condemn the President’s calls for our murder and political violence.” Trump later claimed he did not call for the lawmakers’ executions.

For decades, Trump has called for violence against — including executions of — those he dislikes, including a group of Black and Latino boys who were wrongly accused of raping a white woman jogger in New York’s Central Park in 1989; immigrants at the southern border, those who carry out hate crimes and mass shootings; demonstrators protesting the death of George Floyd; the chief suspect in the fatal shooting of a Trump supporter in Portland, Oregon; former chair of the Joint Chiefs Gen. Mark Milley; and former Rep. Liz Cheney. In August, Trump also called for “Capital capital punishment,” explaining: “If somebody kills somebody in the capital, Washington, we’re going to be seeking the death penalty.”

In January, immediately after being sworn in, Trump also signed an order to expand the death penalty , and Attorney General Pam Bondi has spent the year carrying out orders to put more Americans to death. Eleven states have executed 44 people since January, according to the Death Penalty Information Center — the highest annual total in more than a decade.

White House spokesperson Taylor Rogers failed to answer questions about Trump’s history of threatening to kill people and his recent unhinged behavior.

As Trump lobs threats at political foes and his administration seeks to put convicted and supposed criminals to death at home and abroad, NSPM-7 directs hundreds of thousands of federal officials to target U.S. progressive groups and their donors as well as political activists who profess undefined anti-American, antifascist, or anti-Christian sentiments. The memorandum harkens back to past government enemies lists and efforts that led to massive overreach and illegal acts of repression to stifle dissent . That includes the House Un-American Activities Committee , which began in the 1940s, the FBI’s secret Counter Intelligence Program, or COINTELPRO, which began in the 1950s, and the Patriot Act, enacted in the wake of 9/11, which led to abuses of Black, brown, and Muslim communities , along with racial, social, environmental, animal rights, and other social justice activists and groups.


“Trump’s NSPM-7 represses freedom of speech and association. Investigating any organization with anti-capitalism or anti-American views is anti-American. NSPM-7 is a greater infringement on freedoms than the Patriot Act,” said Khanna. “We’re seeing the greatest erosion of civil liberties and human rights in our modern history.”

NSPM-7 directs Bondi to compile a list “of any such groups or entities” to be designated as “domestic terrorist organization[s]” and Bondi has ordered the FBI to “compile a list of groups or entities engaging in acts that may constitute domestic terrorism,” according to a Justice Department memo disclosed by reporter Ken Klippenstein on Saturday. The department also shared the December 4 memo, “Implementing National Security Presidential Memorandum-7: Countering Domestic Terrorism and Organized Political Violence,” with The Intercept.

The Justice Department memo notes that under Section 3 of NSPM-7, “the FBI, in coordination with its partners on the [Joint Terrorism Task Forces], and consistent with applicable law, shall compile a list of groups or entities engaged in acts that may constitute domestic terrorism” and “provide that list to the Deputy Attorney General.” (The FBI’s Joint Terrorism Task Forces are located in each of the FBI’s 56 field offices and specifically “support President Trump’s executive orders,” according to a top FBI official .)

The Justice Department memorandum offers a fictitious apocalyptic vision of urban America which the Trump administration has previously employed to justify its military occupations , including “mass rioting and destruction in our cities, violent efforts to shut down immigration enforcement, [and] targeting of public officials or other political actors.” While Trump has even falsely claimed, for example, that members of the Venezuelan gang Tren de Aragua have engaged in hand-to-hand combat with U.S. troops on the streets of Washington, D.C., state attorneys general have repeatedly and successfully argued that troop deployments in Chicago, Los Angeles, and Portland, Oregon, were illegal because Trump administration claims of rampant civil unrest were found to be overblown or fictional .

The December 4 Justice Department memo also claims that “certain Antifa-aligned extremists” profess “extreme viewpoints on immigration, radical gender ideology, and anti-American sentiment” and “a willingness to use violence against law-abiding citizenry to serve those beliefs.” Over the last decade, Republicans have frequently blamed antifa for violence and used it as an omnibus term for left-wing activists , as if it were an organization with members and a command structure.

In September, Trump signed an executive order designating antifa as a “domestic terror organization,” despite the fact that it is essentially a decentralized , leftist ideology — a collection of related ideas and political concepts much like feminism or environmentalism.

Last month, the State Department designated four European groups — Antifa Ost, based in Germany; Informal Anarchist Federation/International Revolutionary Front, a mostly Italian group; and Armed Proletarian Justice and Revolutionary Class Self-Defense, both Greek organizations — as “ foreign terrorist organizations ” because of their alleged threats and attacks against political and economic institutions in Europe. The State Department announced that the FTO designation specifically supports NSPM-7. The Treasury Department’s Office of Foreign Assets Control also designated the groups as “specially designated nationals.”

Michael Glasheen, a longtime FBI agent serving as operations director of the bureau’s national security branch, was flummoxed by questions about antifa while testifying on Thursday before the House Committee on Homeland Security. He said antifa was the “ most immediate violent threat ” facing the United States, but could not answer basic details about the movement, including its size or where it is headquartered. The FBI, Glasheen said, has conducted more than 1,700 domestic terrorism investigations this year, including “ approximately 70 antifa investigations ,” and logged a 171 percent increase in arrests. He also drew attention to a “concerning uptick in the radicalization of our nation’s young people,” specifically “those who may be motivated to commit violence and other criminal acts to further social or political objectives stemming from domestic influences.”

Last month, a federal grand jury in Fort Worth, Texas, indicted nine alleged “ North Texas Antifa Cell operatives ” — one of them a former Marine Corps reservist — on multiple charges, including attempted murder, stemming from a shooting during a July 4 protest at the ICE Prairieland Detention Center in Alvarado in which a local police officer was injured. The Justice Department claims that the North Texas Antifa Cell is “part of a larger militant enterprise made up of networks of individuals and small groups primarily ascribing to an ideology that explicitly calls for the overthrow of the United States Government, law enforcement authorities, and the system of law.”

The December 4 Justice Department memo states that within 60 days, the FBI “shall disseminate an intelligence bulletin on Antifa and Antifa-aligned anarchist violent extremist groups,” including their “organizations’ structures, funding sources, and tactics so that law enforcement partners can effectively investigate and policy makers can effectively understand the nature and gravity of the threat posed by these extremist groups.”


The memo also calls for bounties and a network of informants . The “FBI shall establish a cash reward system for information that leads to the successful identification and arrest of individuals in the leadership of domestic terrorist organizations,” reads the document, noting that the bureau also aims to “establish cooperators to provide information and eventually testify against other members and leadership of domestic terrorist organizations.”

Neither NSPM-7 nor the December 4 memo mentions summary executions, and both speak explicitly in terms of “prosecution” and “arrest” of members of domestic terrorist organizations. Attacks on members of designated terrorist organizations are justified by another document — a classified opinion from the Justice Department’s Office of Legal Counsel — that claims that narcotics on supposed drug boats are lawful military targets because their cargo generates revenue for cartels whom the Trump administration claims are in armed conflict with the United States. Attached to that secret memo is a similarly secret list of designated terrorist organizations.

The December 4 memorandum directs Justice Department prosecutors to focus on specific federal crimes highlighted in NSPM-7 and flags more than 25 federal charges including crimes that may be capital offenses under specific, aggravating circumstances, such as killing or attempting to kill a federal officer and murder for hire.

It’s notable that the alleged members of designated terrorist organizations summarily killed in boat strikes would never, if tried in court, receive the death penalty .

“The administration is creating new categories of organizations outside of the law, creating immense uncertainty about who and what they intend to target and how,” Faiza Patel, the senior director of the Brennan Center for Justice’s Liberty and National Security Program, told The Intercept, drawing attention to the administration’s invented term: designated terrorist organizations. “But drug trafficking is not war, and these actions are patently illegal in the absence of Congressional authorization,” she added. “At the same time, National Security Presidential Memorandum 7 is aimed at ‘domestic terrorist organizations’ — another term that has no basis in U.S. law. It is designed to ramp up law enforcement scrutiny of groups espousing a broad swath of First Amendment-protected beliefs from anti-Christianity to anti-Americanism. NSPM-7 does not in any way, shape, or form authorize military strikes and using it for that would be plainly unlawful.”

Benn Jordan's flock camera jammer will send you to jail in Florida now [video]

Hacker News
www.youtube.com
2025-12-12 18:58:43
Comments...

Special Dyslexia Fonts Are Based on Voodoo Pseudoscience

Daring Fireball
www.edutopia.org
2025-12-12 18:37:47
Youki Terada, writing for Edutopia in 2022 (via Jens Kutílek): Under close scrutiny, the evidence for dyslexia-friendly fonts falls apart. In a 2017 study, for example, researchers tested whether OpenDyslexic, a popular font with thicker lines near the bottom of the letters, could improve the re...
Original Article

In 1927, Samuel Orton, a neuropsychiatrist, observed that many of his young patients with reading difficulties reversed similar letters, confusing d for b, for example. Concluding that the condition was caused by “directional confusion,” he coined the term strephosymbolia, meaning “twisted symbol.” The characterization, but not the coinage, stuck—and fueled early speculation that what came to be known as dyslexia was a visual disorder that caused printed letters to appear as a confusing, jumbled mess.

Since then, a cottage industry of dyslexia-focused products has emerged, hawking everything from prisms to tinted glasses and transparent color overlays. One website catering to dyslexic readers—whose tagline promises to solve “complicated problems with a simple solution”—sells prism glasses, offering up a slew of testimonials touting the product’s benefits. “My reading has improved from 4th grade to college level,” exclaims one satisfied wearer.

In the last decade, another contender—typographic fonts designed to alleviate the reading difficulties associated with dyslexia—has entered the popular discourse. The simple, classroom-friendly intervention claims to improve the speed and accuracy of dyslexic readers by adjusting the size and shape of fonts, adding thicker lines to help students distinguish between similar letters. The designers of the fonts claim that the “heaviness” of the letters, for example, prevents them from flipping upside-down or left-to-right, while the arms—the top of a b or d , for example—have varying thicknesses to reduce possible confusion.

According to the Yale Center for Dyslexia and Creativity , dyslexia is the most common learning disability, affecting one in five children. Students with dyslexia often struggle to read, prompting teachers to search far and wide for helpful remedies. The market for solutions is large and alluring.

But the new fonts—and the odd assortment of paraphernalia that came before them—assume that dyslexia is a visual problem rooted in imprecise letter recognition. That’s a myth, explains Joanne Pierson, a speech-language pathologist at the University of Michigan. “Contrary to popular belief, the core problem in dyslexia is not reversing letters (although it can be an indicator),” she writes . The difficulty lies in identifying the discrete units of sound that make up words and “matching those individual sounds to the letters and combinations of letters in order to read and spell.”

In other words, dyslexia is a language-based processing difference, not a vision problem, despite the popular and enduring misconceptions. “Even when carefully explained, soundly discredited, or decisively dispatched, these and similar dyslexia myths and their vision-based suppositions seem to rise from the dead—like the villain-who-just-won’t-die trope in a B movie,” the International Dyslexia Association forcefully asserts .

Dyslexia Fonts, Under the Microscope

Under close scrutiny, the evidence for dyslexia-friendly fonts falls apart. In a 2017 study , for example, researchers tested whether OpenDyslexic, a popular font with thicker lines near the bottom of the letters, could improve the reading rate and accuracy for young children with dyslexia. According to the developers of the font, which is open-source and free of charge, the “heaviness” of the letters prevented them from turning upside down for readers with dyslexia, which they claimed would improve reading accuracy and speed.

Example of OpenDyslexic font (Shelley Adams)
OpenDyslexic features heavier lines that are meant to increase readability for readers with dyslexia—but rigorous research suggests that other mainstream fonts may be more effective.

Researchers put the font to the test, comparing it with two other popular fonts designed for legibility—Arial and Times New Roman—and discovered that the purportedly dyslexia-friendly font actually reduced reading speed and accuracy. In addition, none of the students preferred to read material in OpenDyslexic, a surprising rebuke for a font specifically designed for the task.

In a separate 2018 study , researchers compared another popular dyslexia font—Dyslexie, which charges a fee for usage—with Arial and Times New Roman and found no benefit to reading accuracy and speed. As with the previous dyslexia font, children expressed a preference for the mainstream fonts. “All in all, the font Dyslexie, developed to facilitate the reading of dyslexic people, does not have the desired effect,” the researchers concluded. “Children with dyslexia do not read better when text is printed in the font Dyslexie than when text is printed in Arial or Times New Roman.”

“I don’t necessarily think teachers need to go and get a special font,” says Julie Rawe, a member of W3C’s Cognitive and Learning Disabilities Task Force and a reading and disability expert at Understood . “So far, the research doesn’t really have a lot of evidence showing that these special fonts help kids or adults with dyslexia to read faster or make fewer mistakes.”

Giving False Hope

Dyslexia fonts may also give students false hope—and result in disappointment, the researchers of the 2017 study warn. “The most harm may come when students who have already experienced significant struggle and academic failures related to learning to read have yet another experience with failure when they are not able to read significantly better in a font designed to do so,” they caution.

That’s because children with dyslexia often have to deal with the stigma of being behind their peers, and they may conclude that they’re not smart enough to master the materials, according to a 2010 study . If a child is told that a dyslexia font can help them read, but it doesn’t actually improve their grades or their reading experience, they may assume that the problem lies with their own inability—not with the font.

Legible Fonts and Evidence-Based Instruction

Fonts do matter, experts at the British Dyslexia Association explain, but only because they matter for all readers: “Adopting best practice for dyslexic readers has the advantage of making all written communication easier on the eye for everyone.” They recommend fonts designed for general legibility, like Arial, Verdana, and Tahoma. For better reading outcomes, font size should be between 12 and 14 points, and section headings should be used to create a consistent structure within your documents, easing navigation and supporting better sense-making.

Of course, typography is just one small part of the puzzle. Most children with dyslexia can learn to read—but it takes considerably more time and effort than for their peers, according to the Yale Center for Dyslexia and Creativity . Reading instruction should be “evidence-based, systematic, and delivered in a small group setting,” they say, and should include explicit instruction in phonemic awareness and phonics, with many opportunities to practice reading skills in a supportive environment. The International Dyslexia Association recommends a “multisensory, structured language approach” that systematically integrates several senses (hearing, seeing, touching) while the child is learning to read.

Classroom accommodations such as audiobooks, note-taking apps, video recordings of assignment instructions, and text-to-speech software can help students with dyslexia feel supported and accepted, explains former literacy teacher Jessica Hamman. Tasks that appear simple to most students may take extra time for those with dyslexia, so it’s important to provide tools “that take into account their unique processing challenges and allow them to demonstrate their content understanding and access the curriculum with more ease,” she says.

The Takeaway

On scores of reading speed and accuracy, dyslexia fonts perform no better than common fonts like Arial and Times New Roman, and sometimes they perform worse, according to recent studies. Even using dyslexia fonts with neutral effects can raise false hopes in struggling young readers, contributing to feelings of helplessness and discouragement.

My Python setup, December 2025

Lobsters
chrisamico.com
2025-12-12 18:32:39
Comments...
Original Article

I’m at the point where I’m migrating all my projects to uv, and new Python projects don’t use any other package manager.

I finally got around to migrating this site to use it, using the very handy migrate-to-uv tool. So it’s time to update my recommended Python setup.
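If you want to attempt the same migration, here's a minimal sketch, run from the project directory; it assumes uv is already installed and uses uvx so the tool doesn't need to be installed permanently:

uvx migrate-to-uv    # rewrite the project's packaging metadata for uv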

My old Python setup was very much built around the complications of managing environments and dependencies, and the conflicting set of tools to deal with those two problems. There are still a few places where I’ll use pipx, but otherwise everything is on uv.

This guide is still aimed at a recent Apple or Linux computer, or WSL if you’re on Windows. I’m writing this on a MacBook Pro with an M2 chip, if that matters to you.

Tools and helpful links:

You probably don’t need both uv and pipx. I have a bunch of existing tools I installed with pipx, and those work fine, so I haven’t migrated them to uvx.

There is one set of tools that stays on pipx, though: Datasette and its SQLite toolchain. Simon Willison built those to install their own plugins, using datasette install <plugin> or llm install <plugin>. Those use pip internally and sometimes uv can cause problems upgrading, so I’ve kept them on pipx.

Installing the right Python

Use uv and nothing else for this. Run uv python list to see what’s already installed or otherwise available. If you’re not using pipx, it’s fine to just let uv install the right version of Python for each project.

If you want a specific version of Python installed globally, use uv python install <version>. The docs are good.
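For example (the version number here is just an illustration):

uv python list          # show installed and available Python versions
uv python install 3.12  # install a specific version globally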

For pipx , stick to my instructions from a couple years ago:

pip install --user pipx # install it
pipx ensurepath # make sure your system can find it

That’s assuming your system already comes with a version of Python and pip installed. If not, try Homebrew. Maybe it’s better now, especially with uv managing everything else.

Virtual environments and local dependencies

Everything is now part of uv. Run uv init to create a project, uv add for each dependency and uv sync to install everything from an existing project.

Use uv run to run scripts inside the virtual environment that uv creates.
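Put together, a minimal sketch of that workflow (project, package, and script names are just placeholders) looks like this:

uv init demo-project     # create a new project with a pyproject.toml
cd demo-project
uv add requests          # add a dependency and record it in the lockfile
uv sync                  # install everything the project declares
uv run python main.py    # run a script inside the project's virtual environment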

This is easier now

I never managed to write the post about why Python’s setup is so hard. It ultimately comes down to dependencies, both libraries and Python itself. For the most part, uv has made this a non-issue. It’s also significantly faster than the tools it replaced, which means I can iterate faster and don’t lose focus waiting for dependencies to download and install.

Now, to migrate more projects …

Coupang data breach traced to ex-employee who retained system access

Bleeping Computer
www.bleepingcomputer.com
2025-12-12 18:28:30
A data breach at Coupang that exposed the information of 33.7 million customers has been tied to a former employee who retained access to internal systems after leaving the company. [...]...
Original Article

Coupang

A data breach at Coupang that exposed the information of 33.7 million customers has been tied to a former employee who retained access to internal systems after leaving the company.

This was shared by the Seoul Metropolitan Police Agency with local news outlets, following an investigation that included a raid on the firm's offices earlier this week.

Coupang is South Korea's largest online retailer, employing 95,000 people and generating annual revenue of over $30 billion.

On December 1, 2025, the company announced that it had suffered a data breach that exposed the personal data of 33.7 million customers , including names, email addresses, physical addresses, and order information.

The breach occurred on June 24, 2025, but Coupang only discovered it on November 18, when it also launched an internal investigation.

On December 6, Coupang published an update on the incident, assuring its customers that the stolen information had not been leaked anywhere online.

Despite these assurances and the company's claimed full collaboration with the authorities, the police raided the company's offices on Tuesday to collect evidence for an independent investigation.

On Wednesday, the company's CEO, Park Dae-Jun, announced his resignation and apologized to the public for failing to stop what is the country's worst cybersecurity breach in history.

As the police continued their investigations in Coupang's offices for a second day, they uncovered that the primary suspect was a 43-year-old Chinese national who was a former employee of the retail giant.

According to JoongAng , the man, who joined Coupang in November 2022, was assigned to an authentication management system and left the firm in 2024. He is believed to have already left the country.

The Korean news outlet reports that the police were still at Coupang's offices yesterday, gathering records such as internal documents, logs, system records, IP addresses, user credentials, and access histories that could help explain how the rogue former employee gained access to the corporate systems.

Police transporting seized documents out of Coupang's office
Source: Korea JoongAng Daily

The police have stated that, while Coupang is treated as the victim, if negligence or other legal violations are found, the company and employees responsible for protecting customer data may be deemed liable.

In the meantime, the incident has sparked high-volume phishing activity in the country, affecting roughly two-thirds of its population, and the police have received hundreds of reports of Coupang impersonation since the start of the month.


Home Depot GitHub token exposed for a year, granted access to internal systems

Hacker News
techcrunch.com
2025-12-12 18:23:21
Comments...
Original Article

A security researcher said Home Depot exposed access to its internal systems for a year after one of its employees published a private access token online, likely by mistake. The researcher found the exposed token and tried to privately alert Home Depot to its security lapse but was ignored for several weeks.

The exposure is now fixed after TechCrunch contacted company representatives last week.

Security researcher Ben Zimmermann told TechCrunch that, in early November, he found a published GitHub access token belonging to a Home Depot employee, which was exposed sometime in early 2024.

When he tested the token, Zimmermann said that it granted access to hundreds of private Home Depot source code repositories hosted on GitHub and allowed the ability to modify their contents.

The researcher said the keys allowed access to Home Depot’s cloud infrastructure, including its order fulfillment and inventory management systems, and code development pipelines, among other systems. Home Depot has hosted much of its developer and engineering infrastructure on GitHub since 2015, according to a customer profile on GitHub’s website .

Zimmermann said he sent several emails to Home Depot but didn’t hear back.

Nor did he get a response from Home Depot’s chief information security officer, Chris Lanzilotta, after sending a message over LinkedIn.

Zimmermann told TechCrunch that he has disclosed several similar exposures in recent months to companies, which have thanked him for his findings.

“Home Depot is the only company that ignored me,” he said.

Given that Home Depot does not have a way to report security flaws, such as a vulnerability disclosure or bug bounty program, Zimmermann contacted TechCrunch in an effort to get the exposure fixed.

When reached by TechCrunch on December 5, Home Depot spokesperson George Lane acknowledged receipt of our email but did not respond to follow-up emails asking for comment. The exposed token is no longer online, and the researcher said the token’s access was revoked soon after our outreach.

We also asked Lane if Home Depot has the technical means, such as logs, to determine if anyone else used the token during the months it was left online to access any of Home Depot’s internal systems. We did not hear back.


50 years of proof assistants

Lobsters
lawrencecpaulson.github.io
2025-12-12 18:21:22
Comments...
Original Article

05 Dec 2025

[ memories LCF HOL system Isabelle Coq MJC Gordon ]

Crackpots ranging from billionaire Peter Thiel to random YouTube influencers claim that science has been stagnating for the past 50 years. They admit that computing is an exception: they don’t pretend that my personal 32GB laptop is not an advance over the 16MB mainframe that served the whole Caltech community when I was there. Instead they claim that advances in computing were driven solely by industrial research, quite overlooking the role of academia and government funding in pushing the VLSI revolution, RISC processor design, networking, hypertext, virtual memory and indeed computers themselves. As for the industrial research, most of it came from just two “blue sky” institutes – Bell Labs and Xerox PARC – that closed a long time ago. LCF-style proof assistants are a world away from mainstream computing, so let’s look at 50 years of progress there.

1975–1985: Edinburgh LCF

The first instance of LCF was Stanford LCF, developed by Robin Milner in 1972, but it was not an LCF-style proof assistant! LCF meant “Logic for Computable Functions”, a quirky formalism based on Scott domains and intended for reasoning about small functional programs. But “LCF-style proof assistant” means one that, like Edinburgh LCF, was coded in some form of the ML programming language and provided a proof kernel, encapsulated in an abstract type definition, to ensure that a theorem could only be generated by applying inference rules to axioms or other theorems:

… the ML type discipline is used… so that—whatever complex procedures are defined—all values of type thm must be theorems, as only inferences can compute such values…. This security releases us from the need to preserve whole proofs… — an important practical gain since large proofs tended to clog up the working space… [ Edinburgh LCF , page IV]

Edinburgh LCF was first announced in 1975, which conveniently is exactly 50 years ago, at the almost mythical conference on Proving and Improving Programs held at Arc-et-Senans. The user manual , published in the Springer lecture notes series, came out in 1979. Edinburgh LCF introduced some other principles that people still adhere to today:

  • inference rules in the natural deduction style, with a dynamic set of assumptions
  • a goal-directed proof style, where you start with the theorem statement and work backwards
  • a structured system of theories to organise groups of definitions

Edinburgh LCF had its own version of the ML language. It supported a fragment of first-order logic containing the logical symbols $\forall$, $\land$ and $\to$ along with the relation symbols $\equiv$ and $\sqsubseteq$. It introduced proof tactics and also tacticals : operators for combining tactics. Tactics supported goal-directed proof, but Edinburgh LCF had no notion of the current goal or anything to help the user manage the tree of subgoals. Its user interface was simply the ML top level and the various theorem-proving primitives were simply ML functions. ML stood for metalanguage , since managing the process of proof was its exact job.

Avra Cohn and Robin Milner wrote a report on proving the correctness of a parsing algorithm using Edinburgh LCF. The proof consists of one single induction followed by a little simplification and other reasoning. The report includes a succinct description of Edinburgh LCF and is a nice snapshot of the state of the art in 1982, the year I came to Cambridge to join a project run by Robin Milner and Mike Gordon. Full of youthful enthusiasm, I told Mike that it would be great if one day we could formalise the Prime Number Theorem. I hardly knew what the theorem was about or how to prove it, but my college roommate had told me it was really deep.

Disappointed to discover that we only had $\forall$, $\land$ and $\to$, I set out to fix that, to support full first-order logic. I ended up changing so much (backwards compatibility is overrated) that people eventually shamed me into writing my own user manual. Cambridge LCF never caught on because, well, nobody liked the LCF formalism. But I used it for a development that seemed big at the time: to verify the unification algorithm. This development was later ported to Isabelle. It contains 36 inductions, so we were making progress. And this takes us to 1985, exactly 40 years ago; see also this survey of the state of play. But there was almost no mathematics: no negative numbers and no decimal notation, so you could not even write 2+2=4. As far as the broader computer science community was concerned, we were a joke.

1985–1995: Cambridge LCF and HOL

Cambridge LCF was in itself a dead end, but because it included a much faster ML compiler, it ended up being incorporated into a lot of other proof assistants, notably Mike’s HOL88 . And just like that, hardware verification became a reality. Although software verification seemed stuck in the doldrums, a couple of production-ready chip designs were verified! Mike’s explanation was that hardware verification was simply easier.

Also in 1985, we got a new standard for the ML language and, soon, two compilers for it. So then I started working on experiments that would lead to Isabelle . It would be like LCF but would support constructive type theory, crucially allowing both unification and backtracking, like in Prolog. But there was no working system yet, just a grant application. And that was the state of play 40 years ago.

Funding secured, Isabelle development started in earnest in 1986. It was coded in Standard ML from the start, while HOL88 was ported from the Cambridge LCF version of ML to Standard ML, emerging as HOL90. Mike acquired a bevy of energetic PhD students, who engaged in verification projects or built extensions for HOL. Versions of HOL were being used in institutes around the world.

Stepping aside from HOL for a moment, other proof assistants had made great progress by the mid 1990s. The addition of inductive definitions to the calculus of constructions gave us the calculus of inductive constructions, which in essence is the formalism used today by Rocq and Lean. The very first release of Isabelle/HOL happened in 1991, primarily the work of Tobias Nipkow, though I was soon to join in. Isabelle/ZF, which was my pet project, formalised axiomatic set theory to some quite deep results.

But I am still not certain whether negative numbers were supported (can somebody help me?). Our weak support for arithmetic may seem odd when our research community was aware that the real numbers had been formalised in AUTOMATH , but we didn’t seem to want them. To many, we were still a joke. This was about to change.

1995–2005: Proof assistants come of age

In 1994 came the Pentium with its FDIV bug: a probably insignificant but detectable error in floating-point division. The subsequent product recall cost Intel nearly half a billion dollars. John Harrison, a student of Mike’s, decided to devote his PhD research to the verification of floating-point arithmetic. By June 1996 he had submitted an extraordinary thesis, Theorem Proving with the Real Numbers, which described a formidable series of achievements:

  • a formalisation of the real number system in HOL
  • formalised analysis including metric spaces, sequences and series, limits, continuity and differentiation, power series and transcendental functions, integration
  • proper numerals represented internally by symbolic binary, and calculations on them
  • computer algebra techniques including a decision procedure for real algebra
  • tools and techniques for floating-point verification by reference to the IEEE standard

This thesis, which I had the privilege to examine, won a Distinguished Dissertation Award and was published as a book by Springer. So by the middle of the 1990s, which was 30 years ago, we had gone from almost no arithmetic to a decent chunk of formalised real analysis that was good enough to verify actual floating-point algorithms.

This period also saw something of an arms race in automation. My earlier, Prolog-inspired vision of backtracking search had led to some fairly general automation that was effective not just in standard predicate logic but with any theorems that were expressed in a form suitable for forward or backward chaining. I had also done experiments with classical automatic techniques such as model elimination, which, although pathetic compared with automatic provers of that era, was good enough to troll users on the hol-info mailing list. Soon I had provoked John Harrison to build a superior version of ME for HOL Light. Later, Joe Hurd built his metis superposition prover, which found its way into HOL4. Not to be outdone, Tobias made Isabelle’s simplifier the best in its class, incorporating a number of sophisticated refinements, including some great ideas from Nqthm.

Twenty years from the start of this chronology we now had several reasonably mature and powerful systems, including Isabelle/ZF, Isabelle/HOL, multiple versions of the HOL system, and Coq (now Rocq). 1 Many of them used Proof General , a common user interface for tactic-based proof assistants based on the Emacs editor. And we had 100MHz machines, some with 64MB of memory! We were ready to do big things.

During this period, I did a lot of work on the verification of cryptographic protocols , also here . These secure Internet connections and other network communications; they are valuable when you need to know who is on the other end and need to keep messaging secure from eavesdropping and tampering. Among the protocols investigated were the ubiquitous TLS and the late, unlamented SET protocol. These proofs were not at the level of code or bits; buggy implementations could and did emerge.

In 2005, the big thing that caught everyone’s eye was George Gonthier’s formalisation (in Coq) of the Four Colour Theorem. Most educated people had heard of the theorem already, and its history is fascinating: numerous proofs had been attempted and rejected since the mid 19th century. The 1977 proof by Appel and Haken was questioned because it relied on a lot of ad-hoc computer code. Suddenly, despite the still unwelcome involvement of computers, no one could doubt the theorem anymore.

At the opposite extreme was my own formalisation of Gödel’s proof of the relative consistency of the axiom of choice in Isabelle/ZF. This was the apex of my ZF work, technically difficult but incomprehensible to most people. My early dream of having a formalisation of the Prime Number Theorem came true in 2005 when Jeremy Avigad formalised the theorem in Isabelle. Somewhat later, John Harrison formalised a different proof in HOL Light. And there was much more. Without any doubt, our systems were capable of serious mathematics.

Perhaps the most consequential achievement of this period was Mike Gordon’s collaboration with Graham Birtwistle and Anthony Fox to verify the ARM6 processor . Graham, at Leeds, formally specified the instruction set architecture of the processor (i.e. the assembly language level), while Mike and Anthony at Cambridge verified the implementation of that architecture in terms of lower level hardware components. Eventually a number of other processors were similarly specified, and some verified. Without any doubt, our systems were capable of serious verification.

Despite the focus on applications in this section, system development continued in the run-up to 2005. I am only familiar with the Isabelle developments, but they were tremendous:

  • the Isar language for structured, legible proofs (a break with the LCF idea that the top level must be a programming language, i.e. ML)
  • axiomatic type classes, providing principled overloading
  • counterexample finders: Quickcheck and Refute (now Nitpick)
  • code generation from the executable fragment of higher-order logic, and reflection
  • sledgehammer was under active development, but only ready a couple of years later.

With so much going on, it’s not surprising that our community started doing big things, and other people were starting to notice.

2005–2015: The first landmarks

I am not used to phone calls from journalists: for most of my career, formal verification has been seen as (at best) niche. But the journalist on the end of the line was asking for information about seL4, the first operating system kernel ever to be formally verified. Tools for extended static checking were by then able to detect a lot of program faults, but the seL4 verification claimed to cover full functional correctness: the code did exactly what it was supposed to do. There is now an entire ecosystem around seL4, backed by a million lines of Isabelle/HOL proofs.

People have wanted to verify compilers since forever. The task of fully specifying a programming language, target machine and compiler already seemed impossible, let alone providing the actual proof. With CompCert, that task was finally fulfilled, for a large subset of the C language:

What sets CompCert apart from any other production compiler, is that it is formally verified, using machine-assisted mathematical proofs, to be exempt from miscompilation issues. In other words, the executable code it produces is proved to behave exactly as specified by the semantics of the source C program.

A seemingly intractable problem with compiler verification was how to translate your verified compiler into machine code. For example, CompCert is mostly written in Rocq, which is then extracted to OCaml code. The OCaml compiler had never been verified, so how do we know that its compiled code is correct?

CakeML squares this circle through bootstrapping . CakeML translates from its source language (a dialect of ML) to assembly language, accompanied by a proof that the two pieces of code are equivalent. This work was an outgrowth of the ARM6 project mentioned earlier. Magnus Myreen had developed techniques for automatically and verifiably translating between assembly language and recursive functions in higher-order logic, in both directions. At the start of the bootstrapping process, a tiny compiler was written in pure logic and proved correct. It was now safe to run this compiler and use its tiny language to implement a bigger language. This process ultimately produced a verified compiler in both source form and assembly language form, with a proof of their equivalence, as well as verified extraction from higher-order logic to ML.

The end of the decade also saw impressive results in the formalisation of mathematics:

Without going into details here, each of these was an ambitious proof, combining in various ways deep mathematics, intricate technicalities and sheer bulk. Our community was proud of our achievements. We were no longer a joke, but what exactly were we good for?

2015–2025: Breaking through

This period brought something astonishing: acceptance of proof assistants by many mainstream mathematicians. I mostly recall mathematicians regarding computers with something close to contempt. Even some logicians regarded formalised mathematics as impossible, somehow fixating on Gödel’s incompleteness or that notorious proof of 1+1=2 on page 360. Regarding my work formalising big chunks of ZF theory, someone commented “only for finite sets obviously”.

My EU-funded ALEXANDRIA project started in 2017. My team formalised more advanced and deep mathematics than I ever imagined to be possible, using Isabelle/HOL. (I have told this story in an earlier blogpost .) But ALEXANDRIA alone would not have had much of an impact on mathematical practice. What made a difference was Kevin Buzzard and his enthusiastic, tireless promotion of the idea of formalising mathematics in Lean . He recruited a veritable army. I got the idea of blogging from him, but my blog has not had the same impact. Where are you guys?

In 2022, for the first time ever, machine assistance was used to confirm brand-new mathematics that a Fields Medallist had concerns about. Mathematicians will for the most part continue to work the way they always have done, but proof assistants are getting better and better, and they will encroach more and more on the everyday practice of mathematics.

Meanwhile, Isabelle continued to be useful for verification. I was amazed to hear that the systems group here in the Computer Lab had completed a major verification using Isabelle/HOL. The tradition is for systems people to despise verification tools for sweeping aside ugly things like overflow and floating-point errors, even though they no longer do. Besides, a research tool like Isabelle is only used by its own developer and his students. Times were changing.

Isabelle is also one of the several proof assistants involved with CHERI , a large-scale project reviving the old idea of capabilities to ensure security at the hardware level. CHERI has produced numerous publications, some of which (for example this one and that one ) describe very large proofs. These concern the design and implementation of novel computer architectures with fine-grained memory protection, and a design process with formal verification at its heart.

Isabelle has also contributed to the design of WebAssembly , a relatively new platform for web applications. By subjecting the WebAssembly specification to formal scrutiny , Conrad Watt was able to identify a number of issues in time for them to be fixed.

Finally, I’d like to mention this announcement (4 December 2025) by Dominic Mulligan of Amazon Web Services (AWS):

Over three years, lots of hard work, and 260,000 lines of Isabelle/HOL code later, the Nitro Isolation Engine (NIE) is finally announced alongside Graviton5.

Working with our colleagues in EC2, Annapurna, and AWS AppSec, we have been working to rearchitect the Nitro system for Graviton5+ instances around a small, trusted separation kernel. Written from scratch in Rust, we have additionally specified the behaviour of a core subset of the Nitro Isolation Engine kernel, verified that the implementation meets this specification, and additionally proved deep security properties—confidentiality and integrity—of the implementation.

I am biased, since I’ve been working with AWS on this exact project, but this is a big deal. AWS has been using formal verification tools for a considerable time. A notable earlier accomplishment was verifying tricky but efficient algorithms using HOL Light, speeding up RSA encryption by a massive factor.

2025–2035: Becoming ordinary

A couple of months ago, Apple announced new models in their iPhone range, but no crowds formed around Apple Stores. They once did: the iPhone was once regarded as revolutionary. Now, smartphones are a commodity, which is the final stage of a new technology. Formal verification is not ordinary yet. But it’s coming: more and more software will be seen as too important to develop any other way, as is already the case for hardware.

Postscript

I am well aware that there is much outstanding work adjacent to that described here, e.g. using other interactive tools, such as Nqthm and ACL2, PVS and Agda, and much else using Rocq. There have been amazing advances in the broader theorem proving world, also in model checking, SAT/SMT solving and their applications to extended static checking of software. I have related what I personally know. And remember, the point of this post is not (simply) to boast but to demonstrate the progress of our research community, so the more achievements the better. Feel free to add some in the comments!

This post does not prove anything about other fields of science, such as solid-state physics, molecular biology or mathematics. But it’s fair to assume that such fields have not been idle either. People have proved Fermat’s Last Theorem and the Poincaré conjecture, and settled more obscure questions such as the projective plane of order 10. People have located the remains of King Richard III, who died in 1485, excavating and positively identifying the body by its DNA. People have linked a piece of bloody cloth to Adolf Hitler and diagnosed that he had a specific genetic condition. The immensely complex James Webb Space Telescope was successfully deployed; it is now revealing secrets about the early Universe.

Sometimes I wonder about the motives of those who claim that science is moribund. Do they have political aims, or just unrealistic expectations? Were they expecting time travel or some sort of warp drive? People need to remember that movies are fiction.

YOCaml, a framework used to describe static site generators

Lobsters
yocaml.github.io
2025-12-12 18:12:05
Comments...
Original Article

Welcome to the YOCaml user guide! This page is quite marketing-oriented .

What is YOCaml

YOCaml is a framework used to describe build systems in OCaml , released under GPL3 license, with an API suited for creating static site generators . Unlike Hugo , Jekyll or Zola , which provide a CLI , YOCaml is closer to Hakyll , as it imposes no structure , requiring you to build your generator step by step. This offers the opportunity to create diverse projects such as a personal blog , a personal wiki , more experimental sites , a webring or even this documentation website.

Written in OCaml

YOCaml is, as its name suggests, written in the wonderful language OCaml , a programming language that is statically typed (with type inference), functional , imperative , and object-oriented , and that features a rich module system. While the simplest reason we wrote YOCaml in OCaml is probably that we like OCaml , the language’s grammatical and conceptual flexibility made it easier to design an API that we find expressive . In addition, OCaml is a high-performance language with a rich ecosystem — if you want to convince yourself to use OCaml, we invite you to read Why I chose OCaml as my primary language .

Adhering to the ecosystem

YOCaml was designed in a very modular way, allowing us to take advantage of the OCaml ecosystem. As a result, even though YOCaml is packaged with a set of standard plugins , the core API makes it fairly easy to integrate other libraries. For example, users have requested support for Gemtext , in order to serve their site over Gemini . No changes were required in YOCaml’s core, demonstrating its flexibility .

Easy deployment

One of the great strengths of statically generated sites is that they are very easy to deploy. In fact, a simple static server is enough! However, YOCaml goes further: thanks to the Mirage project, it is possible to directly generate documents using a Git repository as a file system (compatible with GitHub Pages ) and serve them statically. For example, by using Unipi , you can build an operating system (unikernel) designed to statically serve your site with great ease!

Id Software devs form "wall-to-wall" union

Hacker News
www.rockpapershotgun.com
2025-12-12 18:11:23
Comments...
Original Article

“Remote work isn’t a perk”

A shotgun being pointed at a demon thing in Doom: The Dark Ages.
Image credit: Id Software / Bethesda Softworks

Doom and Quake studio id Software are now home to a "wall-to-wall" union according to the Communications Workers of America (CWA). The organisation have announced that a group of 165 id workers have just voted to unionise, adding to the ranks of the 300 ZeniMax quality assurance staff who unionised back in 2023 .

According to the CWA's press release, Microsoft have already recognised this latest union - which is made up of "developers, artists, programmers, and more" - in accordance with the labour neutrality agreement the two parties agreed in 2022.

"The wall-to-wall organizing effort at id Software was much needed; it’s incredibly important that developers across the industry unite to push back on all the unilateral workplace changes that are being handed down from industry executives," said id Software producer and CWA organising committee member Andrew Willis.

Meanwhile, id lead services programmer and CWA committee member Chris Hays specifically cited remote staff not being dragged into the office as a reason behind the push for representation. "Remote work isn’t a perk," he said. "It’s a necessity for our health, our families, and our access needs. RTO policies should not be handed down from executives with no consideration for accessibility or our well-being."

The CWA release also cited " mass industry layoffs , sudden periods of crunch time, and unfair pay" as part of the impetus behind a wider push towards unionisation among devs across the industry this year, adding that the total of unionised workers across Microsoft's fiefdom is now "nearly 4,000" strong.

CWA president Ron Swaggerty added that the union "look forward to sitting across the table from Microsoft to negotiate a contract that reflects the skill, creativity, and dedication these workers bring to every project."

If you want to learn more about the CWA's unionisation efforts as the games industry's suits and moneyfolk continue to lob developers out of windows with depressing regularity, give this interview Nic did a read.

Meanwhile, members of the "industry-wide union" the CWA announced earlier this year held a protest outside of The Game Awards yesterday, with their aim being "to acknowledge the video games and studios that have been closed and to also condemn the creativity that’s been crushed by corporate greed and studio executives".

Solidarity to these id Software workers.

Google Releases Its New Google Sans Flex Font as Open Source

Hacker News
www.omgubuntu.co.uk
2025-12-12 18:07:57
Comments...
Original Article

Google has made its ‘next generation brand typeface’, Google Sans Flex , available for download — under an open source license , which is welcome news.

A modern sans serif font purpose-designed for use on screens and OSes, Google Sans Flex is a ground-up, multi-axis rebuild of the proprietary Google Sans font, by typographer David Berlow (of Font Bureau fame).

The “flex” in GS Flex is because it’s a variable font that is “extremely flexible [with] variable axes for weight, width, optical size, slant, as well as an axis for rounded terminals” (as in terminals in letters, not command-line apps).

Android and web developers will find the varied variable axes on offer a creative boon for “expressive” design work.

Changing the system font is a simple way to give an Ubuntu (or any other Linux) desktop a subtle new vibe without having to futz around with themes, icon packs or other eye-candy extras which substantially alter the stock experience:

Google Sans Flex as UI font on Ubuntu 25.10

However, Linux desktop environments don’t yet support doing anything fancy with variable fonts, beyond the basics.

Ergo, unlike on modern Android, toggling Dark Mode in GNOME or KDE won’t make this font automatically adjust its GRAD axis to compensate for the optical thinning that typically occurs when white text is rendered against darker backgrounds.

It’s not a major drawback, and GS Flex works great as a competent, classy system UI font on Linux, especially on HiDPI displays with fractional scaling. For my tastes, Google Sans Flex has (like GNOME’s default Adwaita Sans font) more presence than the Ubuntu font.

Want to try it out? Google has released the font under the SIL Open Font License (OFL) , meaning you can modify, redistribute and use it in your own projects.

To get it:

  1. Go to Google Fonts
  2. Search for ‘Google Sans Flex’
  3. Hit “Get Font” > “Download All”
  4. Extract the ZIP
  5. Find the .ttf file inside and either:
    • Move it to ~/.local/share/fonts ; or
    • Install via your desktop’s font manager GUI

Once installed it’ll be available to use/select in other apps, settings and so on.

To change UI font on Ubuntu you can install the GNOME Tweaks tool and then open it, go to Appearance and set the UI font to Google Sans Flex . Although you may see variable options listed to pick from, GNOME will always render the ‘regular’ version.

Nuclear energy key to decarbonising Europe, says EESC

Hacker News
www.eesc.europa.eu
2025-12-12 17:32:48
Comments...
Original Article

The EESC has adopted an opinion pointing out that nuclear energy is an essential component of the clean energy mix which is needed to phase out fossil fuels. The Committee calls on the European Commission to include key regulatory and financial enablers in order to make the planned investment possible, and to enhance transparent dialogue with civil society.

Nuclear energy plays and will continue to play a crucial role in decarbonising the European Union, says the European Economic and Social Committee (EESC) in an opinion adopted at the December plenary session. This is particularly true given the fact that the EU needs to consolidate its strategic autonomy in the fields of energy and technology.

The EESC opinion, drawn up by rapporteur Dumitru Fornea and co-rapporteur Alena Mastantuono , assesses the European Commission’s 8th Nuclear Illustrative Programme (PINC), published in June 2025.

According to the Committee, nuclear energy is a key element in diversifying the EU’s energy supply because it delivers safe, reliable, low-carbon electricity. This ensures that the grid remains stable most of the time, regardless of the weather or time of day, with less pressure on systemic costs.

Nuclear energy can therefore play an important role in supporting the EU’s overall industrial transition as it bolsters resilience against supply disruptions while complementing renewables and reducing dependence on imported fuels. Against this backdrop, existing EU industries (such as steel, cement and chemicals) as well as new industries (data centres) can enjoy a constant stream of decarbonised electricity.

‘The European nuclear industry sustains more than 1.1 million jobs in the EU and is a significant economic sector with a major footprint in terms of jobs, supply chain capacity and advanced R&D. It is a net-zero value chain based almost entirely in the EU,’ said Mr Fornea . ‘If we want to effectively move away from coal, we need accessible clean energy and funding for nuclear.’

Moving ahead with planned investment

In the opinion, the EESC regrets that the PINC does not propose any specific enablers, nor a real action plan, for the planned investment and urges the European Commission to include regulatory and financial measures. The goal is to enable investment in the sector, promote the development of innovative fuel cycle facilities and propose specific figures on the investment required by the nuclear fuel cycle.

‘We call on the Commission to put forward concrete measures to make the investment planned under the PINC possible,’ said Ms Mastantuono . ‘This is more necessary than ever given the geopolitical turmoil which is forcing the Union to develop EU-based capacities. For this reason, the nuclear value chain should be supported in terms of skills, research and the fuel supply chain.’

More specifically, the Committee recommends speeding up investment through specific measures such as a streamlined State aid process, access to EU cohesion funds, sustainable financing, licensing processes and faster decisions at EU and national level.

In addition, the EESC advises applying the same facilities to investment in nuclear energy as for renewables. These two energy sources are complementary and Member States are free to choose their own energy mix.

Keeping transparent dialogue open with civil society

Dialogue with civil society remains pivotal in building trust, ownership and societal acceptance, and could be more prominently addressed in the PINC. Moreover, there is no dedicated funding available for meaningful civil society participation.

On this matter, the EESC’s view is that decisions on new projects in the nuclear sector, including the development of new technologies, should be taken following the outcome of a broad and transparent dialogue with civil society on the technical, economic, social and environmental aspects.

Public engagement is essential to ensure that energy strategies reflect societal priorities (such as sustainability, reliability, land-use and responsibility for long-term waste management) and the early involvement of civil society through dialogue strengthens trust and legitimacy for both nuclear energy and other low-carbon technologies.

Background - Nuclear Illustrative Programme (PINC)

According to article 40 of the Euratom Treaty, the European Commission is required to periodically publish a Nuclear Illustrative Programme (PINC) and consult the EESC. The Commission Communication on the PINC issued in June 2025 has therefore been presented under this article for the opinion of the European Economic and Social Committee.

The PINC provides a comprehensive overview of investment needs in nuclear energy, both fission and fusion, and encompasses all stages of the nuclear lifecycle. It also feeds into the debate on the role of nuclear energy in achieving carbon neutrality in the EU by 2050. In line with the highest level of nuclear safety, the PINC supports EU competitiveness, energy security and affordable energy prices.

In the 8th PINC, the Commission points out that nuclear energy requires significant investment, of around EUR 241 billion until 2050, both for lifetime extensions of existing reactors and the construction of new large-scale reactors. The Commission also says that additional investment is needed for Small Modular Reactors (SMRs), Advanced Modular Reactors (AMRs) and microreactors and in fusion for the longer-term future.

Work organisation

iMessage Doesn’t Use APNs for Attachments

Daring Fireball
support.apple.com
2025-12-12 17:29:02
Small follow-up point re: my post this week on iMessage’s delivery architecture being built atop the Apple Push Notification service: APNs can only relay messages up to 4 or 16 KB in size, depending on the iOS or iPadOS version. If the message text is too long or if an attachment such as a photo...
Original Article

Labor Leaders Cheer House Vote To Undo ‘Single-Largest Act of Union Busting in American History’

Portside
portside.org
2025-12-12 17:24:03
Labor Leaders Cheer House Vote To Undo ‘Single-Largest Act of Union Busting in American History’ Maureen Fri, 12/12/2025 - 12:24 ...
Original Article
Labor Leaders Cheer House Vote To Undo ‘Single-Largest Act of Union Busting in American History’

Members of the American Federation of Government Employees protest against firings during a rally in Washington, DC on February 11, 2025 | Nathan Posner/Anadolu

US labor leaders on Thursday celebrated the House of Representatives’ bipartisan vote in favor of a bill that would reverse President Donald Trump’s attack on the collective bargaining rights of 1 million federal workers .

Trump’s sweeping assault on federal workers has included March and August executive orders targeting their rights under the guise of protecting national security. In response, Congressmen Jared Golden (D-Maine) and Brian Fitzpatrick (R-Pa.) spearheaded the fight for the Protect America’s Workforce Act . They recently collected enough signatures to force the 231-195 vote, in which 20 Republicans joined all Democrats present to send the bill to the Senate.

“The right to be heard in one’s workplace may appear basic, but it carries great weight—it ensures that the people who serve our nation have a seat at the table when decisions shape their work and their mission,” Fitzpatrick said after the vote.

“This bill moves us closer to restoring that fundamental protection for nearly 1 million federal employees, many of them veterans,” he added. “I will always fight for our workers, and I call on the Senate to help ensure these protections are fully reinstated.”

American Federation of Labor and Congress of Industrial Organizations (AFL-CIO) president Liz Shuler joined union leaders in applauding the lower chamber on Thursday and calling on the Senate to follow suit. She said in a statement that “President Trump betrayed workers when he tried to rip away our collective bargaining rights. In these increasingly polarized times, working people delivered a rare bipartisan majority to stop the administration’s unprecedented attacks on our freedoms.”

“We commend the Republicans and Democrats who stood with workers and voted to reverse the single-largest act of union busting in American history,” she continued. “Americans trust unions more than either political party. As we turn to the Senate—where the bill already has bipartisan support—working people are calling on the politicians we elected to stand with us, even if it means standing up to the union-busting boss in the White House.”

Everett Kelley, national president of the American Federation of Government Employees, the largest federal workers union, similarly praised the members of Congress who “demonstrated their support for the nonpartisan civil service, for the dedicated employees who serve our country with honor and distinction, and for the critical role that collective bargaining has in fostering a safe, protective, and collaborative workplace.”

“This vote marks an historic achievement for the House’s bipartisan pro-labor majority, courageously led by Reps. Jared Golden of Maine and Brian Fitzpatrick of Pennsylvania ,” he said. “We need to build on this seismic victory in the House and get immediate action in the Senate—and also ensure that any future budget bills similarly protect collective bargaining rights for the largely unseen civil servants who keep our government running.”

American Federation of State, County, and Municipal Employees president Lee Saunders also applauded the House’s passage of “a bill that strengthens federal workers’ freedoms on the job so they can continue to keep our nation safe, healthy, and strong.”

“This bill not only provides workers’ critical protections from an administration that has spent the past year relentlessly attacking them,” he noted, “but it also ensures that our communities are served by the most qualified public service workers—not just those with the best political connections.”

Randy Erwin, the head of the National Federation of Federal Employees, declared that “this is an incredible testament to the strength of federal employees and the longstanding support for their fundamental right to organize and join a union.”

“The president cannot unilaterally strip working people of their constitutional freedom of association. In bipartisan fashion, Congress has asserted their authority to hold the president accountable for the biggest attack on workers that this country has ever seen,” he added, thanking the House supporters and pledging to work with “senators from both parties to ensure this bill is signed into law.”

Django: what’s new in 6.0

Lobsters
adamj.eu
2025-12-12 17:18:19
Comments...
Original Article
Django 6.0: codename “mosaic”

Django 6.0 was released today , starting another release cycle for the loved and long-lived Python web framework (now 20 years old!). It comes with a mosaic of new features, contributed to by many, some of which I am happy to have helped with. Below is my pick of highlights from the release notes .

Upgrade with help from django-upgrade

If you’re upgrading a project from Django 5.2 or earlier, please try my tool django-upgrade . It will automatically update old Django code to use new features, fixing some deprecation warnings for you, including five fixers for Django 6.0. (One day, I’ll propose django-upgrade to become an official Django project, when energy and time permit…)

Template partials

There are four headline features in Django 6.0, which we’ll cover before other notable changes, starting with this one:

The Django Template Language now supports template partials , making it easier to encapsulate and reuse small named fragments within a template file.

Partials are sections of a template marked by the new {% partialdef %} and {% endpartialdef %} tags. They can be reused within the same template or rendered in isolation. Let’s look at examples for each use case in turn.

Reuse partials within the same template

The below template reuses a partial called filter_controls within the same template. It’s defined once at the top of the template, then used twice later on. Using a partial allows the template to avoid repetition without pushing the content into a separate include file.

<section id=videos>
  {% partialdef filter_controls %}
    <form>
      {{ filter_form }}
    </form>
  {% endpartialdef %}

  {% partial filter_controls %}

  <ul>
    {% for video in videos %}
      <li>
        <h2>{{ video.title }}</h2>
        ...
      </li>
    {% endfor %}
  </ul>

  {% partial filter_controls %}
</section>

Actually, we can simplify this pattern further, by using the inline option on the partialdef tag, which causes the definition to also render in place:

<section id=videos>
  {% partialdef filter_controls inline %}
    <form>
      {{ filter_form }}
    </form>
  {% endpartialdef %}

  <ul>
    {% for video in videos %}
      <li>
        <h2>{{ video.title }}</h2>
        ...
      </li>
    {% endfor %}
  </ul>

  {% partial filter_controls %}
</section>

Reach for this pattern any time you find yourself repeating template code within the same template. Because partials can use variables, you can also use them to de-duplicate when rendering similar controls with different data.

Render partials in isolation

The below template defines a view_count partial that’s intended to be re-rendered in isolation. It uses the inline option, so when the whole template is rendered, the partial is included.

The page uses htmx , via my django-htmx package , to periodically refresh the view count, through the hx-* attributes. The request from htmx goes to a dedicated view that re-renders the view_count partial.

{% load django_htmx %}
<!doctype html>
<html>
  <body>
    <h1>{{ video.title }}</h1>
    <video width=1280 height=720 controls>
      <source src="{{ video.file.url }}" type="video/mp4">
      Your browser does not support the video tag.
    </video>

    {% partialdef view_count inline %}
    <section
      class=view-count
      hx-trigger="every 1s"
      hx-swap=outerHTML
      hx-get="{% url 'video-view-count' video.id %}"
    >
      {{ video.view_count }} views
    </section>
    {% endpartialdef %}

    {% htmx_script %}
  </body>
</html>

The relevant code for the two views could look like this:

from django.shortcuts import render


def video(request, video_id):
    ...
    return render(request, "video.html", {"video": video})


def video_view_count(request, video_id):
    ...
    return render(request, "video.html#view_count", {"video": video})

The initial video view renders the full template video.html . The video_view_count view renders just the view_count partial, by appending #view_count to the template name. This syntax is similar to how you’d reference an HTML fragment by its ID in a URL.

History

htmx was the main motivation for this feature, as promoted by htmx creator Carson Gross in a cross-framework review post . Using partials definitely helps maintain “Locality of behaviour” within your templates, easing authoring, debugging, and maintenance by avoiding template file sprawl.

Django’s support for template partials was initially developed by Carlton Gibson in the django-template-partials package , which remains available for older Django versions. The integration into Django itself was done in a Google Summer of Code project this year, worked on by student Farhan Ali and mentored by Carlton, in Ticket #36410 . You can read more about the development process in Farhan’s retrospective blog post . Many thanks to Farhan for authoring, Carlton for mentoring, and Natalia Bidart, Nick Pope, and Sarah Boyce for reviewing!

Tasks framework

The next headline feature we’re covering:

Django now includes a built-in Tasks framework for running code outside the HTTP request–response cycle. This enables offloading work, such as sending emails or processing data, to background workers.

Basically, there’s a new API for defining and enqueuing background tasks—very cool!

Background tasks are a way of running code outside of the request-response cycle. They’re a common requirement in web applications, used for sending emails, processing images, generating reports, and more.

Historically, Django has not provided any system for background tasks, and kind of ignored the problem space altogether. Developers have instead relied on third-party packages like Celery or Django Q2 . While these systems are fine, they can be complex to set up and maintain, and often don’t “go with the grain” of Django.

The new Tasks framework fills this gap by providing an interface to define background tasks, which task runner packages can then integrate with. This common ground allows third-party Django packages to define tasks in a standard way, assuming you’ll be using a compatible task runner to execute them.

Define tasks with the new @task decorator:

from django.tasks import task


@task
def resize_video(video_id): ...

…and enqueue them for background execution with the Task.enqueue() method:

from example.tasks import resize_video


def upload_video(request):
    ...
    resize_video.enqueue(video.id)
    ...

Execute tasks

At this time, Django does not include a production-ready task backend, only two that are suitable for development and testing:

  • ImmediateBackend - runs tasks synchronously, blocking until they complete.
  • DummyBackend - does nothing when tasks are enqueued, but allows them to be inspected later. Useful for tests, where you can assert that tasks were enqueued without actually running them.
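For local development, you could point the default backend at ImmediateBackend through the new TASKS setting (covered more fully below for django-tasks). Here’s a minimal sketch; the dotted backend path is my assumption based on the Django 6.0 docs, so double-check it against the release notes:

# settings.py (development/test): run tasks synchronously, in-process.
TASKS = {
    "default": {
        # Assumed dotted path for the built-in development backend.
        "BACKEND": "django.tasks.backends.immediate.ImmediateBackend",
    },
}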

For production use, you’ll need to use a third-party package that implements one, for which django-tasks , the reference implementation, is the primary option. It provides DatabaseBackend for storing tasks in your SQL database, a fine solution for many projects, avoiding extra infrastructure and allowing atomic task enqueuing within database transactions. We may see this backend merged into Django in due course, or at least become an official package, to help make Django “batteries included” for background tasks.

To use django-tasks’ DatabaseBackend today, first install the package:
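
$ pip install django-tasks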

Second, add these two apps to your INSTALLED_APPS setting:

INSTALLED_APPS = [
    # ...
    "django_tasks",
    "django_tasks.backends.database",
    # ...
]

Third, configure DatabaseBackend as your tasks backend in the new TASKS setting :

TASKS = {
    "default": {
        "BACKEND": "django_tasks.backends.database.DatabaseBackend",
    },
}

Fourth, run migrations to create the necessary database tables:
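
$ ./manage.py migrate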

Finally, to run the task worker process, use the package’s db_worker management command:

$ ./manage.py db_worker
Starting worker worker_id=jWLMLrms3C2NcUODYeatsqCFvd5rK6DM queues=default

This process runs indefinitely, polling for tasks and executing them, logging events as it goes:

Task id=10b794ed-9b64-4eed-950c-fcc92cd6784b path=example.tasks.echo state=RUNNING
Hello from test task!
Task id=10b794ed-9b64-4eed-950c-fcc92cd6784b path=example.tasks.echo state=SUCCEEDED

You’ll want to run db_worker in production, and also in development if you want to test background task execution.
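A nice consequence of the database-backed approach, mentioned above, is transactional enqueueing: with a backend like DatabaseBackend, a task enqueued inside transaction.atomic() only becomes visible to the worker if the surrounding transaction commits. Here’s a minimal sketch of my own, assuming the resize_video task from earlier and a hypothetical Video model with a published flag:

from django.db import transaction

from example.models import Video  # hypothetical model, as in earlier examples
from example.tasks import resize_video


def publish_video(video_id):
    # If anything in this block raises, the published flag and the enqueued
    # task row are rolled back together.
    with transaction.atomic():
        video = Video.objects.get(pk=video_id)
        video.published = True
        video.save()
        resize_video.enqueue(video.id)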

History

It’s been a long path to get the Tasks framework into Django, and I’m super excited to see it finally available in Django 6.0. Jake Howard started on the idea for Wagtail, a Django-powered CMS, back in 2021, as they have a need for common task definitions across their package ecosystem. He upgraded the idea to target Django itself in 2024, when he proposed DEP 0014 . As a member of the Steering Council at the time, I had the pleasure of helping review and accept the DEP.

Since then, Jake has been leading the implementation effort, building pieces first in the separate django-tasks package before preparing them for inclusion in Django itself. This step was done under Ticket #35859 , with a pull request that took nearly a year to review and land. Thanks to Jake for his perseverance here, and to all reviewers: Andreas Nüßlein, Dave Gaeddert, Eric Holscher, Jacob Walls, Jake Howard, Kamal Mustafa, @rtr1, @tcely, Oliver Haas, Ran Benita, Raphael Gaschignard, and Sarah Boyce.

Read more about this feature and story in Jake’s post celebrating when it was merged .

Content Security Policy support

Our third headline feature:

Built-in support for the Content Security Policy (CSP) standard is now available, making it easier to protect web applications against content injection attacks such as cross-site scripting (XSS). CSP allows declaring trusted sources of content by giving browsers strict rules about which scripts, styles, images, or other resources can be loaded.

I’m really excited about this, because I’m a bit of a security nerd who’s been deploying CSP for client projects for years.

CSP is a security standard that can protect your site from cross-site scripting (XSS) and other code injection attacks. You set a content-security-policy header to declare which content sources are trusted for your site, and then browsers will block content from other sources. For example, you might declare that only scripts from your domain are allowed, so an attacker who manages to inject a <script> tag pointing to evil.com would be thwarted, as the browser would refuse to load it.

Previously, Django had no built-in support for CSP, and developers had to rely on building their own, or using a third-party package like the very popular django-csp . But this was a little bit inconvenient, as it meant that other third-party packages couldn’t reliably integrate with CSP, as there was no common API to do so.

The new CSP support provides all the core features that django-csp did, with a slightly tidier and more Djangoey API. To get started, first add ContentSecurityPolicyMiddleware to your MIDDLEWARE setting:

MIDDLEWARE = [
    # ...
    "django.middleware.csp.ContentSecurityPolicyMiddleware",
    # ...
]

Place it next to SecurityMiddleware , as it similarly adds security-related headers to all responses. (You do have SecurityMiddleware enabled, right?)

Second, configure your CSP policy using the new settings:

  • SECURE_CSP to configure the content-security-policy header, which is your actively enforced policy.
  • SECURE_CSP_REPORT_ONLY to configure the content-security-policy-report-only header, which sets a non-enforced policy for which browsers report violations to a specified endpoint. This option is useful for testing and monitoring a policy before enforcing it.

For example, to adopt the nonce-based strict CSP recommended by web.dev , you could start with the following setting:

from django.utils.csp import CSP

SECURE_CSP_REPORT_ONLY = {
    "script-src": [CSP.NONCE, CSP.STRICT_DYNAMIC],
    "object-src": [CSP.NONE],
    "base-uri": [CSP.NONE],
}

The CSP enum used above provides constants for CSP directives, to help avoid typos.

This policy is quite restrictive and will break most existing sites if deployed as-is, because it requires nonces, as covered next. That’s why the example shows starting with the report-only mode header, to help track down places that need fixing before enforcing the policy. Once those are fixed, you’d move the policy to the SECURE_CSP setting to enforce it.

Anyway, those are the two basic steps to set up the new CSP support!

Nonce generation

A key part of the new feature is that nonce generation is now built-in to Django, when using the CSP middleware. Nonces are a security feature in CSP that allow you to mark specific <script> and <style> tags as trusted with a nonce attribute:

<script src=/static/app.js type=module nonce=55vsH4w7ATHB85C3MbPr_g></script>

The nonce value is randomly generated per-request, and included in the CSP header. An attacker performing content injection couldn’t guess the nonce, so browsers can trust only those tags that include the correct nonce. Because nonce generation is now part of Django, third-party packages can depend on it for their <script> and <style> tags and they’ll continue to work if you adopt CSP with nonces.

Nonces are the recommended way to use CSP today, avoiding problems with previous allow-list based approaches. That’s why the above recommended policy enables them. To adopt a nonce-based policy, you’ll need to annotate your <script> and <style> tags with the nonce value through the following steps.

First, add the new csp template context processor to your TEMPLATES setting:

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "OPTIONS": {
            "context_processors": [
                # ...
                "django.template.context_processors.csp",
            ],
        },
    },
]

Second, annotate your <script> and <style> tags with nonce="{{ csp_nonce }}" :

-   <script src="{% static 'app.js' %}" type="module"></script>
+   <script src="{% static 'app.js' %}" type="module" nonce="{{ csp_nonce }}"></script>

This can be tedious and error-prone, hence using the report-only mode first to monitor violations might be useful, especially on larger projects.

Anyway, deploying CSP right would be another post in itself, or even a book chapter, so we’ll stop here for now. For more info, check out that web.dev article and the MDN CSP guide .

History

CSP itself was proposed for browsers way back in 2004, and was first implemented in Mozilla Firefox version 4, released in 2011. That same year, Django Ticket #15727 was opened, proposing adding CSP support to Django. Mozilla created django-csp in 2010, before the first public availability of CSP, using it on their own Django-powered sites. The first comment on Ticket #15727 pointed to django-csp, and the community basically rolled with it as the de facto solution.

Over the years, CSP itself evolved, as did django-csp, with Rob Hudson ending up as its maintainer. Focusing on the package motivated him to finally get CSP into Django itself. He made a draft PR and posted on Ticket #15727 in 2024, which I enjoyed helping review. He iterated on the PR over the next 13 months until it was finally merged for Django 6.0. Thanks to Rob for his heroic dedication here, and to all reviewers: Benjamin Balder Bach, Carlton Gibson, Collin Anderson, David Sanders, David Smith, Florian Apolloner, Harro van der Klauw, Jake Howard, Natalia Bidart, Paolo Melchiorre, Sarah Boyce, and Sébastien Corbin.

Email API updates

The fourth and final headline feature:

Email handling in Django now uses Python’s modern email API, introduced in Python 3.6. This API, centered around the email.message.EmailMessage class, offers a cleaner and Unicode-friendly interface for composing and sending emails.

This is a major change, but it’s unlikely to affect projects using basic email features. You can still use Django’s send_mail() function and EmailMessage class as before, like:

from django.core.mail import EmailMessage

email = EmailMessage(
    subject="🐼 Need more bamboo",
    body="We are desperately low, please restock before the pandas find out!",
    from_email="zookeeper@example.com",
    to=["supplies@example.com"],
)
email.attach_file("/media/bamboo_cupboard.jpg")
email.send()

The key change is that, under the hood, when you call send() on a Django EmailMessage object, it now translates itself into Python’s newer email.message.EmailMessage type before sending.

Modernizing provides these benefits:

  1. Fewer bugs - many edge case bugs in Python’s old email API have been fixed in the new one.
  2. Django is less hacky - a bunch of workarounds and security fixes in Django‘s email code have been removed.
  3. More convenient API - the new API supports some niceties, like the below inline attachment example.

Easier inline attachments with MIMEPart

Django’s EmailMessage.attach() method allows you to attach a file as an attachment. Emails support images as inline attachments , which can be displayed within the HTML email body.

While you could previously use EmailMessage.attach() to add inline attachments, it was a bit fiddly, using a legacy class. Now, you can call the method with a Python email.message.MIMEPart object to add an inline attachment in a few steps:

import email.utils
from email.message import MIMEPart
from django.core.mail import EmailMultiAlternatives

message = EmailMultiAlternatives(
    subject="Cute Panda Alert",
    body="Here's a cute panda picture for you!",
    from_email="cute@example.com",
    to=["fans@example.com"],
)
with open("panda.jpg", "rb") as f:
    panda_jpeg = f.read()

cid = email.utils.make_msgid()
inline_image = MIMEPart()
inline_image.set_content(
    panda_jpeg,
    maintype="image",
    subtype="jpeg",
    disposition="inline",
    cid=cid,
)
message.attach(inline_image)
message.attach_alternative(
    f'<h1>Cute panda baby alert!</h1><img src="cid:{cid[1:-1]}">',
    "text/html",
)

It’s not the simplest API, but it does expose all the power of the underlying email system, and it’s better than the past situation.

History

The new email API was added to Python as provisional in version 3.4 (2014) , and made stable in version 3.6 (2016) . The legacy API, however, was never planned for deprecation, so there was never any deadline to upgrade Django’s email handling.

In 2024, Mike Edmunds posted on the (old) django-developers mailing list , proposing the upgrade with strong reasoning and planning. This conversation led to Ticket #35581 , which he worked on for eight months until it was merged. Many thanks to Mike for leading this effort, and to Sarah Boyce for reviewing! Email is not a glamorous feature, but it’s a critical communication channel for nearly every Django project, so props for this.

Positional arguments in django.core.mail APIs

We’re now out of the headline features and onto the “minor” changes, starting with this deprecation related to the above email changes:

django.core.mail APIs now require keyword arguments for less commonly used parameters. Using positional arguments for these now emits a deprecation warning and will raise a TypeError when the deprecation period ends:

  • All optional parameters ( fail_silently and later) must be passed as keyword arguments to get_connection() , mail_admins() , mail_managers() , send_mail() , and send_mass_mail() .
  • All parameters must be passed as keyword arguments when creating an EmailMessage or EmailMultiAlternatives instance, except for the first four ( subject , body , from_email , and to ), which may still be passed either as positional or keyword arguments.

Previously, Django would let you pass all parameters positionally, which gets a bit silly and hard to read with long parameter lists, like:

from django.core.mail import send_mail

send_mail(
    "🐼 Panda of the week",
    "This week’s panda is Po Ping, sha-sha booey!",
    "updates@example.com",
    ["adam@example.com"],
    True,
)

The final True doesn’t provide any clue what it means without looking up the function signature. Now, using positional arguments for those less-commonly-used parameters raises a deprecation warning, nudging you to write:

from django.core.mail import send_mail

send_mail(
    subject="🐼 Panda of the week",
    message="This week’s panda is Po Ping, sha-sha booey!",
    from_email="updates@example.com",
    recipient_list=["adam@example.com"],
    fail_silently=True,
)

This change is appreciated for API clarity, and Django is generally moving towards using keyword-only arguments more often. django-upgrade can automatically fix this one for you, via its mail_api_kwargs fixer .

Thanks to Mike Edmunds, again, for making this improvement in Ticket #36163 .

Extended automatic shell imports

Next up:

Common utilities, such as django.conf.settings, are now automatically imported to the shell by default.

One of the headline features back in Django 5.2 was automatic model imports in the shell , making ./manage.py shell import all of your models automatically. Building on that DX boost, Django 6.0 now also imports other common utilities, for which we can find the full list by running ./manage.py shell with -v 2 :

$ ./manage.py shell -v 2
6 objects imported automatically:

  from django.conf import settings
  from django.db import connection, models, reset_queries
  from django.db.models import functions
  from django.utils import timezone

...

(This is from a project without any models, so only the utilities are listed.)

So that’s:

  • settings , useful for checking your runtime configuration:

    In [1]: settings.DEBUG
    Out[1]: False
    
  • connection and reset_queries() , great for checking the executed queries :

    In [1]: Book.objects.select_related('author')
    Out[1]: <QuerySet []>
    
    In [2]: connection.queries
    Out[2]:
    [{'sql': 'SELECT "example_book"."id", "example_book"."title", "example_book"."author_id", "example_author"."id", "example_author"."name" FROM "example_book" INNER JOIN "example_author" ON ("example_book"."author_id" = "example_author"."id") LIMIT 21',
      'time': '0.000'}]
    
  • models and functions , useful for advanced ORM work:

    In [1]: Book.objects.annotate(
       ...:   title_lower=functions.Lower("title")
       ...: ).filter(
       ...:   title_lower__startswith="a"
       ...: ).count()
    Out[1]: 71
    
  • timezone , useful for using Django’s timezone-aware date and time utilities:

    In [1]: timezone.now()
    Out[1]: datetime.datetime(2025, 12, 1, 23, 42, 22, 558418, tzinfo=datetime.timezone.utc)
    

It remains possible to extend the automatic imports with whatever you’d like, as documented in How to customize the shell command documentation page.
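For example, here’s a rough sketch of that documented approach: subclass the shell command in one of your apps and extend get_auto_imports() (the extra dotted paths below are just illustrative):

# yourapp/management/commands/shell.py
from django.core.management.commands import shell


class Command(shell.Command):
    def get_auto_imports(self):
        # Dotted paths imported automatically, on top of Django’s defaults.
        return super().get_auto_imports() + [
            "example.models.Video",  # hypothetical model path
            "django.urls.reverse",
        ]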

Salvo Polizzi contributed the original automatic shell imports feature in Django 5.2. He’s then returned to offer these extra imports for Django 6.0, in Ticket #35680 . Thanks to everyone that contributed to the forum discussion agreeing on which imports to add, and to Natalia Bidart and Sarah Boyce for reviewing!

Dynamic field refresh on save()

Now let’s discuss a series of ORM improvements, starting with this big one:

GeneratedField s and fields assigned expressions are now refreshed from the database after save() on backends that support the RETURNING clause (SQLite, PostgreSQL, and Oracle). On backends that don’t support it (MySQL and MariaDB), the fields are marked as deferred to trigger a refresh on subsequent accesses.

Django models support having the database generate field values for you in three cases:

  1. The db_default field option, which lets the database generate the default value when creating an instance:

    from django.db import models
    from django.db.models.functions import Now
    
    
    class Video(models.Model):
        ...
        created = models.DateTimeField(db_default=Now())
    
  2. The GeneratedField field type, which is always computed by the database based on other fields in the same instance:

    from django.db import models
    from django.db.models.functions import Concat
    
    
    class Video(models.Model):
        ...
        full_title = models.GeneratedField(
            models.TextField(),
            expression=Concat(
                "title",
                models.Value(" - "),
                "subtitle",
            ),
        )
    
  3. Assigning expression values to fields before saving:

    from django.db import models
    from django.db.models.functions import Now
    
    
    class Video(models.Model):
        ...
        last_updated = models.DateTimeField()
    
    
    video = Video.objects.get(id=1)
    ...
    video.last_updated = Now()
    video.save()
    

Previously, only the first method, using db_default , would refresh the field value from the database after saving. The other two methods would leave you with only the old value or the expression object, meaning you’d need to call Model.refresh_from_db() to get any updated value if necessary. This was hard to remember and cost an extra database query.

Now Django takes advantage of the RETURNING SQL clause to save the model instance and fetch updated dynamic field values in a single query, on backends that support it (SQLite, PostgreSQL, and Oracle). A save() call may now issue a query like:

UPDATE "example_video"
SET "last_updated" = NOW()
WHERE "example_video"."id" = 1
RETURNING "example_video"."last_updated"

Django puts the return value into the model field, so you can read it immediately after saving:

video = Video.objects.get(id=1)
...
video.last_updated = Now()
video.save()
print(video.last_updated)  # Updated value from the database

On backends that don’t support RETURNING (MySQL and MariaDB), Django now marks the dynamic fields as deferred after saving. That way, the later access, as in the above example, will automatically call Model.refresh_from_db() . This ensures that you always read the updated value, even if it costs an extra query.

History

This feature was proposed in Ticket #27222 way back in 2016, by Anssi Kääriäinen. It sat dormant for most of the nine years since, but ORM boss Simon Charette picked it up earlier this year, found an implementation, and pushed it through to completion. Thanks to Simon for continuing to push the ORM forward, and to all reviewers: David Sanders, Jacob Walls, Mariusz Felisiak, nessita, Paolo Melchiorre, Simon Charette, and Tim Graham.

Universal StringAgg aggregate

The next ORM change:

The new StringAgg aggregate returns the input values concatenated into a string, separated by the delimiter string. This aggregate was previously supported only for PostgreSQL.

This aggregate is often used for making comma-separated lists of related items, among other things. Previously, it was only supported on PostgreSQL, as part of django.contrib.postgres :

from django.contrib.postgres.aggregates import StringAgg
from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=","),
)

for video in videos:
    print(f"Video {video.id} has chapters: {video.chapter_ids}")

…which might give you output like:

Video 104 has chapters: 71,72,74
Video 107 has chapters: 88,89,138,90,91,93

Now this aggregate is available on all database backends supported by Django, imported from django.db.models :

from django.db.models import StringAgg, Value
from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=Value(",")),
)

for video in videos:
    print(f"Video {video.id} has chapters: {video.chapter_ids}")

Note the delimiter argument now requires a Value() expression wrapper for literal strings, as above. This change allows you to use database functions or fields as the delimiter if desired.
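The aggregate also accepts the order_by option mentioned in the history below, which controls the order in which values are concatenated. A small sketch of my own, reusing the Video model from above:

from django.db.models import StringAgg, Value
from example.models import Video

# Concatenate chapter IDs in ascending chapter order.
videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=Value(","), order_by="chapter"),
)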

While most Django projects stick to PostgreSQL, having this aggregate available on all backends is a nice improvement for cross-database compatibility, and it means third-party packages can use it without affecting their database support.

History

The PostgreSQL-specific StringAgg was added way back in Django 1.9 (2015) by Andriy Sokolovskiy, in Ticket #24301 . In Ticket #35444 , Chris Muthig proposed adding the Aggregate.order_by option, something used by StringAgg to specify the ordering of concatenated elements, and as a side effect this made it possible to generalize StringAgg to all backends.

Thanks to Chris for proposing and implementing this change, and to all reviewers: Paolo Melchiorre, Sarah Boyce, and Simon Charette.

BigAutoField as the default primary key type

Next up:

DEFAULT_AUTO_FIELD setting now defaults to BigAutoField

This important change helps lock in scalable larger primary keys.

Django 3.2 (2021) introduced the DEFAULT_AUTO_FIELD setting for changing the default primary key type used in models. Django uses this setting to add a primary key field called id to models that don’t explicitly define a primary key field. For example, if you define a model like this:

from django.db import models


class Video(models.Model):
    title = models.TextField()

…then it will have two fields: id and title , where id uses the type defined by DEFAULT_AUTO_FIELD .

The setting can also be overridden on a per-app basis by defining AppConfig.default_auto_field in the app’s apps.py file:

from django.apps import AppConfig


class ChannelConfig(AppConfig):
    name = "channel"
    default_auto_field = "django.db.models.BigAutoField"

A key motivation for adding the setting was to allow projects to switch from AutoField (a 32-bit integer) to BigAutoField (a 64-bit integer) for primary keys, without needing changes to every model. AutoField can store values up to about 2.1 billion, which sounds large but it becomes easy to hit at scale. BigAutoField can store values up to about 9.2 quintillion, which is “more than enough” for every practical purpose.
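Those ceilings are simply the signed 32-bit and 64-bit integer maxima, which you can sanity-check in Python:

# AutoField maps to a signed 32-bit integer column, BigAutoField to a signed
# 64-bit one, on the common database backends.
print(2**31 - 1)  # 2147483647 -> AutoField ceiling, roughly 2.1 billion
print(2**63 - 1)  # 9223372036854775807 -> BigAutoField ceiling, roughly 9.2 quintillion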

If a model using AutoField hits its maximum value, it can no longer accept new rows, a problem known as primary key exhaustion . The table is effectively blocked, requiring an urgent fix to switch the model from AutoField to BigAutoField via a locking database migration on a large table. For a great watch on how Kraken is fixing this problem, see Tim Bell’s DjangoCon Europe 2025 talk , detailing some clever techniques to proactively migrate large tables with minimal downtime.

To stop this problem arising for new projects, Django 3.2 made new projects created with startproject set DEFAULT_AUTO_FIELD to BigAutoField , and new apps created with startapp set their AppConfig.default_auto_field to BigAutoField . It also added a system check to ensure that projects set DEFAULT_AUTO_FIELD explicitly, to ensure users were aware of the feature and could make an informed choice.

Now Django 6.0 changes the actual default values of the setting and app config attribute to BigAutoField . Projects using BigAutoField can remove the setting:

-DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"

…and app config attribute:

from django.apps import AppConfig

 class ChannelConfig(AppConfig):
     name = "channel"
-    default_auto_field = "django.db.models.BigAutoField"

The default startproject and startapp templates also no longer set these values. This change reduces the amount of boilerplate in new projects, and the problem of primary key exhaustion can fade into history, becoming something that most Django users no longer need to think about.

History

The addition of DEFAULT_AUTO_FIELD in Django 3.2 was proposed by Caio Ariede and implemented by Tom Forbes, in Ticket #31007 . This new change in Django 6.0 was proposed and implemented by ex-Fellow Tim Graham, in Ticket #36564 . Thanks to Tim for spotting that this cleanup was now possible, and to Jacob Walls and Clifford Gama for reviewing!

Template variable forloop.length

Moving on to templates, let’s start with this nice little addition:

The new variable forloop.length is now available within a for loop.

This small extension makes it possible to write a template loop like this:

<ul>
  {% for goose in geese %}
    <li>
      <strong>{{ forloop.counter }}/{{ forloop.length }}</strong>: {{ goose.name }}
    </li>
  {% endfor %}
</ul>

Previously, you’d need to refer to the length in another way, like {{ geese|length }} , which is a bit less flexible.

Thanks to Jonathan Ströbele for contributing this idea and implementation in Ticket #36186 , and to David Smith, Paolo Melchiorre, and Sarah Boyce for reviewing.

querystring template tag enhancements

There are two extensions to the querystring template tag , which was added in Django 5.1 to help with building links that modify the current request’s query parameters.

  1. Release note:

    The querystring template tag now consistently prefixes the returned query string with a ? , ensuring reliable link generation behavior.

    This small change improves how the tag behaves when an empty mapping of query parameters is provided. Say you had a template like this:

    <a href="{% querystring params %}">Reset search</a>
    

    …where params is a dictionary that may sometimes be empty. Previously, if params was empty, the output would be:

    <a href="">Reset search</a>
    

    Browsers treat this as a link to the same URL including the query parameters , so it would not clear the query parameters as intended. Now, with this change, the output will be:

    <a href="?">Reset search</a>
    

    Browsers treat ? as a link to the same URL without any query parameters , clearing them as the user would expect.

    Thanks to Django Fellow Sarah Boyce for spotting this improvement and implementing the fix in Ticket #36268 , and for Django Fellow Natalia Bidart for reviewing!

  2. Release note:

    The querystring template tag now accepts multiple positional arguments, which must be mappings, such as QueryDict or dict .

    This enhancement allows the tag to merge multiple sources of query parameters when building the output. For example, you might have a template like this:

    <a href="{% querystring request.GET super_search_params %}">Super search</a>
    

    …where super_search_params is a dictionary of extra parameters to add to make the current search “super”. The tag merges the two mappings, with later mappings taking precedence for duplicate keys.

    Thanks again to Sarah Boyce for proposing this improvement in Ticket #35529 , to Giannis Terzopoulos for implementing it, and to Natalia Bidart, Sarah Boyce, and Tom Carrick for reviewing!

Fin

That’s a wrap! Thank you for reading my highlights. There are plenty more changes to read about in the release notes .

Also, there are always many more behind-the-scenes improvements and bug fixes that don’t make it into the release notes. Optimizations and micro-improvements get merged all the time, so don’t delay, upgrade today!

Thank you to all 174 people who contributed to Django 6.0, as counted in this list by Mariusz Felisiak.

May your upgrade be swift, smooth, safe, and secure,

—Adam



Fake ‘One Battle After Another’ torrent hides malware in subtitles

Bleeping Computer
www.bleepingcomputer.com
2025-12-12 17:12:47
A fake torrent for Leonardo DiCaprio's 'One Battle After Another' hides malicious PowerShell malware loaders inside subtitle files that ultimately infect devices with the Agent Tesla RAT malware. [...]...
Original Article


A fake torrent for Leonardo DiCaprio’s 'One Battle After Another' hides malicious PowerShell malware loaders inside subtitle files that ultimately infect devices with the Agent Tesla RAT malware.

The malicious torrent file was discovered by Bitdefender researchers while investigating a spike in detections related to the movie.

One Battle After Another is a highly rated Paul Thomas Anderson movie released on September 26, 2025, starring Leonardo DiCaprio, Sean Penn, and Benicio del Toro.

Cybercriminals taking advantage of interest around new movies by uploading malicious torrents isn't anything new, but Bitdefender notes this case stands out for its unusually complex and stealthy infection chain.

"It's impossible to estimate how many people downloaded the files, but we saw that the supposed movie had thousands of seeders and leechers," explained Bitdefender .

Launching malware from subtitles

The downloaded One Battle After Another movie torrent used in the attacks contains various files, including a movie file (One Battle After Another.m2ts), two image files (Photo.jpg, Cover.jpg), a subtitles file (Part2.subtitles.srt), and a shortcut file (CD.lnk) that appears as a movie launcher.

When the CD shortcut is executed, it launches Windows commands that extract and run a malicious PowerShell script embedded in the subtitle file between lines 100 and 103.

Malicious PowerShell script hidden in subtitles

This PowerShell script will then extract numerous AES-encrypted data blocks from the subtitles file again to reconstruct five PowerShell scripts that are dropped to 'C:\Users\<USER>\AppData\Local\Microsoft\Diagnostics.'

Other encrypted PowerShell commands in the subtitles (Source: BleepingComputer)

The extracted PowerShell scripts act as a malware dropper, performing the following actions on the host:

  • Stage 1 – Extracts the One Battle After Another.m2ts file as an archive using any available extractor.
  • Stage 2 – Creates a hidden scheduled task (RealtekDiagnostics) that runs RealtekCodec.bat
  • Stage 3 – Decodes embedded binary data from Photo.jpg and writes restored files to the Windows Sound Diagnostics Cache directory.
  • Stage 4 – Ensures %LOCALAPPDATA%\Packages\Microsoft.WindowsSoundDiagnostics\Cache exists.
  • Stage 5 – Extracts Cover.jpg contents into the Cache directory, including batch files and PowerShell scripts.

The files extracted in the final stage are used to check whether Windows Defender is active, install Go, extract the final payload (Agent Tesla), and load it directly into memory.

Agent Tesla is a long-running (since 2014) Windows RAT and information stealer, commonly used to steal browser, email, FTP, and VPN credentials, as well as to capture screenshots.

While Agent Tesla is not new, it remains widely used due to its reliability and ease of deployment.

Bitdefender has noted that with other movie titles, for example 'Mission: Impossible – The Final Reckoning,' it has observed other malware families in use, such as Lumma Stealer.

Torrent files from anonymous publishers often contain malware, so it is recommended that users avoid pirating new movies entirely for safety.


Show HN: tomcp.org – Turn any URL into an MCP server

Hacker News
github.com
2025-12-12 17:10:16
Comments...
Original Article

toMCP

Turn any website into an MCP server + Chat with any website.

Convert any website URL into an MCP (Model Context Protocol) server config for your AI tools, or chat directly with any website's content.

Usage

MCP Server

Simply add tomcp.org/ before any URL:

tomcp.org/docs.stripe.com
tomcp.org/react.dev
tomcp.org/your-docs.com/api

Chat with Website

Visit tomcp.org , paste a URL, and start chatting with any website's content using AI.

Supported AI Tools

  • Cursor - ~/.cursor/mcp.json
  • Claude Desktop - ~/.claude/claude_desktop_config.json
  • Windsurf - ~/.codeium/windsurf/mcp_config.json
  • VS Code - .vscode/mcp.json
  • Cline - ~/.cline/mcp_settings.json

How It Works

MCP Config

  1. Visit tomcp.org
  2. Enter any website URL
  3. Select your AI tool
  4. Copy the generated MCP config
  5. Add it to your tool's config file
  6. Restart your AI tool

Chat

  1. Visit tomcp.org
  2. Paste any website URL
  3. Click "Start Chat"
  4. Ask questions about the website's content

Example Config

{
  "mcpServers": {
    "docs-stripe-com": {
      "url": "https://tomcp.org/docs.stripe.com"
    }
  }
}

Chat API

curl -X POST https://tomcp.org/chat \
  -H "Content-Type: application/json" \
  -d '{"url": "docs.stripe.com", "message": "How do I create a payment intent?"}'

AI Models

Free Models (No API Key Required)

These models are available for everyone with no setup:

  • Llama 3.1 8B (Meta) - Default model, fast and capable
  • Hermes 2 Pro (NousResearch) - Great for reasoning
  • Mistral 7B (Mistral) - Efficient instruction-following
  • Gemma 7B LoRA (Google) - Lightweight and fast

Premium Models (API Key Required)

Add your Cloudflare Workers AI API key to unlock these models:

  • Llama 3.3 70B (Meta) - Most powerful Llama model
  • DeepSeek R1 32B (DeepSeek) - Advanced reasoning
  • Mistral Large (Mistral) - Enterprise-grade
  • Gemma 3 12B (Google) - Latest Gemma
  • GPT OSS 120B/20B (OpenAI) - Open-source GPT variants

Adding Your API Key

You can add your own Cloudflare Workers AI API key to:

  1. Unlock all premium models - Access larger, more capable models
  2. Bypass rate limits - No daily request limits
  3. Use your own quota - Charges go to your Cloudflare account

How to Get an API Key

  1. Go to Cloudflare Workers AI
  2. Create an API token with Workers AI permissions
  3. Copy the token

How to Add Your Key

  1. Start a chat session on tomcp.org
  2. Below the chat input, you'll see "Add API key from Cloudflare Workers AI"
  3. Paste your API key and click "Save"
  4. Premium models will now be unlocked in the dropdown

Where Is the API Key Stored?

  • Your API key is stored locally in your browser using localStorage
  • Key name: tomcp_api_key
  • The key is sent with each chat request but never stored on our servers
  • You can remove it anytime by clicking "Remove" in the API key section

How It Works (Technical)

Model Fetching

The available models are fetched dynamically from the Cloudflare Workers AI API:

  1. Frontend calls GET /models endpoint on page load
  2. Worker fetches models from api.cloudflare.com/client/v4/accounts/{id}/ai/models/search
  3. Models are filtered to "Text Generation" tasks and cached for 5 minutes
  4. Frontend displays free models as enabled, premium models as disabled (until API key is added)

Chat Flow

  1. User enters a URL and starts chatting
  2. Worker fetches the website content and converts HTML to Markdown
  3. Content is sent to the selected AI model with the user's message
  4. Response is returned to the user

Rate Limiting (Free Tier)

Without an API key:

  • 5 requests per IP per day

With your API key:

  • No rate limits (uses your Cloudflare account quota)

Tech Stack

  • Frontend : Vanilla HTML/CSS/JS with Tailwind CSS
  • Backend : Cloudflare Workers
  • AI : Cloudflare Workers AI (multiple models)

Features

  • Works with any public URL
  • No setup required - just paste the config
  • Free forever - powered by Cloudflare Workers
  • Chat with any website using AI
  • Side-by-side MCP Config + Chat interface
  • Multiple AI models - Choose from Llama, Mistral, Gemma, and more
  • Bring your own API key - Unlock premium models and bypass rate limits

License

Apache 2.0

Behind the Blog: Is This Headline 'Clickbait'?

404 Media
www.404media.co
2025-12-12 17:06:05
This week, we discuss conversational AI, a behind the scenes of the zine, and more....
Original Article

This is Behind the Blog, where we share our behind-the-scenes thoughts about how a few of our top stories of the week came together. This week, we discuss conversational AI, a behind the scenes of the zine, and more.

EMANUEL: I made the terrible mistake of looking at some Hacker News comments this week for my story about a developer whose Google accounts were banned after he uploaded training data to Google Drive. Unbeknownst to him, the training data contained CSAM .

As we’ve explained in previous stories, CSAM is a subject we dread covering not only because it’s one of the most awful things one could think about, but because it’s extremely difficult and legally risky. For understandable reasons, the laws around viewing, let alone possessing CSAM, are strict and punishing, which makes verification for reporting reasons challenging. For similar reasons, it’s something we need to write about very carefully, making sure we don’t wrongfully associate or whitewash someone when it comes to such horrible behavior.


Oracle made a $300B bet on OpenAI. It's paying the price

Hacker News
finance.yahoo.com
2025-12-12 17:01:07
Comments...
Original Article

Oracle might have an OpenAI problem.

Oracle ( ORCL ) stock has tumbled over 40% from its September peak, erasing more than $360 billion from its market capitalization. Nearly $67 billion of that decline occurred on Thursday alone, as Oracle’s second quarter results failed to assuage a key concern for investors — that the company is too heavily reliant on OpenAI ( OPAI.PVT ).

Oracle’s AI-fueled growth targets outlined in its first quarter sent the stock to a record on Sept. 10, briefly making its founder, Larry Ellison, the world's richest man . In September, the company told investors its remaining performance obligations (RPO) — or the value of its future revenue from customer contracts signed — had soared nearly 360% to $455 billion.

It was later revealed that ChatGPT developer OpenAI accounted for at least $300 billion of its customer commitments as part of the Stargate project. Since then, its stock has struggled.

Rising concerns about OpenAI’s mounting costs — set to hit $1.4 trillion due to its deal spree with firms including Nvidia ( NVDA ), CoreWeave ( CRWV ), AMD ( AMD ), and Broadcom ( AVGO ), in addition to Oracle — and increasing competition from Google's ( GOOG ) Gemini models have made investors even more wary.

"Clearly there's been a reversal in terms of the market's perception of OpenAI in the last couple of months," BNB Paribas analyst Stefan Slowinski told Yahoo Finance. “The OpenAI ecosystem obviously has been suffering as a result.”

Slowinski and other Wall Street analysts agree that OpenAI’s potential inability to pay for its wide-ranging AI infrastructure commitments is Oracle’s biggest risk.

Read more: How to protect your portfolio from an AI bubble

OpenAI CEO Sam Altman declared a “code red” last week as the upstart faces greater rivalry from Google, threatening its ability to monetize its AI products and meet its ambitious revenue targets .

"[Oracle is] in this tough situation where they have to build out [data center] capacity for this customer and borrow a lot of money to do that when there's a very high uncertainty this customer will be able to pay for that capacity," DA Davidson analyst Gil Luria said.

Oracle’s second quarter results this week only deepened investor concerns.

The company’s $12 billion in capital expenditures was higher than expected, just as its free cash flow loss of $10 billion was much heavier than the $6 billion outflow anticipated. Oracle also substantially hiked its full-year capital expenditures forecast to $50 billion from $35 billion.

Oracle office building in Irvine, Calif. (Reuters/Mike Blake)

Executives’ attempts to quell worries over the company's high debt load, rising costs, and dependence on OpenAI didn’t help.

Japan law opening phone app stores to go into effect Dec. 18

Hacker News
www3.nhk.or.jp
2025-12-12 16:59:04
Comments...
Original Article

A new Japanese law is going into effect that could loosen the dominance of tech giants over smartphone services. It aims to bring users greater choice for app stores and more.

Starting December 18, firms like Apple and Google will be prohibited from blocking third party app stores on iPhone and Android devices.

The law also aims to loosen their grip on web browsers and search. The firms will now be required to give first-time users multiple choices for default services. This also applies when people update their operating system.

The Fair Trade Commission says the changes will improve convenience by encouraging new market entrants.

But some public comments released by the commission expressed concern that the legislation could undermine user security.

Async DNS

Hacker News
flak.tedunangst.com
2025-12-12 16:52:41
Comments...
Original Article

curl experimented with using pthread_cancel to time out async DNS requests and it blew up . What else can we do?

Out of curiosity, I decided to review some alternatives and see how they work. My personal priorities are control over events; no background threads or signals or secret mechanisms.

getaddrinfo

The tried and true classic technique is to call getaddrinfo in a thread. Probably with more than one thread so you don’t get stuck behind a single slow request, but probably not boundless either. You can also use a separate process if you don’t use threads.

This is probably good enough for many uses.
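
As a rough illustration of that pattern (sketched in Python rather than C, purely to show the shape of the approach): hand the blocking lookup to a small, bounded worker pool and let the caller decide how long to wait.

import socket
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# A small, bounded pool so one slow lookup doesn't block the others.
pool = ThreadPoolExecutor(max_workers=4)

def resolve(name, timeout=2.0):
    future = pool.submit(socket.getaddrinfo, name, 80, type=socket.SOCK_STREAM)
    try:
        return [addr[4][0] for addr in future.result(timeout=timeout)]
    except TimeoutError:
        # The worker thread keeps running; we have only stopped waiting for it.
        return None

print(resolve("example.com"))

The caveat is the same as in the C approaches: the lookup itself is not cancelled, the caller has only stopped waiting for it.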

getaddrinfo_a

glibc provides getaddrinfo_a which basically does the thread dance for you. Some of it. It comes with some caveats, and it’s distinctly non portable, and probably doesn’t mesh with your idea of an event loop. Passing.

c-ares

c-ares is a standalone DNS library. It supports async queries via a threaded backend or an event driven system. I think the thread backend has the same issues, in that it uses a callback and then you need to push the results back into your application.

Alas, the event system uses lots of callbacks as well. This also includes some dire warnings in the documentation. “When the associated callback is called, it is called with a channel lock so care must be taken to ensure any processing is minimal to prevent DNS channel stalls.” Everyone knows the ideal callback just sets a flag, etc., but also everyone is inevitably tempted to do just one more thing, and hey look, it works fine, wait, why did it break. And thus I have a strong preference for library interfaces where you call into it, get some results, but any time you’re in your own code, you’re free to do what you want.

But worth a try. Based on the sample code I wrote the quickest dirtiest demo I could.

c-ares code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <poll.h>
#include <arpa/inet.h>

#include <ares.h>

struct server {
    char name[32];
    char ip[16];
    int status;
};

struct everything {
    struct server servers[1];
    int nservers;
    struct pollfd pfds[4];
    int npfds;
};

static void
addrinfo_cb(void *arg, int status, int timeouts, struct ares_addrinfo *result)
{
    struct server *server = arg;
    server->status = 3;
    if (!result)
        return;
    for (struct ares_addrinfo_node *node = result->nodes; node != NULL; node = node->ai_next) {
        if (node->ai_family == AF_INET) {
            struct sockaddr_in *in_addr = (void *)node->ai_addr;
            inet_ntop(node->ai_family, &in_addr->sin_addr, server->ip, sizeof(server->ip));
        }
    }
}

static void
socket_cb(void *arg, ares_socket_t fd, int readable, int writable)
{
    struct everything *state = arg;
    printf("socket: %d r/w: %d %d\n", fd, readable, writable);

    int idx = -1;
    for (int i = 0; i < 4; i++) {
        if (state->pfds[i].fd == fd) {
            idx = i;
            break;
        }
    }
    if (idx == -1) {
        for (int i = 0; i < 4; i++) {
            if (state->pfds[i].fd == -1) {
                idx = i;
                state->pfds[idx].fd = fd;
                state->npfds++;
                break;
            }
        }
    }
    if (idx == -1)
        abort();

    if (!readable && !writable) {
        state->pfds[idx].fd = -1;
        state->npfds--;
        return;
    }
    state->pfds[idx].fd = fd;
    state->pfds[idx].events = 0;
    if (readable)
        state->pfds[idx].events |= POLLIN;
    if (writable)
        state->pfds[idx].events |= POLLOUT;
}

int
main(int argc, char **argv)
{
    struct everything state;
    memset(&state, 0, sizeof(state));
    strlcpy(state.servers[0].name, argv[1], sizeof(state.servers[0].name));
    state.servers[0].status = 1;
    state.nservers = 1;
    for (int i = 0; i < 4; i++)
        state.pfds[i].fd = -1;

    ares_library_init(ARES_LIB_INIT_ALL);

    struct ares_options options;
    memset(&options, 0, sizeof(options));
    int optmask = 0;
    options.flags = ARES_FLAG_EDNS | ARES_FLAG_DNS0x20;
    optmask |= ARES_OPT_FLAGS;
    options.sock_state_cb = socket_cb;
    options.sock_state_cb_data = &state;
    optmask |= ARES_OPT_SOCK_STATE_CB;

    ares_channel_t *channel;
    ares_init_options(&channel, &options, optmask);

    ares_fd_events_t ares_fds[1];

    while (1) {
        printf("top of loop\n");
        for (int i = 0; i < state.nservers; i++) {
            printf("processing server %d\n", i);
            struct server *server = &state.servers[i];
            switch (server->status) {
            case 1:
                {
                    struct ares_addrinfo_hints hints;
                    memset(&hints, 0, sizeof(hints));
                    hints.ai_family = AF_UNSPEC;
                    hints.ai_flags  = ARES_AI_CANONNAME;
                    ares_getaddrinfo(channel, argv[1], NULL, &hints, addrinfo_cb, server);
                    server->status = 2;
                }
                break;
            case 2:
                printf("woke up while working\n");
                break;
            case 3:
                printf("got it, done: %s -> %s\n", server->name, server->ip);
                return 0;
            }
        }
        if (state.npfds == 0) {
            printf("confused. nothing to poll\n");
            return 1;
        }
        int res = poll(state.pfds, 4 /* state.npfds */, 2000);
        printf("poll results: %d\n", res);
        if (res > 0) {
            ares_fd_events_t events[4];
            int nevents = 0;
            for (int i = 0; i < 4 /* state.npfds */; i++) {
                if (!state.pfds[i].revents)
                    continue;
                events[nevents].fd = state.pfds[i].fd;
                events[nevents].events = 0;
                if (state.pfds[i].revents & (POLLERR|POLLHUP|POLLIN))
                    events[nevents].events |= ARES_FD_EVENT_READ;
                if (state.pfds[i].revents & (POLLOUT))
                    events[nevents].events |= ARES_FD_EVENT_WRITE;
                nevents++;
            }
            ares_process_fds(channel, events, nevents, 0);
        }
    }
}

It’s okay, but the callbacks are annoying. Notifying me which descriptors need watching means I’m required to pack up my poll structure so I can access it in the callbacks, etc. Everything gets bound just a little bit tighter.

wadns

Among the alternatives the c-ares project helpfully lists is dns.c . This sounds enticing.

On the downside, it’s not clear where the demo code stops and the functional code begins. As in, there’s a getaddrinfo sample, but it incorporates a lot of other code that doesn’t seem to be public. The public header doesn’t actually expose a means to interface with an event loop. The code is meant to be integrated into a project, which is understandable and even advantageous, but it means no demo today.

asr

The asr code was written for smtpd in OpenBSD. It doesn’t use threads and requires the caller to push events. Unfortunately, a portable version currently only exists in the OpenSMTPD repo. On the plus side, it’s used as the basis for the libc resolver in OpenBSD, which means the “sample” code to replace getaddrinfo literally is getaddrinfo.c.

I rewrote the c-ares demo to use asr. It comes out quite a bit shorter, and I think clearer as well.

asr code
#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <poll.h>
#include <netdb.h>
#include <asr.h>
#include <arpa/inet.h>

struct server {
    char name[32];
    char ip[16];
    int status;
    struct asr_query *aq;
    int ar_fd;
};

int
main(int argc, char **argv)
{
    struct server servers[1] = {};
    strlcpy(servers[0].name, argv[1], sizeof(servers[0].name));
    servers[0].status = 1;
    int nservers = 1;

    while (1) {
        struct pollfd pfds[4];
        int npfds = 0;
        printf("top of loop\n");
        for (int i = 0; i < nservers; i++) {
            printf("processing server %d\n", i);
            struct server *server = &servers[i];
            switch (server->status) {
            case 1:
                {
                    struct addrinfo hints;
                    memset(&hints, 0, sizeof(hints));
                    hints.ai_family = AF_UNSPEC;
                    hints.ai_socktype = SOCK_STREAM;
                    server->aq = getaddrinfo_async(server->name, "80", &hints, NULL);
                    server->status = 2;
                }
                // fallthrough
            case 2:
                {
                    printf("ready to run\n");
                    struct asr_result ar;
                    int rv = asr_run(server->aq, &ar);
                    switch (rv) {
                    case 0:
                        pfds[npfds].fd = ar.ar_fd;
                        pfds[npfds].events = 0;
                        if (ar.ar_cond == ASR_WANT_READ)
                            pfds[npfds].events = POLLIN;
                        else
                            pfds[npfds].events = POLLOUT;
                        npfds++;
                        server->ar_fd = ar.ar_fd;
                        server->status = 3;
                        break;
                    case 1:
                        {
                            struct addrinfo *res;
                            for (res = ar.ar_addrinfo; res; res = res->ai_next) {
                                if (res->ai_family == AF_INET) {
                                    struct sockaddr_in *in_addr = (void *)res->ai_addr;
                                    inet_ntop(res->ai_family, &in_addr->sin_addr, server->ip, sizeof(server->ip));
                                }
                            }
                            server->status = 4;
                        }
                        break;
                    }
                }
                break;
            case 3:
                printf("woke up while working\n");
                break;
            case 4:
                printf("got it, done: %s -> %s\n", server->name, server->ip);
                return 0;
            }
        }
        if (npfds == 0)
            continue;
        int res = poll(pfds, npfds, 2000);
        printf("poll results: %d\n", res);
        if (res > 0) {
            for (int i = 0; i < npfds; i++) {
                if (!pfds[i].revents)
                    continue;
                for (int j = 0; j < nservers; j++) {
                    if (pfds[i].fd == servers[j].ar_fd)
                        servers[j].status = 2;
                }
            }
        }
    }
}

I like this API. It’s very much like read or write in that it either gives you an answer, or tells you to come back later, and then it’s up to you to decide when that is.

Posted 25 Sep 2025 18:33 by tedu Updated: 25 Sep 2025 18:33
Tagged: c programming

EFF and 12 Organizations Urge UK Politicians to Drop Digital ID Scheme Ahead of Parliamentary Petition Debate

Electronic Frontier Foundation
www.eff.org
2025-12-12 16:48:47
The UK Parliament convened earlier this week to debate a petition signed by almost 2.9 million people calling for an end to the government’s plans to roll out a national digital ID. Ahead of that debate, EFF and 12 other civil society organizations wrote to politicians in the country urging MPs to r...
Original Article

The UK Parliament convened earlier this week to debate a petition signed by almost 2.9 million people calling for an end to the government’s plans to roll out a national digital ID. Ahead of that debate, EFF and 12 other civil society organizations wrote to politicians in the country urging MPs to reject the Labour government’s newly announced digital ID proposal.

The UK’s Prime Minister Keir Starmer pitched the scheme as a way to “cut the faff ” in proving people’s identities by creating a virtual ID on personal devices with information like names, date of birth, nationality, photo, and residency status to verify their right to live and work in the country.

But the case for digital identification has not been made.

As we detail in our joint briefing, the proposal follows a troubling global trend : governments introducing expansive digital identity systems that are structurally incompatible with a rights-respecting democracy. The UK’s plan raises six interconnected concerns:

  1. Mission creep
  2. Infringements on privacy rights
  3. Serious security risks
  4. Reliance on inaccurate and unproven technologies
  5. Discrimination and exclusion
  6. The deepening of entrenched power imbalances between the state and the public.

Digital ID schemes don’t simply verify who you are—they redefine who can access services and what those services look like. They become a gatekeeper to essential societal infrastructure, enabling governments and state agencies to close doors as easily as they open them. And they disproportionately harm those already at society’s margins, including people seeking asylum and undocumented communities, who already face heightened surveillance and risk.

Even the strongest recommended safeguards cannot resolve the core problem : a mandatory digital ID scheme that shifts power dramatically away from individuals and toward the state. No one should be coerced—technically or socially—into a digital system in order to participate fully in public life. And at a time when almost 3 million people in the UK have called on politicians to reject this proposal, the government must listen to people and say no to digital ID.

Read our civil society briefing in full here .


Senator endorses discredited book that claims chemical treats autism, cancer

Hacker News
www.propublica.org
2025-12-12 16:37:04
Comments...
Original Article

For years, Sen. Ron Johnson has been spreading conspiracy theories and misinformation about COVID-19 and the safety of vaccines.

He’s promoted disproven treatments for COVID-19 and claimed, without evidence, that athletes are “dropping dead on the field” after getting the COVID-19 vaccination. Now the Wisconsin politician is endorsing a book by a discredited doctor promoting an unproven and dangerous treatment for autism and a host of ailments: chlorine dioxide, a chemical used for disinfecting and bleaching.

The book is “The War on Chlorine Dioxide: The Medicine that Could End Medicine” by Dr. Pierre Kory, a critical care specialist who practiced in Wisconsin hospitals before losing his medical certification for statements advocating using an antiparasite medication to treat COVID-19. The action, he’s said, makes him unemployable , even though he still has a license.

Kory has said there’s a globally coordinated campaign by public health agencies, the drug industry and the media to suppress evidence of the medicinal wonders of chlorine dioxide. His book, according to its website, contends that the “remarkable molecule” works “to treat everything from cancer and malaria to autism and COVID.”

The book jacket features a prominent blurb from Johnson calling the doctor’s treatise: “A gripping tale of corruption and courage that will open eyes and prompt serious questions.”

Chlorine dioxide is a chemical compound that has a range of applications, including as a disinfectant and deodorizer. Food processing plants apply it to sanitize surfaces and equipment. Hospitals use it to sterilize medical devices, and some municipalities use low levels to treat public water supplies. Paper mills rely on it to whiten wood pulp. Safety experts advise those who handle it to work in well-ventilated spaces and to wear protective gloves.

Concentrations in drinking water systems higher than 0.8 milligrams per liter can be harmful, especially to infants, young children and fetuses, according to the Environmental Protection Agency.

Still, for many years people in online discussion groups have been promoting the use of chlorine dioxide in a mixture that they call a “miracle mineral solution,” ingested to rid people of a host of maladies. The Food and Drug Administration has warned that drinking these chlorine dioxide mixtures can cause injury and even death .

It is not medicinal, despite Kory’s contention. “It is all lunacy. Absolutely, it’s 100% nonsense,” said Joe Schwarcz, director of McGill University’s Office for Science and Society in Montreal and an expert on the threat of pseudoscience. Schwarcz has written articles about the so-called miracle mineral solution, calling it “a poison” when it’s in high concentrations.

The cover of the paperback version of “The War on Chlorine Dioxide” features a quote from Sen. Ron Johnson. (Bella Luna Press)

Kory’s book, set to be released to the public in January, argues that word of chlorine dioxide’s effectiveness has been suppressed by government and medical forces that need people to remain perpetually ill to generate large profits. The use of the word “war” in the title is fitting, Kory said in a recent online video on his co-author’s Substack. “In the book I detail many, many assassination attempts of doctors who try to bring out knowledge around chlorine dioxide,” he said.

Johnson confirmed to ProPublica in an email that he authorized the statement on the cover. “After reading the entire book, yes I provided and approved that blurb,” he said. “Have you read the book?”

ProPublica asked Kory and his co-author, Jenna McCarthy, to provide an advance copy, an interview and responses to written questions. Kory did not respond. McCarthy wrote in an email to ProPublica that she was addressing some of the questions on her Substack. (She did not send a book or agree to an interview.)

The book “is a comprehensive examination of the existing evidence and a plea for open-minded inquiry and rigorous research,” she wrote on Substack. She dismissed warnings about chlorine dioxide’s toxicity in high concentrations, writing: “Everything has a toxic dose — including nutmeg, spinach, and tap water.”

She said that chlorine dioxide is being studied in controlled settings by researchers in the United States and Latin America and that “the real debate is how it should be used, at what dose, and in which clinical contexts.”

Her Substack post was signed “Jenna (& Pierre).”

Johnson did not agree to an interview and did not answer questions emailed to his office by ProPublica, including whether he views chlorine dioxide as a world-changing medical treatment and whether he believes the FDA warnings are false.

“It’s Called Snake Oil”

Johnson has been an advocate of Kory’s for years, calling the doctor as an expert witness in two 2020 Senate hearings. In one, Kory championed taking the drug ivermectin, an antiparasite medicine, to treat COVID-19.

In 2021, an analysis of data from clinical trials concluded that ivermectin could reduce deaths from COVID-19 and may produce other positive effects. McCarthy cited that analysis in her Substack response.

In 2022, however, the American Journal of Therapeutics, which had published the study, warned that suspicious data “appears to invalidate the findings” regarding ivermectin’s potential to decrease deaths.

Later clinical trials have found no beneficial effect of ivermectin for COVID-19, and the FDA has warned that taking large doses can be dangerous. The drug’s manufacturer has said it hadn’t found any scientific basis for the idea that ivermectin can effectively treat COVID-19. Kory, though, continued advocating for ivermectin.

In 2024 the American Board of Internal Medicine, which credentials physicians in certain specialties, revoked Kory’s certifications in internal medicine, pulmonary disease and critical care for making false and misleading public statements about the ability of ivermectin to treat COVID-19. Hospitals and many insurance networks typically require doctors to be board certified.

Kory vigorously fought the disciplinary action, arguing to the ABIM that he provided substantial medical and scientific evidence to support his recommendations for addressing COVID-19, though not the “consensus-driven” approach. He also sued the board in federal court, citing his free speech rights in a case that is still progressing in the 5th U.S. Circuit Court of Appeals. On Substack, McCarthy excoriated the ABIM, saying it “bullies physicians” and “enforces ideological conformity.”

In 2022, Johnson and Kory penned a Fox News op-ed opposing a California bill that would strip doctors’ licenses for espousing misinformation about COVID-19. The bill became law but was repealed after a court fight. A federal judge found the statute’s definition of misinformation to be too vague , which could infringe on doctors’ right to free speech.

Johnson, who has been in Congress since 2011, has a history of advocating for experimental treatments and viewing the government as an impediment. Dr. Peter Lurie, president and executive director of the Center for Science in the Public Interest, a public health advocacy group, said that among members of Congress, Johnson was “an early adopter of anti-science ideas.”

Lurie said that Johnson is no longer an outlier in Washington, which now has many more elected lawmakers whom he considers anti-science. “What may have started off as the cutting edge of an anti-science movement has now turned into a much more broader-based movement that is supported by millions of people,” he said.

Earlier this year, Johnson held a hearing highlighting a flawed study claiming that vaccinated children had an increased rate of serious chronic diseases when compared to children who were not vaccinated. The conclusion questions the scientific consensus that vaccines are safe. The study’s researchers chose not to publish it because of problems they found in their data and methodology.

In November, Johnson and Kory were listed among the speakers at a conference of the Children’s Health Defense, a nonprofit that stirs anti-vaccine sentiment. It was launched in 2018 by Health and Human Services Secretary Robert F. Kennedy Jr., whose FDA is considering new ways to more closely scrutinize vaccine safety.

HHS did not respond to requests from ProPublica about Kennedy’s views on chlorine dioxide. At his confirmation hearing, Kennedy praised President Donald Trump for his wide search for a COVID-19 remedy in his first term, which Kennedy said included vaccines, various drugs, “even chlorine dioxide.”

Kory’s publisher is listed as Bella Luna Press, which has issued at least two other titles by McCarthy. “Thanks to the Censorship Industrial Complex, you won’t find The War on Chlorine Dioxide on Amazon or at Barnes & Noble. We had to design and build this website, figure out formatting and printing and shipping, and manage every aspect of order processing ourselves,” the book’s website states. (A representative for Bella Luna could not be reached for comment.)

As this new book is released, the autism community is also grappling with another controversy: the unsubstantiated assertion by Kennedy that Tylenol use by pregnant women poses an increased risk of autism. In addition, under Kennedy, the Centers for Disease Control and Prevention revised its website in November to cast doubt on the long-held scientific conclusion that childhood vaccines do not cause autism.

Some parents of children with autism, desperate for a remedy, have long reached for dubious and at times dangerous panaceas, including hyperbaric oxygen chambers and chelation therapy, used for the treatment of heavy metal poisoning. Neither method has been proven effective.

Helen Tager-Flusberg, director of the Center for Autism Research Excellence at Boston University, said Johnson has “acted extremely irresponsibly” in lending his name to a book making claims about chlorine dioxide treating autism.

“Wisconsin is filled with experts — clinical experts, medical experts, scientists — who understand and have studied autism and treatments for autism for many many years,” she said. “He’s chosen to completely ignore the clinical and the scientific community.”

People with autism may take medication to reduce anxiety, address attention problems, or reduce severe irritability. Many benefit from behavioral interventions and special education services to help with learning and functional abilities. But there is no cure, said Tager-Flusberg.

Referring to chlorine dioxide, she said: “We have had examples of this probably throughout the history of medicine. There’s a word for this, it’s called snake oil.”

In her response on Substack to ProPublica, McCarthy wrote that “chlorine dioxide is being used to treat (nobody said ‘cure’) autism with life-changing results.”

The Search for Miracle Cures

The mother of an autistic son, Melissa Eaton of North Carolina , heard Kory reference his book in early November on The HighWire, an internet talk show hosted by Del Bigtree, a prominent vaccine skeptic and former communications director for Kennedy’s 2024 presidential campaign. She then looked up the book online and noticed Johnson’s endorsement.

Eaton for many years has worked to expose people who peddle chlorine dioxide and to report apparent injuries to authorities. She monitors social media forums where parents discuss giving it to their children orally or via enemas. Sometimes the families reveal that their children are sick. “They’re throwing up and vomiting and having diarrhea and rashes,” Eaton said.

Some adherents advise parents that the disturbing effects indicate that the treatment is working, ridding the body of impurities, or that the parents should alter the dosage.

“Most of these kids are nonverbal,” Eaton said. “They’re not able to say what’s hurting them or what’s happening to them. The parents feel they’re doing the right thing. That’s how they view this: They’re helping to cure autism.”

The idea that chlorine dioxide can be a miracle cure began to spread about 20 years ago when a gold prospector, Jim Humble, wrote a book claiming his team in Guyana fell ill with malaria and recovered after drinking safe amounts of chlorine dioxide.

Humble later co-founded a “health and healing” church in Florida with a man named Mark Grenon, who called himself an archbishop and sold a chlorine dioxide solution as a cure for COVID-19. They described it as a “miracle mineral solution,” or MMS.

Grenon went to prison in 2023 for conspiring to defraud the United States by distributing an unapproved and misbranded drug. The scheme took in more than $1 million, according to prosecutors.

An affidavit in the case filed by a special agent with the FDA Office of Criminal Investigations noted: “FDA has received numerous reports of adverse reactions to MMS. These adverse reactions include hospitalizations, life-threatening conditions, and death.”

Grenon, who is now out of prison, told ProPublica that he too is writing a book about chlorine dioxide. “My book will tell the truth.” He declined further comment.

Chlorine dioxide is currently used in many ways that are not harmful. It is found in some consumer products like mouthwashes, but it is not meant to be swallowed in those instances. (One popular mouthwash warns to “keep out of reach of children.”) It’s also available to consumers in do-it-yourself packages where they combine drops from two bottles of different compounds — commonly sodium chlorite and hydrochloric acid — and add it to water. Hikers often carry the drops, or tablets, using small amounts to make quarts of fresh water potable.

But numerous online shoppers post product reviews that go further, referring to it as a tonic. Various online guides, some aimed at parents of autistic children, recommend a shot-glass-size dose, sometimes given multiple times a day and even hourly. That can far exceed the threshold the EPA considers safe.

McCarthy, addressing ProPublica on Substack, wrote: “You point to various online guides that offer what could be considered dangerous dosing instructions. We agree, the internet is a terrifying wasteland of misinformation and disinformation.”

In the Substack video, Kory said he felt compelled to spread the word about chlorine dioxide much as he did about ivermectin, even though it cost him professionally.

He no longer has a valid medical license in Wisconsin or California, where he did not renew them, according to the Substack post. His medical licenses in New York and Michigan are active.

“I like to say I was excommunicated from the church of the medical establishment,” he said in the Substack video. As a result, he said, he turned to telehealth and started a practice.

In the Nov. 6 HighWire episode hosted by Bigtree, the discussion included talk not just of chlorine dioxide’s medicinal potential but also of how cheap and easy it is to obtain.

“On Amazon, it’s literally, you get two bottles, well, it comes in two,” Kory started to explain, before stopping that train of thought.

“I wouldn’t know how to make it,” he said.

Secondary school maths showing that AI systems don't think

Hacker News
www.raspberrypi.org
2025-12-12 16:32:43
Comments...
Original Article

At a time when many young people are using AI for personal and learning purposes, schools are trying to figure out what to teach about AI and how (find out more in this summer 2025 data about young people’s usage of AI in the UK ). One aspect of this is how technical we should get in explaining how AI works, particularly if we want to debunk naive views of the capabilities of the technology, such as that AI tools ‘think’. In this month’s research seminar, we found out how AI contexts can be added to current classroom maths to make maths more interesting and relevant while teaching the core concepts of AI.

At our computing education research seminar in July, a group of researchers from the CAMMP (Computational and Mathematical Modeling Program) research project shared their work:

  • Prof. Dr. Martin Frank , Founder of CAMMP (Karlsruhe Institute of Technology (KIT), Germany).
  • Assistant Prof. Dr. Sarah Schönbrodt (University of Salzburg, Austria)
  • Research Associate Stephan Kindler (Karlsruhe Institute of Technology (KIT), Germany)

They talked about how maths already taught in secondary schools can be used to demystify AI. At first glance, this seems difficult to do, as it is often assumed that school-aged learners will not be able to understand how these systems work. This is especially the case for artificial neural networks, which are usually seen as a black box technology — they may be relatively easy to use, but it’s not as easy to understand how they work. Despite this, the Austrian and German team have developed a clear way to explain some of the fundamental elements of AI using school-based maths.

Sarah Schönbrodt started by challenging us to consider that learning maths is an essential part in developing AI skills, as:

  1. AI systems using machine learning are data-driven and are based on mathematics, especially statistics and data
  2. Authentic machine learning techniques can be used to bring to life existing classroom maths concepts
  3. Real and relevant problems and associated data are available for teachers to use

A set of workshops for secondary maths classrooms

Sarah explained how the CAMMP team have developed a range of teaching and learning materials on AI (and beyond) with an overall goal to “allow students to solve authentic, real and relevant problems using mathematical modeling and computers”.

She reflected that much of school maths is set in contexts that are abstract, and may not be very interesting or relevant to students. Therefore, introducing AI-based contexts, which are having a huge impact on society and students’ lives, is both an opportunity to make maths more engaging and also a way to demystify AI.

Old-fashioned contexts are often used to teach classroom maths concepts. Those same concepts could be taught using real-world AI contexts. (Slide from the researchers’ presentation.)

Workshops designed and researched by the team include contexts such as privacy in social networks to learn about decision trees, personalised Netflix recommendations to learn about k-nearest neighbour, word predictions to learn about N-Grams, and predicting life expectancy to learn about regression and neural networks.

Learning about classification models: traffic lights and the support vector machine

For the seminar, Sarah walked through the steps to learn about support vector machines. This is an upper secondary workshop for students aged 17 to 18 years old. The context of the lesson is an image problem — specifically, classifying the data representing the colours of a simplified traffic light system (two lights to start with) to work out if a traffic light is red or green.

She walked through each of the steps of the maths workshop:

  • Plotting data points of two classes, the representation of green and red traffic lights
  • Finding a line that best separates the data points of both classes
  • Figuring out what best is
  • Classifying the data points in relation to the chosen (separating) line
  • Validating the model statistically to see if it is useful in classifying new data points, including using test data and creating a contingency table (also called a confusion matrix)
  • Discussing limitations, including social and ethical issues
  • Explaining how three traffic lights can be expressed as three-dimensional data by using planes
By classifying green and red traffic light data, students are learning about lines, classifying data, and considering limitations. (Slide from the researchers’ presentation.)
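
As a minimal sketch of the same idea in Python (not the workshop's own code): represent each light as a two-dimensional point, pick a separating line, classify points by which side of the line they fall on, and tally a small contingency table.

# Toy linear classifier for the two-light traffic example (illustrative only).
# Each point is (brightness of red lamp, brightness of green lamp).
points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.7)]
labels = ["red", "red", "green", "green"]

# A hand-picked separating line: w . x + b = 0
w = (1.0, -1.0)   # weights
b = 0.0           # bias

def classify(x):
    score = w[0] * x[0] + w[1] * x[1] + b
    return "red" if score > 0 else "green"

# Contingency table (confusion matrix) on the example points.
table = {}
for x, true_label in zip(points, labels):
    predicted = classify(x)
    table[(true_label, predicted)] = table.get((true_label, predicted), 0) + 1

print(table)  # e.g. {('red', 'red'): 2, ('green', 'green'): 2}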

Throughout the presentation, Sarah pointed out where the maths taught was linked to the Austrian and German mathematics curriculum.

Learning about planes, separating planes, and starting to see how data can be represented in vectors. (Slide from the researchers’ presentation.)

Learning about social and ethical issues

Learning about the social and ethical issues in data-driven systems. (Slide from the researchers’ presentation.)

As well as learning about lines, planes, distances, dot product and statistical measures, learners are also engaged in discussing the social and ethical issues of the approach taken. They are encouraged to think about bias, data diversity, privacy, and the impact of errors on people. For example, if the model wrongly predicts a light as green when it is red, then an autonomous car would run through a red traffic light. This would likely be a bigger consequence than stopping at a green traffic light that was mis-predicted as red. So should the best line reduce this kind of error?

To teach the workshops, Sarah explained they have developed interactive Jupyter notebooks, where no programming skills are needed. Students fill in the gaps of example code, explore simulations, and write their ideas for discussion for the whole class. No software needs to be installed, feedback is direct, and there are in-depth tasks and staggered hints.

Learning about regression models: Weather forecasting and the toy artificial neural network

Stephan went on to introduce artificial neural networks (ANNs), which are the basis of generative AI applications like chatbots and image generation systems. He focused on regression models, such as those used in weather forecasting.

ANNs are very complex. Therefore, to start to understand the fundamentals of this technology, he introduced a ‘toy ANN’ with one input, three nodes, and one output. A function is performed on the input data at each node. With the toy network, the team wants to tackle a major and common misconception: that students think that ANN systems learn, recognise, see, and understand, when really it’s all just maths.

Tackling misconceptions about ANNs by exploring how they work in a toy version. (Slide from the researchers’ presentation.)

The learning activity starts by looking at one node with one input and one output, and can be described as a mathematical function, with a concatenation of two functions (in this case a linear and activation function). Stephan shared an online simulator that visualises how the toy neural network can be explored as students change two parameters (in this case, weight and bias of the functions). Students then look at the overall network, and the way that the output from the three nodes is combined. Again, they can explore this in the simulator. Students compare simple data about weather prediction to the model, and discover they need more functions — more nodes to better fit the data. The activity helps students learn that ANN systems are just highly adjustable mathematical functions that, by adding nodes, can approximate relationships in a given data set. But the approximation only works in the bounds (intervals) in which data points are given, showing that ANNs do not ‘understand’ or ’know’ — it’s just maths.
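
A rough Python sketch of such a toy network (illustrative only, not the team's simulator): one input, three nodes that each apply a linear function followed by an activation, and a weighted combination as the output. Changing the weights and biases only changes the shape of the resulting function.

import math

# Toy ANN: 1 input -> 3 nodes -> 1 output (illustrative parameters only).
def node(x, weight, bias):
    # Linear function followed by an activation (here a sigmoid).
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

def toy_ann(x, params):
    # params: a (weight, bias) pair per node, plus an output weight for each node.
    total = params["output_bias"]
    for (w, b), v in zip(params["nodes"], params["output_weights"]):
        total += v * node(x, w, b)
    return total

params = {
    "nodes": [(1.5, -1.0), (-2.0, 0.5), (0.8, 0.0)],
    "output_weights": [2.0, -1.0, 0.5],
    "output_bias": 0.1,
}

# The "prediction" is nothing more than evaluating this function.
for x in [0.0, 0.5, 1.0]:
    print(x, toy_ann(x, params))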

Stephan finished by explaining the mutual benefits of AI education and maths education. He suggested maths will enable a deeper understanding of AI, give students a way to realistically assess the opportunities and risks of AI tools, and show them the role that humans have in designing AI systems. He also explained that classroom maths education can benefit from incorporating AI contexts. This approach highlights how maths underpins the design and understanding of everyday systems, supports more effective teaching, and promotes an interdisciplinary way of learning across subjects.

Some personal reflections — which may not be quite right!

I have been researching the teaching of AI and machine learning for around five years now, since before ChatGPT and other similar tools burst on the scene. Since then, I have seen an increasing number of resources to teach about the social and ethical issues of the topic, and there are a bewildering number of learning activities and tools for students to train simple models. There are frameworks for the data lifecycle, and an emerging set of activities to follow to prepare data, compare model types, and deploy simple applications. However, I felt the need to understand and to teach about, at a very simple level, the basic building blocks of data-driven technologies. When I heard the CAMMP team present their work at the AIDEA conference in February 2025 , I was entirely amazed and I asked them to present here at our research seminar series. This was a piece of the puzzle that I had been searching for — a way to explain the ‘bottom of the technical stack of fundamental concepts’ . The team is taking very complex ideas and reducing them to such an extent that we can use secondary classroom maths to show that AI is not magic and AI systems do not think. It’s just maths. The maths is still hard, and teachers will still need the skills to carefully guide students step by step so they can build a useful mental model.

Photo of a class of students at computers, in a computer science classroom.

I think we can simplify these ideas further, and create unplugged activities, simulations, and ways for students to explore these basic building blocks of data representation, as well as classification and representing approximations of complex patterns and prediction. I can sense the beginnings of new ideas in computational thinking, though they’re still taking shape. We’re researching these further and will keep you updated.

Finding out more

If you would like to find out more about the CAMMP resources, you can watch the seminar recording, look at the CAMMP website, or try out their online materials. For example, the team shared a link to the Jupyter notebooks they use to teach the workshops they demonstrated (and others). You can use these with a username of ‘cammp_YOURPSEUDONYM’, where you can set ‘YOURPSEUDONYM’ to any letters, and you can choose any password. They also shared their toy ANN simulation.
The CAMMP team are not the only researchers investigating how to teach about AI in maths lessons. You can find a set of other research papers here.

Join our next seminar

In our current seminar series, we’re exploring teaching about AI and data science. Join us at our last seminar of the series on Tuesday, 27 January 2026 from 17:00 to 18:30 GMT to hear Salomey Afua Addo talk about using unplugged approaches to teach about neural networks.

To sign up and take part, click the button below. We’ll then send you information about joining. We hope to see you there.

The schedule of our upcoming seminars is online. You can catch up on past seminars on our previous seminars page.

[$] Best practices for linux-next

Linux Weekly News
lwn.net
2025-12-12 16:27:38
One of the key components in the kernel's development process is the linux-next repository. Every day, a large number of branches, each containing commits intended for the next kernel development cycle, is pulled into linux-next and integrated. If there are conflicts between branches, the linux-ne...
Original Article

The page you have tried to view (Best practices for linux-next) is currently available to LWN subscribers only.

Reader subscriptions are a necessary way to fund the continued existence of LWN and the quality of its content.

Please consider subscribing to LWN. An LWN subscription provides numerous benefits, including access to restricted content and the warm feeling of knowing that you are helping to keep LWN alive.

(Alternatively, this item will become freely available on December 25, 2025)

String Theory Inspires a Brilliant, Baffling New Math Proof

Hacker News
www.quantamagazine.org
2025-12-12 16:23:10
Comments...
Original Article

Years ago, an audacious Fields medalist outlined a sweeping program that, he claimed, could be used to resolve a major problem in algebraic geometry. Other mathematicians had their doubts. Now he says he has a proof.

Introduction

In August, a team of mathematicians posted a paper claiming to solve a major problem in algebraic geometry — using entirely alien techniques. It instantly captivated the field, stoking excitement in some mathematicians and skepticism in others.

The result deals with polynomial equations, which combine variables raised to powers (like y = x or x² − 3xy = z²). These equations are some of the simplest and most ubiquitous in mathematics, and today, they’re fundamental to lots of different areas of study. As a result, mathematicians want to study their solutions, which can be represented as geometric shapes like curves, surfaces and higher-dimensional objects called manifolds.

There are infinitely many types of polynomial equations that mathematicians want to tame. But they all fall into one of two basic categories — equations whose solutions can be computed by following a simple recipe, and equations whose solutions have a richer, more complicated structure. The second category is where the mathematical juice is: It’s where mathematicians want to focus their attention to make major advances.

But after sorting just a few types of polynomials into the “easy” and “hard” piles, mathematicians got stuck. For the past half-century, even relatively simple-looking polynomials have resisted classification.

Then this summer, the new proof appeared. It claimed to end the stalemate, offering up a tantalizing vision for how to classify lots of other types of polynomials that have until now seemed completely out of reach.

The problem is that no one in the world of algebraic geometry understands it. At least, not yet. The proof relies on ideas imported from the world of string theory. Its techniques are wholly unfamiliar to the mathematicians who have dedicated their careers to classifying polynomials.

Some researchers trust the reputation of one of the paper’s authors, a Fields medalist named Maxim Kontsevich. But Kontsevich also has a penchant for making audacious claims, giving others pause. Reading groups have sprung up in math departments across the world to decipher the groundbreaking result and relieve the tension.

This review may take years. But it’s also revived hope for an area of study that had stalled. And it marks an early victory for a broader mathematical program that Kontsevich has championed for decades — one that he hopes will build bridges between algebra, geometry and physics.

“The general perception,” said Paolo Stellari , a mathematician at the University of Milan who was not involved in the work, “is that we might be looking at a piece of the mathematics of the future.”

The Rational Approach

The effort to classify all polynomials deals with the oldest kind of math: solving equations. To solve the simple polynomial y = 2x, for instance, you just need to find values of x and y that satisfy the equation. There are infinitely many solutions to this equation, such as x = 1, y = 2. When you graph all the solutions in the coordinate plane, you get a line.

Other polynomials are harder to solve directly, and their solutions cut out more complicated, higher-dimensional shapes in space.

But for some of these equations, it turns out, there’s a really simple way to find every possible solution. Instead of separately plugging different numbers into each variable, you can get all the solutions at once by rewriting the variables in terms of a new variable, t .

Consider the polynomial x² + y² = 1, which defines a circle. Now set x equal to 2t/(1 + t²), and y equal to (1 − t²)/(1 + t²). When you plug these new formulas back into your original equation, you get 1 = 1, a statement that’s always true, no matter what t is. This means that by choosing any real-number value for t, you’ll instantly get a solution to the original polynomial. For instance, when you set t equal to 1, you get x = 2(1)/(1 + 1²) = 1, and y = 0. And indeed, x = 1, y = 0 is a solution to the original equation: 1² + 0² = 1.
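
To see why the substitution always collapses to 1 = 1, plug the two formulas into x² + y² and simplify:

\left(\frac{2t}{1+t^2}\right)^2 + \left(\frac{1-t^2}{1+t^2}\right)^2 = \frac{4t^2 + (1-t^2)^2}{(1+t^2)^2} = \frac{t^4 + 2t^2 + 1}{(1+t^2)^2} = \frac{(1+t^2)^2}{(1+t^2)^2} = 1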

This straightforward way of framing all your solutions is called a rational parameterization. It’s equivalent to mapping every point on the graph of your original polynomial — in this case, a circle — to a unique point on a straight line.

Any degree-1 polynomial equation — that is, any polynomial whose terms are raised to a power of at most 1 — can be parameterized like this. It doesn’t matter how many variables the equation has: It might have two variables, or 200. Once you go beyond two variables, the solutions to your polynomial equation will form complicated higher-dimensional shapes. But because the polynomial can still be parameterized, there’s a way to map every point in your high-dimensional shape to points on a particularly simple space in the same number of dimensions (like the line). This, in turn, gives you a straightforward way to compute all the polynomial’s solutions.

Similarly, any degree-2 polynomial (whose terms are raised to a power of at most 2) has a rational parameterization.

But if an equation’s degree is 3 or more, it can’t always be parameterized. It depends on how many variables the equation has.

Take a typical kind of degree-3 polynomial: elliptic curves, like y² = x³ + 1, which have only two variables. “Elliptic curves are glorious, they’re wonderful, but you can’t possibly parameterize them,” said Brendan Hassett of Brown University. There’s no simple formula for x and y that gives you all of an elliptic curve’s solutions, so there’s no way to map the curve to a straight line. “If you could, they would not be so much fun,” Hassett said.

Instead, the solutions to an elliptic curve have a far richer structure — one that’s played a vital role in number theory for centuries, and that cryptographers have taken advantage of to encode secret messages.

What about degree-3 equations with more variables, then? Are they parameterizable, or is the structure of their solutions more fun, the way it is for elliptic curves?

In 1866, the German mathematician Alfred Clebsch showed that degree-3 equations with three variables — whose solutions form two-dimensional surfaces — are usually parameterizable. More than a century later, Herbert Clemens and Phillip Griffiths published a monumental proof in which they showed that the opposite is true for most degree-3 equations with four variables. These equations, which form three-dimensional manifolds called three-folds, are not parameterizable: Their solutions can’t be mapped to a simple 3D space.

Many mathematicians suspected that the next polynomial to be classified — degree-3 equations with five variables (forming four-dimensional manifolds known as four-folds) — wouldn’t usually be parameterizable either. In fact, they figured that polynomials should never be parameterizable past a certain point. But Clemens and Griffiths’ techniques didn’t work for four-folds.

And so for decades, the classification effort lay dormant.

Converting a Prophet

Mathematicians were surprised when, at a conference in Moscow in the summer of 2019, Maxim Kontsevich got up to speak about classifying four-folds.

For one thing, Kontsevich is known for taking a high-level approach to mathematics, preferring to pose ambitious conjectures and sketch out broad programs, often leaving the subtler details and formal proof-writing to others. He’s described himself as something between a prophet and a daydreamer.

For the past three decades, he’s been focused on developing a program called homological mirror symmetry, which has its roots in string theory. In the 1980s, string theorists wanted to count the number of curves on high-dimensional manifolds to answer questions about how the building blocks of the universe might behave. To count the curves on a given manifold, they considered its “mirror image” — another manifold that, though very different from the original, had related properties. In particular, they found that an algebraic object associated to the mirror image, called a Hodge structure, could reveal the number of curves on the original manifold. The reverse was also true: If you count the curves on the mirror image, you’ll get information about the original manifold’s Hodge structure.

In 1994, Kontsevich sketched out a program to explain the underlying reason for this correspondence. His program also predicted that the correspondence extended to all kinds of manifolds beyond those relevant to string theory.

For now, no one knows how to prove Kontsevich’s mirror symmetry program. “It will be next-century mathematics,” he said. But over the years, he’s made partial progress toward a proof — while also exploring the program’s potential consequences.

In 2002, one of Kontsevich’s friends, Ludmil Katzarkov of the University of Miami, hypothesized one such consequence: that the program might be relevant to the classification of polynomial equations.

Katzarkov was familiar with Clemens and Griffiths’ 1972 proof that three-folds aren’t parameterizable. In that work, the pair looked at a given three-fold’s Hodge structure directly. They then used it to show that the three-fold couldn’t be mapped to a simple 3D space. But the Hodge structures associated with four-folds were too complicated to analyze using the same tools.

Katzarkov’s idea was to access the four-fold’s Hodge structure indirectly — by counting how many curves of a particular type lived on its mirror image. Typically, mathematicians studying the Hodge structures of four-folds don’t think about curve counts like these: They only come up in seemingly unrelated areas of math, like string theory. But if the mirror symmetry program is true, then the number of curves on the mirror image should illuminate features of the original four-fold’s Hodge structure.

In particular, Katzarkov wanted to break the mirror image’s curve count into pieces, then use the mirror symmetry program to show that there was a corresponding way to break up the four-fold’s Hodge structure. He could then work with these pieces of the Hodge structure, rather than the whole thing, to show that four-folds can’t be parameterized. If any one of the pieces couldn’t be mapped to a simple 4D space, he’d have his proof.

But this line of reasoning depended on the assumption that Kontsevich’s mirror symmetry program was true for four-folds. “It was clear that it should be true, but I didn’t have the technical ability to see how to do it,” Katzarkov said.

He knew someone who did have that ability, though: Kontsevich himself.

But his friend wasn’t interested.

Digging In

For years, Katzarkov tried to convince Kontsevich to apply his research on mirror symmetry to the classification of polynomials — to no avail. Kontsevich wanted to focus on the whole program, not this particular problem. Then in 2018, the pair, along with Tony Pantev of the University of Pennsylvania, worked on another problem that involved breaking Hodge structures and curve counts into pieces. It convinced Kontsevich to hear Katzarkov out.

Katzarkov walked him through his idea again. Immediately, Kontsevich discovered an alternative path that Katzarkov had long sought but never found: a way to draw inspiration from mirror symmetry without actually relying on it. “After you’ve spent years thinking about this, you see it happening in seconds,” Katzarkov said. “That’s a spectacular moment.”

Kontsevich argued that it should be possible to use the four-fold’s own curve counts — rather than those of its mirror image — to break up the Hodge structure. They just had to figure out how to relate the two in a way that gave them the pieces they needed. Then they’d be able to focus on each piece (or “atom,” as they called it) of the Hodge structure separately.

This was the plan Kontsevich laid out for his audience at the 2019 conference in Moscow. To some mathematicians, it sounded as though a rigorous proof was just around the corner. Mathematicians are a conservative bunch and often wait for absolute certainty to present new ideas. But Kontsevich has always been a little bolder. “He’s very open with his ideas, and very forward-thinking,” said Daniel Pomerleano, a mathematician at the University of Massachusetts, Boston, who studies mirror symmetry.

There was a major ingredient they still had no idea how to address, Kontsevich warned: a formula for how each atom would change as mathematicians tried to map the four-fold to new spaces. Only with such a formula in hand could they prove that some atom would never reach a state corresponding to a properly “simplified” four-fold. This would imply that four-folds weren’t parameterizable, and that their solutions were rich and complicated. “But people somehow got the impression that he said it was done,” Pomerleano said, and they expected a proof soon.

When that didn’t come to pass, some mathematicians began to doubt that he had a real solution. In the meantime, Tony Yue Yu, then at the French National Center for Scientific Research, joined the team. Yu’s fresh insights and meticulous style of proof, Kontsevich said, turned out to be crucial to the project.

When lockdowns began during the Covid pandemic, Yu visited Kontsevich at France’s nearby Institute for Advanced Scientific Studies. They relished the quiet of the deserted institute, spending hours in lecture halls where there were more blackboards, Yu recalled.

Meeting regularly with Pantev and Katzarkov over Zoom, they quickly completed the first part of their proof, figuring out precisely how to use the number of curves on a given four-fold to break its Hodge structure into atoms. But they struggled to find a formula to describe how the atoms could then be transformed.

What they didn’t know was that a mathematician who had attended Kontsevich’s lecture in Moscow — Hiroshi Iritani of Kyoto University — had also started pursuing such a formula. “He was enchanted by my conjecture,” Kontsevich said. “I didn’t know, but he started to work on it.”

In July 2023, Iritani proved a formula for how the atoms would change as four-folds were mapped to new spaces. It didn’t give quite as much information as Kontsevich and his colleagues needed, but over the next two years, they figured out how to hone it. They then used their new formula to show that four-folds would always have at least one atom that couldn’t be transformed to match simple 4D space. Four-folds weren’t parameterizable.

Still Processing

When the team posted their proof in August, many mathematicians were excited. It was the biggest advance in the classification project in decades, and hinted at a new way to tackle the classification of polynomial equations well beyond four-folds.

But other mathematicians weren’t so sure. Six years had passed since the lecture in Moscow. Had Kontsevich finally made good on his promise, or were there still details to fill in?

And how could they assuage their doubts, when the proof’s techniques were so completely foreign — the stuff of string theory, not polynomial classification? “They say, ‘This is black magic, what is this machinery?’” Kontsevich said.

“Suddenly they come with this completely new approach, using tools that were previously widely believed to have nothing to do with this subject,” said Shaoyun Bai of the Massachusetts Institute of Technology. “The people who know the problem don’t understand the tools.”

Bai is one of several mathematicians now trying to bridge this gap in understanding. Over the past few months, he has co-organized a “reading seminar” made up of graduate students, postdoctoral researchers and professors who hope to make sense of the new paper. Each week, a different mathematician digs into some aspect of the proof and presents it to the rest of the group.

But even now, after 11 of these 90-minute sessions, the participants still feel lost when it comes to major details of the proof. “The paper contains brilliant original ideas,” Bai said, which “require substantial time to absorb.”

Similar reading groups have been congregating in Paris, Beijing, South Korea and elsewhere. “People all over the globe are working on the same paper right now,” Stellari said. “That’s a special thing.”

Hassett likens it to Grigori Perelman’s 2003 proof of the Poincaré conjecture, which also used entirely new techniques to solve a famous problem. It was only after other mathematicians reproduced Perelman’s proof using more traditional tools that the community truly accepted it.

“There will be resistance,” Katzarkov said, “but we did the work, and I’m sure it’s correct.” He and Kontsevich also see it as a major win for the mirror symmetry program: While they’re not closer to proving it, the result provides further evidence that it’s true.

“I’m very old, and very tired,” Katzarkov said. “But I’m willing to develop this theory as long as I’m alive.”

KDE Gear 25.12 released

Linux Weekly News
lwn.net
2025-12-12 16:13:49
KDE has announced the release of KDE Gear 25.12. This release adds more "extractors" to the Itinerary travel-assistant application, improved Git support in the Kate text editor, better PDF export in Konqueror, and much more. See the changelog for all new features, improvements, and bug fix...
Original Article

[Posted December 12, 2025 by jzb]

KDE has announced the release of KDE Gear 25.12. This release adds more "extractors" to the Itinerary travel-assistant application, improved Git support in the Kate text editor, better PDF export in Konqueror, and much more. See the changelog for all new features, improvements, and bug fixes.



Epic celebrates "the end of the Apple Tax" after court win in iOS payments case

Hacker News
arstechnica.com
2025-12-12 16:04:16
Comments...
Original Article

Back in April, District Court Judge Yvonne Gonzalez Rogers delivered a scathing judgment finding that Apple was in “willful violation” of her 2021 injunction intended to open up iOS App Store payments. That contempt of court finding has now been almost entirely upheld by the Ninth Circuit Court of Appeals, a development that Epic Games’ Tim Sweeney tells Ars he hopes will “do a lot of good for developers and start to really change the App Store situation worldwide, I think.”

The ruling, signed by a panel of three appellate court judges, affirmed that Apple’s initial attempts to charge a 27 percent fee to iOS developers using outside payment options “had a prohibitive effect, in violation of the injunction.” Similarly, Apple’s restrictions on how those outside links had to be designed were overly broad; the appeals court suggests that Apple can only ensure that internal and external payment options are presented in a similar fashion.

The appeals court also agreed that Apple acted in “bad faith” by refusing to comply with the injunction, rejecting viable, compliant alternatives in internal discussions. And the appeals court was also not convinced by Apple’s process-focused arguments, saying the district court properly evaluated materials Apple argued were protected by attorney-client privilege.

While the district court barred Apple from charging any fees for payments made outside of its App Store, the appeals court now suggests that Apple should still be able to charge a “reasonable fee” based on its “actual costs to ensure user security and privacy.” It will be up to Apple and the district court to determine what that kind of “reasonable fee” should look like going forward.

Speaking to reporters Thursday night, though, Epic founder and CEO Tim Sweeney said he believes those should be “super super minor fees,” on the order of “tens or hundreds of dollars” every time an iOS app update goes through Apple for review. That should be more than enough to compensate the employees reviewing the apps to make sure outside payment links are not scams and lead to a system of “normal fees for normal businesses that sell normal things to normal customers,” Sweeney said.

A Code Centric Journey Into the Gleam Language

Lobsters
www.youtube.com
2025-12-12 16:00:36
Comments...

How Mayor Mamdani Could Turn NYPD Parking Spots Into Apartments

hellgate
hellgatenyc.com
2025-12-12 15:58:12
And other housing ideas for the new administration's first 100 days....
Original Article

When Mayor-elect Zohran Mamdani takes office on January 1, he will immediately feel the crushing weight of the housing crisis, including 25 years of astronomical rent inflation and 350,000 New Yorkers who don't have homes. Like all mayors before him, he will not have control over tariffs , the cost of multifamily lending, or other global factors that shape our financialized housing system. Instead, Mayor Mamdani will confront a Rube Goldberg machine of contradictory municipal laws; messy inter- and intra-agency dynamics; complex and contradictory federal, state, and local jurisdictional hierarchies; and complicated relationships that help shape New York City's housing landscape.

Mamdani's campaign astutely identified the Rent Guidelines Board as one important piece of the housing affordability puzzle that would allow him to make the lives of 996,600 households living in rent-stabilized apartments easier with a rent freeze. Interestingly, Eric Adams's efforts to shield his executive power with a charter revision commission and to generate a development frenzy for his second term with the City of Yes zoning reforms will make Mamdani's other housing promise—200,000 new, permanently affordable units over the next decade—a little bit easier to achieve. And a last-minute lawmaking push by the outgoing City Council may give the incoming administration additional tools, including an overhauled municipal foreclosure system and a legal pathway for community purchases of some multifamily buildings.

However, all housing policies and programs run into a temporal problem: the promise of an affordable apartment by 2036 is cold comfort if you are struggling to pay rent now. Even Mamdani's signature proposal to freeze stabilized rents would not go into effect until October 2026 (or October 2027, if Mayor Adams and First Deputy Mayor Randy Mastro's final "fuck you" to the city's tenants comes to pass).

Framework Raises DDR5 Memory Prices by 50% for DIY Laptops

Hacker News
www.phoronix.com
2025-12-12 15:58:10
Comments...
Original Article

HARDWARE

Framework Computer had worked to keep their memory prices lower than other laptop vendors amid the ongoing memory shortages throughout the industry worldwide. But today they've finally had to cave in and increase the prices of their DDR5 memory modules for the Framework Laptop DIY Editions by 50%.

Due to the ongoing price hikes around system memory with shortages throughout the supply chain, Framework raised the prices of their DDR5 memory options today by 50% for the Framework Laptop DIY Edition. Framework Computer is keeping the prior prices for existing pre-orders and is also foregoing any price changes for their pre-built laptops or the Framework Desktop. Framework Computer also lets you order DIY laptops without any memory at all, for those re-using existing modules or scoring a deal elsewhere.

Framework inside with DDR5 memory

Because their memory pricing is said to be more competitive than market rates, they have also adjusted their return policy to prevent scalpers from purchasing DIY Edition laptops with memory and then returning just the laptops. DDR5 modules must now be returned along with any DIY laptop order return.

More details on Framework Computer needing to begin raising system memory prices can be found via the Framework Blog.

OpenAI Releases GPT-5.2

Daring Fireball
openai.com
2025-12-12 15:53:32
OpenAI: In ChatGPT, GPT‑5.2 Instant, Thinking, and Pro will begin rolling out today, starting with paid plans. In the API, they are available now to all developers. Overall, GPT‑5.2 brings significant improvements in general intelligence, long-context understanding, agentic tool-calling, and vi...

I couldn't find a logging library that worked for my library, so I made one

Lobsters
hackers.pub
2025-12-12 15:44:31
Comments...
Original Article

When I started building Fedify , an ActivityPub server framework, I ran into a problem that surprised me: I couldn't figure out how to add logging.

Not because logging is hard—there are dozens of mature logging libraries for JavaScript. The problem was that they're primarily designed for applications , not for libraries that want to stay unobtrusive.

I wrote about this a few months ago, and the response was modest—some interest, some skepticism, and quite a bit of debate about whether the post was AI-generated. I'll be honest: English isn't my first language, so I use LLMs to polish my writing. But the ideas and technical content are mine.

Several readers wanted to see a real-world example rather than theory.

The problem: existing loggers assume you're building an app

Fedify helps developers build federated social applications using the ActivityPub protocol. If you've ever worked with federation, you know debugging can be painful. When an activity fails to deliver, you need to answer questions like:

  • Did the HTTP request actually go out?
  • Was the signature generated correctly?
  • Did the remote server reject it? Why?
  • Was there a problem parsing the response?

These questions span multiple subsystems: HTTP handling, cryptographic signatures, JSON-LD processing, queue management, and more. Without good logging, debugging turns into guesswork.

But here's the dilemma I faced as a library author: if I add verbose logging to help with debugging, I risk annoying users who don't want their console cluttered with Fedify's internal chatter. If I stay silent, users struggle to diagnose issues.

I looked at the existing options. With winston or Pino, I would have to either:

  • Configure a logger inside Fedify (imposing my choices on users), or
  • Ask users to pass a logger instance to Fedify (adding boilerplate)

There's also debug, which is designed for this use case. But it doesn't give you structured, level-based logs that ops teams expect—and it relies on environment variables, which some runtimes like Deno restrict by default for security reasons.

None of these felt right. So I built LogTape —a logging library designed from the ground up for library authors. And Fedify became its first real user.

The solution: hierarchical categories with zero default output

The key insight was simple: a library should be able to log without producing any output unless the application developer explicitly enables it.

Fedify uses LogTape's hierarchical category system to give users fine-grained control over what they see. Here's how the categories are organized:

Category What it logs
["fedify"] Everything from the library
["fedify", "federation", "inbox"] Incoming activities
["fedify", "federation", "outbox"] Outgoing activities
["fedify", "federation", "http"] HTTP requests and responses
["fedify", "sig", "http"] HTTP Signature operations
["fedify", "sig", "ld"] Linked Data Signature operations
["fedify", "sig", "key"] Key generation and retrieval
["fedify", "runtime", "docloader"] JSON-LD document loading
["fedify", "webfinger", "lookup"] WebFinger resource lookups

…and about a dozen more. Each category corresponds to a distinct subsystem.

This means a user can configure logging like this:

await configure({
  sinks: { console: getConsoleSink() },
  loggers: [
    // Show errors from all of Fedify
    { category: "fedify", sinks: ["console"], lowestLevel: "error" },
    // But show debug info for inbox processing specifically
    { category: ["fedify", "federation", "inbox"], sinks: ["console"], lowestLevel: "debug" },
  ],
});

When something goes wrong with incoming activities, they get detailed logs for that subsystem while keeping everything else quiet. No code changes required—just configuration.
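
On the library side, each Fedify subsystem simply writes to its own category and never touches sinks or levels. Here is a minimal sketch of what such a call site might look like, assuming LogTape's getLogger API; the message and values are illustrative rather than Fedify's actual code:

import { getLogger } from "@logtape/logtape";

// The category ties this logger into the hierarchy shown in the table above.
const logger = getLogger(["fedify", "sig", "http"]);

// Illustrative values; in Fedify these would come from the request being verified.
const keyId = "https://example.com/actor#main-key";
const elapsedMs = 42;

// Structured properties, not string interpolation, so they survive into JSON logs.
logger.debug("Verified HTTP signature for {keyId} in {elapsedMs} ms", {
  keyId,
  elapsedMs,
});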

Request tracing with implicit contexts

The hierarchical categories solved the filtering problem, but there was another challenge: correlating logs across async boundaries.

In a federated system, a single user action might trigger a cascade of operations: fetch a remote actor, verify their signature, process the activity, fan out to followers, and so on. When something fails, you need to correlate all the log entries for that specific request.

Fedify uses LogTape's implicit context feature to automatically tag every log entry with a requestId :

await configure({
  sinks: {
    file: getFileSink("fedify.jsonl", { formatter: jsonLinesFormatter })
  },
  loggers: [
    { category: "fedify", sinks: ["file"], lowestLevel: "info" },
  ],
  contextLocalStorage: new AsyncLocalStorage(),  // Enables implicit contexts
});

With this configuration, every log entry automatically includes a requestId property. When you need to debug a specific request, you can filter your logs:

jq 'select(.properties.requestId == "abc-123")' fedify.jsonl

And you'll see every log entry from that request—across all subsystems, all in order. No manual correlation needed.

The requestId is derived from standard headers when available (X-Request-Id, Traceparent, etc.), so it integrates naturally with existing observability infrastructure.
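
The mechanics are straightforward: the code that handles an incoming request wraps its work in LogTape's withContext, and every log call made anywhere inside that async call tree picks the properties up automatically. A minimal sketch under that assumption (the handler below is illustrative, not Fedify's actual code):

import { withContext } from "@logtape/logtape";

// Requires the contextLocalStorage option shown in the configuration above.
async function handleRequest(request: Request): Promise<Response> {
  // Reuse an incoming request ID when one is present; otherwise generate one.
  const requestId = request.headers.get("X-Request-Id") ?? crypto.randomUUID();

  return withContext({ requestId }, async () => {
    // Every log entry emitted while handling this request, in any subsystem,
    // now carries `requestId` in its properties.
    return await processActivity(request);
  });
}

// Stand-in for the real work a federation handler would do.
async function processActivity(_request: Request): Promise<Response> {
  return new Response("Accepted", { status: 202 });
}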

What users actually see

So what does all this configuration actually mean for someone using Fedify?

If a Fedify user doesn't configure LogTape at all, they see nothing. No warnings about missing configuration, no default output, and minimal performance overhead—the logging calls are essentially no-ops.

For basic visibility, they can enable error-level logging for all of Fedify with three lines of configuration. When debugging a specific issue, they can enable debug-level logging for just the relevant subsystem.

And if they're running in production with serious observability requirements, they can pipe structured JSON logs to their monitoring system with request correlation built in.

The same library code supports all these scenarios—whether the user is running on Node.js, Deno, Bun, or edge functions, without extra polyfills or shims. The user decides what they need.

Lessons learned

Building Fedify with LogTape taught me a few things:

Design your categories early. The hierarchical structure should reflect how users will actually want to filter logs. I organized Fedify's categories around subsystems that users might need to debug independently.

Use structured logging. Properties like requestId , activityId , and actorId are far more useful than string interpolation when you need to analyze logs programmatically.

Implicit contexts turned out to be more useful than I expected. Being able to correlate logs across async boundaries without passing context manually made debugging distributed operations much easier. When a user reports that activity delivery failed, I can give them a single jq command to extract everything relevant.

Trust your users. Some library authors worry about exposing too much internal detail through logs. I've found the opposite—users appreciate being able to see what's happening when they need to. The key is making it opt-in.

Try it yourself

If you're building a library and struggling with the logging question—how much to log, how to give users control, how to avoid being noisy—I'd encourage you to look at how Fedify does it.

The Fedify logging documentation explains everything in detail. And if you want to understand the philosophy behind LogTape's design, my earlier post covers that.

LogTape isn't trying to replace winston or Pino for application developers who are happy with those tools. It fills a different gap: logging for libraries that want to stay out of the way until users need them. If that's what you're looking for, it might be a better fit than the usual app-centric loggers.

Berlin Approves New Expansion of Police Surveillance Powers

Hacker News
reclaimthenet.org
2025-12-12 15:29:46
Comments...
Original Article

Berlin’s regional parliament has passed a far-reaching overhaul of its “security” law, giving police new authority to conduct both digital and physical surveillance.

The CDU-SPD coalition, supported by AfD votes, approved the reform of the General Security and Public Order Act (ASOG), changing the limits that once protected Berliners from intrusive policing.

Interior Senator Iris Spranger (SPD) argued that the legislation modernizes police work for an era of encrypted communication, terrorism, and cybercrime. But it undermines core civil liberties and reshapes the relationship between citizens and the state.

One of the most controversial elements is the expansion of police powers under paragraphs 26a and 26b. These allow investigators to hack into computers and smartphones under the banner of “source telecommunications surveillance” and “online searches.”

Police may now install state-developed spyware, known as trojans, on personal devices to intercept messages before or after encryption.

If the software cannot be deployed remotely, the law authorizes officers to secretly enter a person’s home to gain access.

This enables police to install surveillance programs directly on hardware without the occupant’s knowledge. Berlin had previously resisted such practices, but now joins other federal states that permit physical entry to install digital monitoring tools.

IT security experts caution that maintaining hidden system vulnerabilities for state use exposes everyone to greater cyber risk. They also question the constitutional legitimacy of combining digital espionage with physical intrusion into private homes.

The revised law also changes how police use body cameras. Paragraph 24c permits activation of bodycams inside private homes when officers believe there is a risk to life or limb.

The government presents this as a measure for officer safety, but many view it as an open door to video surveillance within citizens’ most private settings.

Paragraph 26e expands “cell tower queries,” allowing police to obtain data on every mobile phone connected to a specific tower during a chosen timeframe.

This form of data collection can identify the movements of thousands of uninvolved individuals, including people who might simply have attended a protest.

Under paragraph 24d, automatic license plate recognition systems will be used to record and cross-check vehicle plates with databases. Paragraph 24h also grants police the ability to neutralize or even take control of drones in certain situations.

Paragraph 28a introduces biometric face and voice matching, using publicly available information from the internet.

This gives Berlin’s police the ability to compare surveillance footage with images posted on social media platforms. This is a major step toward automated identification of individuals in public life.

A further innovation, paragraph 42d, authorizes the use of real investigative data, such as photos, videos, and text messages, for “training and testing” artificial intelligence systems.

This breaks the principle that data collected for one purpose cannot later be reused. Because AI models can reveal patterns from the original material, this clause risks turning police archives into training sets for machine learning systems.

The law also lengthens preventive detention periods. Under paragraph 33, individuals may now be held for up to five days, or up to seven in terrorism-related cases.

Lawmakers discussed this provision in connection with protests by the environmental group “Last Generation,” whose civil resistance actions have triggered repeated detentions.

The group NoASOG denounced the law as an attack on civil society, while the Society for Civil Rights (GFF) announced plans to prepare a constitutional complaint.

Berlin’s data protection commissioner, Meike Kamp, had already warned that approving the state trojan amounts to “a frontal attack on the IT security of all citizens.” She said the overall framework creates “a constitutionally highly questionable density of surveillance.”

Berlin now joins the list of German states that have widened police authority in recent years, but the scope of this legislation stands out. It links physical home entry, digital interception, and artificial intelligence analysis under one legal structure, reducing the barriers between policing and private life.

The range of new powers granted to police shifts the balance decisively toward state control of personal information.

Berlin is a city once known for strong privacy traditions and the ASOG reform marks a decisive moment. Whether it withstands constitutional review will determine how far Germany’s commitment to individual privacy can bend in the name of security.

Kali Linux 2025.4 released with 3 new tools, desktop updates

Bleeping Computer
www.bleepingcomputer.com
2025-12-12 15:27:16
Kali Linux has released version 2025.4, its final update of the year, introducing three new tools, desktop environment improvements, and enhanced Wayland support. [...]...
Original Article

Kali Linux

Kali Linux has released version 2025.4, its final update of the year, introducing three new tools, desktop environment improvements, and enhanced Wayland support.

Kali Linux is a distribution designed for cybersecurity professionals and ethical hackers to perform red-teaming, penetration testing, security assessments, and network research.

The distribution is available as an installable operating system or a live environment and supports a wide range of hardware, including Raspberry Pi devices and compatible Android phones through Kali NetHunter.

New tools added to Kali Linux 2025.4

Every new Kali release brings a few fresh tools to play with, and this update is no exception.

This time, we're getting three new additions:

  • bpf-linker - Simple BPF static linker
  • evil-winrm-py - Python-based tool for executing commands on remote Windows machines using the WinRM protocol
  • hexstrike-ai - MCP server that lets AI agents autonomously run tools

Desktop environment updates

Kali Linux 2025.4 brings many new updates to its desktop environments, including GNOME 49, KDE Plasma, and Xfce.

GNOME 49 includes refreshed themes, a new Showtime video player, reorganized tool folders in the app grid, and new shortcuts for quickly opening a terminal. GNOME also entirely removes X11 support in this release, now running solely on Wayland.

The developers also added support for keyboard shortcuts to open a terminal quickly.

"Another quality-of-life improvement is the addition of a shortcut to quickly open a terminal (finally!), using Ctrl+Alt+T or Win+T - just like in our other desktops," explains the Kali Linux 2025.4 announcement .

GNOME app grid layout (Source: Kali)

KDE Plasma has been updated to version 6.5, introducing improved window tiling, an enhanced screenshot tool, easier clipboard access, and more flexible fuzzy search in KRunner.

Xfce now supports color themes that offer functionality similar to that already available in GNOME and KDE, allowing users to adjust icons and interface colors more easily.

With GNOME now running entirely on Wayland, the Kali Linux team has added full VM guest utilities support for VirtualBox, VMware, and QEMU.

Kali Nethunter updates

Kali NetHunter received new updates with this release, including expanded device support for Android 16 on the Samsung Galaxy S10 and the OnePlus Nord, and Android 15 on Xiaomi Mi 9.

The NetHunter Terminal has also been restored with updated compatibility for Magisk versions that use interactive mode. This prevents terminal sessions from closing when pressing CTRL+C.

Wifipumpkin3 also sees enhancements, including updated phishing templates and the addition of a preview tab in the NetHunter app.

Other changes

This release also includes additional updates and improvements, including:

  • The Kali Live image is now distributed only via BitTorrent, as its size has grown too large for traditional HTTP downloads.
  • Three new community mirrors have been added in Asia and one in the United States to improve download availability.
  • Kali Cloud and the Kali WSL app received several behind-the-scenes improvements and reliability fixes.

How to get Kali Linux 2025.4

To start using Kali Linux 2025.4, you can upgrade your existing installation, select a platform, or directly download ISO images for new installs and live distributions.

For those updating from a previous version, you can use the following commands to upgrade to the latest version.

echo "deb http://http.kali.org/kali kali-rolling main contrib non-free non-free-firmware" | sudo tee /etc/apt/sources.list

sudo apt update && sudo apt -y full-upgrade

cp -vrbi /etc/skel/. ~/

[ -f /var/run/reboot-required ] && sudo reboot -f

If you are running Kali on the Windows Subsystem for Linux, upgrade to WSL2 for a better experience, which includes the ability to use graphical apps.

You can check the WSL version used by Kali with the 'wsl -l -v' command in a Windows command prompt.

Once done upgrading, you can check if the upgrade was successful by using the following command:

grep VERSION /etc/os-release

You can view the complete changelog for Kali 2025.4 on Kali's website.

CM0 – a new Raspberry Pi you can't buy

Hacker News
www.jeffgeerling.com
2025-12-12 15:19:19
Comments...
Original Article

Raspberry Pi CM0

This little postage stamp is actually a full Raspberry Pi Zero 2, complete with eMMC storage and WiFi.

But you can't get one. Well, not unless you buy the CM0NANO development board from EDAtec , or you live in China.

This little guy doesn't have an HDMI port, Ethernet, or even USB. It's a special version of the 'Compute Module' line of boards. Little Raspberry Pi 'System on Modules' (SoMs), they're called.

Compute Modules are entire Linux computers about the size of a regular desktop CPU that you 'plug in' to another board, to give it life.

Compute modules are everywhere, in kiosks, signage, 3D printers , and even the new Ableton Move . If you just need a little bit of Linux for networking and remote control, these are perfect for that.

And the CM0 is now the smallest version, a little bigger than a postage stamp.

Raspberry Pi CM0 back - castellated edges

But unlike all the other Compute Modules, the CM0 has castellated edges like a Pico. That way, a company integrating this into their product can just pick and place it and solder it onto their main PCB, instead of working with more delicate board-to-board connectors.

But why is this only in China? I'll get to that, but first I wanted to thank EDAtec for sending a CM0 and their CM0NANO dev board for testing. Without them, I don't think I'd ever be able to show these Pis to you.

Video

I posted this story to my YouTube channel, but if you're on the blog already, chances are you favor reading over video, so scroll on!

ED-CM0NANO

EDAtec's CM0NANO seems to be the official IO board for the CM0. It breaks out every feature on the RP3A0 chip at the heart of the Pi Zero 2 and CM0.

EDAtec CM0NANO with Pi CM0

There's 10/100 Ethernet through a little USB to Ethernet chip (CoreChips SR9900A), two USB 2.0 ports, full-size HDMI, and USB-C for power and flashing the eMMC. Then there are display and camera connectors, GPIO, and a few more headers.

To flash the onboard eMMC, I had to switch the RPI_BOOT_SW switch towards the RTC battery slot, then use rpiboot to mount it on my Mac. Then I used Raspberry Pi Imager to flash Pi OS 13 on it.

The eMMC on here is very slow compared to what I'm used to with the Pi 5 generation, like on the CM5. Its top speed seems to be around 19-20 MB/sec.

Once it's flashed, it's a full Linux computer, complete with Raspberry Pi's desktop environment.

EDAtec has a firmware support package you can install from their package repository, and once that's done, I did what nobody should do on this small of a computer: fired up Chromium.

Browsing the web on here is almost completely out of the question, since it only has 512 Megs of RAM—which is so little it pops a warning saying Chromium should only be used with 1 GB or more of RAM!

I did try browsing this website, and it took something like a minute to just quit the browser, after I was clicking the X to close the tab over and over again!

But with WiFi, Ethernet, USB, HDMI, and everything else the Pi ecosystem has to offer, some products that just want to slap a well-supported Linux environment on top of their product (and not integrate an SoC, memory, storage, and wireless chip) now have this.

Global distribution possibilities

Do I think companies and makers here in the US and over in other parts of the world would also benefit from the CM0? Yes. Do I think it'll happen? Doubtful.

The Zero 2 W and CM0 share something in common, besides their entire architecture:

When Hackster asked Eben Upton about global availability, he was noncommittal:

No plans to make it available outside China at the moment, but we'll see how we get on.

That was back before the RAM shortages got bad.

Pi Zero 2 W and CM0

I followed up asking a Pi engineer about it, and it sounds like one big problem is the RP3A0 chip that integrates an LPDDR2 RAM chip stacked on top of the Pi's SoC.

He said the CM0 would compete with Pi Zero 2 for LPDDR2 memory, which is in shorter supply these days (it's not being produced anymore, so stocks will only become more limited over time), and they want to make sure the popular Zero 2 W can stay in stock for makers and education.

The CM0 is targeted squarely at the lower end market, integrated into products built on assembly lines. So because of that, it's anyone's guess if the CM0 will ever make it out of China.

I'm not doing a full review of the board here, because:

  1. It's practically the same as the Pi Zero 2 W, which I already reviewed.
  2. It's not like you can get one (standalone, at least) anyway, at least not for the foreseeable future.

I think there was a chance, before the DRAM manufacturers went all-in on an AI cash grab, but for now, stick to the Pi Zero 2's that you're used to.

You can find a little more detail and benchmark results on my sbc-reviews issue for the CM0.

In a shocking twist, Keir Starmer’s TikToks are borderline competent

Guardian
www.theguardian.com
2025-12-12 15:08:05
The PM’s social media sortie has not been a total embarrassment, which may be a shame for him The scene opens on the interior of an aeroplane. A suited man in a luxurious seat looks pensively out the window, his face partially obscured, his chin delicately resting on his hand. Continue reading......
Original Article

The scene opens on the interior of an aeroplane.

A suited man in a luxurious seat looks pensively out the window, his face partially obscured, his chin delicately resting on his hand.

Dreamy synths reverberate as the camera pans to show a fighter jet, hovering above the clouds just past the plane’s wing.

It turns and flies away, its dark shadow set against the warm yellow sunset.

“I’d explain, but it’s classified,” the TikTok video’s caption reads, the username above revealing the identity of the mystery man: Keir Starmer .

In the comment section, one user puts a voice to the question on a thousand lips.

“Why is our prime minister aura farming?”

When the UK prime minister launched his TikTok account earlier this week, I assumed we’d get the same slate of cringeworthy content that so many elected officials have given us before.

Stiff line delivery, policy talking points awkwardly shoehorned into already outdated memes, and the general feeling a PR person is holding them at gunpoint just out of shot.

Alas, no. In a shocking twist, Starmer’s TikToks are borderline competent.

The majority of the videos seem to be attempts at ultra short-form cinéma vérité: a camera operator following the prime minister around, catching snippets of him saying good morning to security guards, questioning where chief mouser, Larry the cat, is and greeting the Ukrainian president, Volodymyr Zelenskyy.

The “peek behind the curtain” style is clearly designed to make the prime minister feel more relatable to young UK voters, and while there’s definitely potential here, all his videos share the same fatal flaw.

Starmer cares about looking cool.

“Aura farming” is an internet term for someone posting content trying to seem effortlessly suave, handsome or charismatic.

And look, for my own sanity, I have to assume Starmer’s team was being tongue-in-cheek when they captioned that plane video: “I’d explain, but it’s classified” – that they were poking fun at people trying to seem cool on the internet.

But the more I look through his TikToks, with every shot so carefully curated to make Starmer seem competent and in control, the more I begin to feel the accusation of “aura farming” fits.

And on a platform such as TikTok, which trades off vulnerability and intimacy, being caught trying to seem aloof is a crime worse than murder. (Or at least worse than the “millennial pause”, and that’s pretty bad.)

There are some rare examples of politicians feeling authentically at home on the app: in the US, Alexandria Ocasio-Cortez has found great success speaking frankly to her iPhone camera from her living room couch. Even a lower-profile politician such as the Australian MP Julian Hill has cultivated a dedicated following by sharing his frustrations with the opposition from his cluttered parliamentary office.

The politicians that truly succeed on TikTok are the ones where you can suspend your disbelief just enough to believe they’re actually hitting “post” themselves. Where a little part of you is holding out hope that they might actually reply to your comment.

But Starmer never gets within a metre of the camera lens, let alone a comment section keyboard. A style, no doubt, influenced by the fact that TikTok is technically banned on government phones, due to data security concerns, and his team is, no doubt, terrified to imply he might actually have the app downloaded.

Numbers-wise, the videos are going well, with two of them already cracking 1m views, but that lack of intimacy comes at a cost. The comments under any politician’s post are going to be filled with far more vitriol than praise – that’s just how the internet works. What’s notable about Starmer’s is just how generic the comments are.

It’s all “get this clown out”, “vote reform” and the occasional “best prime minister ever”, but barely any mention of the actual content at hand.

Because ultimately, the videos don’t have any content – besides a fleeting sense of novelty, there’s no reason I would ever send them to friends, let alone bring them up at the pub. These videos only exist to prove what a cool guy Starmer is. And he isn’t.

So no, the UK prime minister’s first foray into the world of TikTok hasn’t been an utter embarrassment. But it might have been better if it was.

Like most politicians, Starmer is an innately dorky man and if he is really serious about winning the hearts (and votes) of young people, his TikTok needs to embrace and celebrate that, not unconvincingly hide it away.

I have some ideas for what he could do, and I would explain, but hey, it’s classified.