CiviCRM Scotland Meetup, Thursday 29 January 2026

CiviCRM
civicrm.org
2025-12-09 11:10:13
In this Meetup, we explore CiviBooking - an extension for organisations that hire out rooms or resources that they want to track through CiviCRM. We're delighted that Mathieu Lu, Co-founder at Coop SymbioTIC (Montreal) and a maintainer of the CiviBooking extension, will join us to discuss what CiviB...
Original Article

In this Meetup, we explore CiviBooking - an extension for organisations that hire out rooms or resources that they want to track through CiviCRM. We're delighted that Mathieu Lu, Co-founder at Coop SymbioTIC (Montreal) and a maintainer of the CiviBooking extension, will join us to discuss what CiviBooking can do.

For existing CiviCRM users, there will be opportunities to meet and discuss CiviCRM with other organisations using the software in their day-to-day work, and to ask questions of experts.

You are invited to join us in person or online. The event is free, conveniently situated at The Melting Pot, next to Edinburgh Waverley train station - and there will be tea and biscuits!

Hosted by CiviCRM Bronze Partner, Pooka & Co.

[$] LWN.net Weekly Edition for December 11, 2025

Linux Weekly News
lwn.net
2025-12-11 00:18:09
Inside this week's LWN.net Weekly Edition: Front: Rust in CPython; Python frozendict; Bazzite; IETF post-quantum disagreement; Distrobox; 6.19 merge window; Leaving the TAB. Briefs: Let's Encrypt retrospective; PKI infrastructure; Rust in kernel to stay; CNA series; A...
Original Article

The page you have tried to view (LWN.net Weekly Edition for December 11, 2025) is currently available to LWN subscribers only.

Reader subscriptions are a necessary way to fund the continued existence of LWN and the quality of its content.

If you are already an LWN.net subscriber, please log in with the form below to read this content.

Please consider subscribing to LWN. An LWN subscription provides numerous benefits, including access to restricted content and the warm feeling of knowing that you are helping to keep LWN alive.

(Alternatively, this item will become freely available on December 18, 2025)

Why no one talks about React2Shell?

Hacker News
elenacross7.medium.com
2025-12-10 23:58:38
Comments...

★ The Full Text of Marco Rubio’s Directive on State Department Typography, Re-Establishing Times New Roman

Daring Fireball
daringfireball.net
2025-12-10 23:53:51
Good on Rubio for rescinding a bad decision, and even better for doing so with a fair and informative explanation....
Original Article

I’m a big believer in reading original source material. For example, when Apple provided me, alongside only a handful of other outlets, with a statement regarding their decision to delay the “more personalized Siri” back in March, I ran the full statement, verbatim. I added my own commentary, but I wanted to let Apple’s own statement speak for itself first. It drives me nuts when news sites in possession of a statement or original document do not make the full original text available, even if only in a link at the bottom, and choose only to quote short excerpts.

With regard to today’s news about Marco Rubio’s directive re-establishing Times New Roman as the default font for U.S. State Department documents (rescinding the Biden administration’s 2023 change to Calibri), I very much wanted to read the original. The New York Times broke the news, stated that they had obtained the memo, and quoted phrases and words from it, but they did not provide a copy of the original.

The State Department has not made this document publicly available, and to my knowledge, no one else has published it. I have obtained a copy from a source, and have made it available here in plain text format. The only change I’ve made is to replace non-breaking spaces (U+00A0) with regular spaces.

Please do read it yourself, and do so with an open mind.

It seems clear to me that The New York Times did Rubio dirty in their characterization of the directive. The Times story, credited to reporters Michael Crowley and Hamed Aleaziz, ran under the headline “At State Dept., a Typeface Falls Victim in the War Against Woke”, and opens thus:

Secretary of State Marco Rubio waded into the surprisingly fraught politics of typefaces on Tuesday with an order halting the State Department’s official use of Calibri, reversing a 2023 Biden-era directive that Mr. Rubio called a “wasteful” sop to diversity.

While mostly framed as a matter of clarity and formality in presentation, Mr. Rubio’s directive to all diplomatic posts around the world blamed “radical” diversity, equity, inclusion and accessibility programs for what he said was a misguided and ineffective switch from the serif typeface Times New Roman to sans serif Calibri in official department paperwork.

Rubio’s memo ran about 950 words. Here are the full quotes the Times pulled from it, consisting of just 56 words, aside from the memo’s subject line (“Return to Tradition: Times New Roman 14-Point Font Required for All Department Paper”):

“wasteful”

“radical”

“restore decorum and professionalism to the department’s written work.”

“informal”

“clashes”

“was not among the department’s most illegal, immoral, radical or wasteful instances of D.E.I.A.”

“accessibility-based document remediation cases”

“Switching to Calibri achieved nothing except the degradation of the department’s official correspondence.”

“generally perceived to connote tradition, formality and ceremony”

Rubio’s memo wasn’t merely “mostly framed as a matter of clarity and formality in presentation”. That’s entirely what the memo is about. Serif typefaces like Times New Roman are more formal. It was the Biden administration and then-Secretary of State Antony Blinken who categorized the 2023 change to Calibri as driven by accessibility. I do not have access to Blinken’s memo making that change (under the cringe-inducing subject line “The Times (New Roman) are a-Changin”), but it was first reported by John Hudson and Annabelle Timsit at The Washington Post, where they wrote:

The secretary’s decision was motivated by accessibility issues and not aesthetics, said a senior State Department official familiar with the change.

Rubio’s memo makes the argument — correctly — that aesthetics matter, and that the argument that Calibri was in any way more accessible than Times New Roman was bogus. Rubio’s memo does not lash out against accessibility as a concern or goal. He simply makes the argument that Blinken’s order mandating Calibri in the name of accessibility was an empty gesture. Purely performative, at the cost of aesthetics. Going back to that 2023 story at the Post, they quote from Blinken’s memo thus:

In its cable, the State Department said it was choosing to shift to 14-point Calibri font because serif fonts like Times New Roman “can introduce accessibility issues for individuals with disabilities who use Optical Character Recognition technology or screen readers. It can also cause visual recognition issues for individuals with learning disabilities,” it said.

The bit here about OCR is utter nonsense, a voodoo belief. No OCR or screen-reader software in use today has any problem whatsoever with Times New Roman. That’s just made-up nonsense, and I’d like to see sources for the claim about “visual recognition issues for individuals with learning disabilities”. I don’t think it’s true, and citing it alongside a provably wrong claim about OCR software makes me even more skeptical.

Rubio brings actual numbers to make his case, which is more than can be said for anyone I’ve found arguing that Calibri is somehow more accessible than Times New Roman. Rubio’s argument is alluded to in the Times’s article thus:

But Mr. Rubio called it a failure by its own standards, saying that “accessibility-based document remediation cases” at the department had not declined.

Here’s the full passage from Rubio’s memo:

And although switching to Calibri was not among the Department’s most illegal, immoral, radical, or wasteful instances of DEIA (see, e.g., Executive Orders 14151, 14173, 14281, and Memorandum on Removing Discrimination and Discriminatory Equity Ideology From the Foreign Service (DCPD202500375)) it was nonetheless cosmetic: the switch was promised to mitigate “accessibility issues for individuals with disabilities,” and employees were promised, “Your adoption supports the Department’s commitment to create a more accessible workplace,” but these promises were false. In fact, the number of accessibility-based document remediation cases at the Department of State was the same in the year after adopting Calibri as in the year before (1,192 cases in FY2024 versus 1,193 cases in FY2022). And the costs of remediation actually increased by $145,000 in that period — nearly a 20% jump. Switching to Calibri achieved nothing except the degradation of the Department’s official correspondence.

2024 was a Biden year, not a Trump year, so there’s no reason to think the remediation numbers were counted differently. The change to Calibri was the worst kind of accessibility effort: one that was founded on nothing more than feel-good performance. It was a change everyone could see and notice, but one that had no practical benefit whatsoever. Good on Rubio for rescinding a bad decision, and even better for doing so with a fair and informative explanation. (His memo even explains, “Fonts are specific variations of a typeface.... Through common use, the word font has come to mean both typeface and font.”)

Nature's many attempts to evolve a Nostr

Lobsters
newsletter.squishy.computer
2025-12-10 23:52:09
Comments...
Original Article

Here is the architecture of a typical app: a big centralized server in the cloud supporting many clients. The web works this way. So do apps.

This architecture grants the server total control over users. The server owns your data, owns your account, and owns the cryptographic keys used to secure it.

That last bit is obscure, but important. Cryptographic keys are how we enforce security, privacy, ownership, and control in software. Not your keys, not your data.

The architecture of apps is fundamentally feudal. Apps own the keys and use them to erect a cryptographic wall around the hoard of data we peasants produce. You “sign in” to cross the drawbridge, and the castle can pull up the drawbridge at any time, shutting you out.

"Centralization" is the state of affairs where a single entity or a small group of them can observe, capture, control, or extract rent from the operation or use of an Internet function exclusively.
( RFC 9518: Centralization, Decentralization, and Internet Standards )

Powerful network effects build up inside those castle walls. These network effects can be leveraged to generate further centralization, extract rents, and shut down competition.

We are seeing the consequences of this centralized architecture play out today, as platforms like the App Store enter their late-stage phase. When growth slows, the kings of big castles become bad emperors .

The Internet has succeeded in no small part because of its purposeful avoidance of any single controlling entity.
( RFC 9518: Centralization, Decentralization, and Internet Standards )

So, apps are centralized. How might we fix this? Well, the first thing we could do is bridge the gap between apps.

This is called federation . Users talk to the server, and servers talk to each other, trading messages so you can talk to users on other servers. Now you have the benefit of choice: which castle do you want to live in?

Email works this way. So do Mastodon and Matrix . My email is @gmail.com , yours @protonmail.com . We live on different domains, use different apps run by different companies, yet we can freely email each other.

The great thing about federation is that it’s easy to implement. It’s just an ordinary client-server architecture with a protocol bolted onto the back. We don’t have to build exotic technology, just exapt existing infrastructure . That’s why Mastodon, for example, is just an ordinary Ruby on Rails app .

But there’s a wrinkle…

Why does this happen? Well, networks centralize over time, converging toward an exponential distribution of size, power, wealth. This centralization is inevitable . You see it on the web, in social networks, airline routes, power grids, trains, banks, Bitcoin mining, protein interactions, ecological food webs, neural networks, and oligarchies. Network theory tells us why:

  • Preferential attachment: more connections means more network effect, which means more connections, leading to the emergence of densely-connected hub nodes (see the sketch after this list).

  • N^2 scaling: if every fed has to talk to every other fed to exchange messages, the number of connections grows quadratically with the number of nodes (n * (n - 1)). This leads to the emergence of large hubs that aggregate and relay world state.

  • Fitness pressure: Small nodes get taken down by large spikes in traffic, while large nodes stick around. Small nodes have fewer resources, large nodes have lots. Unreliable nodes attract fewer connections, while reliable nodes attract connections just by virtue of staying alive.

  • Efficiency: exponentially-distributed networks are ultra-small worlds. You can get from anywhere to anywhere in just a few hops through hubs.

  • Resilience: exponentially-distributed networks survive random failures, because a randomly failing node is overwhelmingly likely to come from the long tail.
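
As a rough illustration of how preferential attachment produces hubs, here is a toy simulation (my own sketch, not from the original essay): each new node attaches to an existing node chosen with probability proportional to its degree, and a handful of hubs quickly come to dominate.

```python
# Toy preferential-attachment simulation (illustrative sketch):
# each new node links to an existing node chosen with probability proportional to its
# degree, so well-connected nodes keep attracting connections and become hubs.
import random
from collections import Counter

def grow_network(n_nodes, seed=42):
    random.seed(seed)
    degree = Counter({0: 1, 1: 1})   # start with two nodes joined by one edge
    endpoints = [0, 1]               # every edge lists both endpoints, so sampling
                                     # from this list is degree-proportional
    for new in range(2, n_nodes):
        target = random.choice(endpoints)
        degree[new] += 1
        degree[target] += 1
        endpoints.extend([new, target])
    return degree

deg = grow_network(10_000)
print("max degree:   ", max(deg.values()))                    # a few giant hubs
print("median degree:", sorted(deg.values())[len(deg) // 2])  # most nodes stay tiny
```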

This is called the scale-free property , and it emerges in all evolving networks . Federated networks are no exception. Take email for example:

Email is not distributed anymore. You just cannot create another first-class node of this network.

Email is now an oligopoly, a service gatekept by a few big companies which does not follow the principles of net neutrality.

I have been self-hosting my email since I got my first broadband connection at home in 1999. I absolutely loved having a personal web+email server at home, paid extra for a static IP and a real router so people could connect from the outside. I felt like a first-class citizen of the Internet and I learned so much.

Over time I realized that residential IP blocks were banned on most servers. I moved my email server to a VPS. No luck. I quickly understood that self-hosting email was a lost cause. Nevertheless, I have been fighting back out of pure spite, obstinacy, and activism. In other words, because it was the right thing to do.

But my emails are just not delivered anymore. I might as well not have an email server.

( After self-hosting my email for twenty-three years I have thrown in the towel , Carlos Fenollosa, 2022)

We can see the outlines of a similar consolidation beginning to emerge in the Fediverse. In 2023, Facebook Threads implemented ActivityPub and it instantly became the largest node in the Fediverse. This made some people angry and led to demands for defederation. But Threads is already over 10x larger than the rest of the Fediverse. Defederation is hardly an effective blockade. The network has consolidated. Network science strikes again.

At scale, federated systems experience many of the same problems as centralized apps. That’s because feds are still feudal. They own your data, they own your account, they own your keys.

Large feds occupy a strategically central location in the network topology, and they have powerful influence over the rest of the network. They can leverage their network effect to pull up the drawbridge, by inventing new features that don’t federate, or cutting off contact with other feds.

So, federated networks become oligopolies. We can choose our server, as long as it’s blessed by the oligopoly. Still, an oligopoly is better than a dictatorship, email better than Facebook. But can we do even better?

Ok, forget servers. What if we could connect to each other directly? This is called peer-to-peer networking.

In a P2P network, each participant runs a peer that can find other peers and send them messages. Users own their keys, and use them to sign, verify, and encrypt messages. This is great! We have all the ingredients for credible exit and minimal user agency .

However, P2P presents some tricky engineering challenges. There is no central source of truth, so various peers will have different views of the network state. That means we need to design for eventual consistency and the ability to merge potentially conflicting states. Other things, like timestamps, are also hard. Decentralized protocols are hard! All of this is headwind compared to ordinary app engineering.
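
To make the "merge potentially conflicting states" point concrete, here is a toy last-writer-wins map, a minimal sketch rather than any particular protocol's algorithm: each key maps to a (timestamp, peer_id, value) entry, and merging two replicas keeps the greater entry, so replicas converge no matter in which order they exchange state.

```python
# Toy last-writer-wins (LWW) map: a hypothetical sketch of eventual consistency.
# Each entry is (timestamp, peer_id, value); ties are broken by peer_id so merges are deterministic.
def merge(a, b):
    out = dict(a)
    for key, entry in b.items():
        if key not in out or entry > out[key]:   # tuple comparison: newest write wins
            out[key] = entry
    return out

alice = {"status": (1, "alice", "hello")}
bob   = {"status": (2, "bob", "world"), "bio": (1, "bob", "sailor")}

# Merging in either order produces the same state: replicas converge without a coordinator.
assert merge(alice, bob) == merge(bob, alice)
print(merge(alice, bob))   # {'status': (2, 'bob', 'world'), 'bio': (1, 'bob', 'sailor')}
```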

We also run into some practical networking challenges. We no longer have centralized servers, so many requests take several hops, from peer-to-peer-to-peer, to get to their destination.

Also, peers are unreliable. They are bandwidth-constrained and blink in and out of existence. Close your laptop, your peer disappears. This adds a cost to peer discovery. You dial a previously available peer, but it’s gone now, so you dial another, and another. Unreliable peers plus multiple hops equals long delays, and occasionally, the inability to reach portions of the network.
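
In practice, that dial-another-peer loop looks roughly like the sketch below; the addresses and port are hypothetical, and plain TCP reachability stands in for "the peer is up".

```python
# Hypothetical sketch of dialling unreliable peers: try each known address with a short
# timeout and settle for the first one that answers.
import socket

def dial_first_available(known_peers, timeout=2.0):
    for host, port in known_peers:
        try:
            return socket.create_connection((host, port), timeout=timeout)  # first reachable peer wins
        except OSError:
            continue  # laptop closed, peer gone: move on to the next address
    return None       # nobody in our neighbourhood is reachable right now

peer = dial_first_available([("198.51.100.7", 4001), ("203.0.113.9", 4001)])
print("connected" if peer else "no peers reachable")
```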

The same evolutionary pressures that apply to other networks apply to P2P networks, and some of them, like fitness pressure on reliability, are exaggerated by peer availability. This leads to the evolution of superpeers: high-bandwidth, high-availability peers whose job is to serve other peers on the network.

Peer-to-Peer (P2P) networks have grown to such a massive scale that performing an efficient search in the network is non-trivial. Systems such as Gnutella were initially plagued with scalability problems as the number of users grew into the tens of thousands. As the number of users has now climbed into the millions, system designers have resorted to the use of supernodes to address scalability issues and to perform more efficient searches.
(Hadaller, Regan, Russell, 2012. The Necessity of Supernodes )

Instead of connecting directly, we connect to one of the high-bandwidth, high-availability superpeers. Peer discovery is no longer a problem, and everything is just one or two hops away… an ultra-small world.

Wait… That just sounds like centralization with extra steps!

Like feds, superpeers occupy a strategically central location in the network topology, and have powerful influence over the rest of the network. Our P2P network has converged toward an exponential distribution. Network science strikes again.

Well, but on a P2P network we do own our keys, and this is a big improvement. Trustless protocols are better than trustful ones , and by owning our keys we have the foundations for minimal user agency .

Still, we’ve done a lot of hard engineering to support a flat P2P network that will never exist in the end. Is there a simpler way?

Let’s start at the end and work backwards.

  • All networks require large servers at scale

  • Not your keys, not your data

Can we design a distributed architecture that admits these two facts? What might such an architecture look like?

Take some ordinary, off-the-shelf servers. Treat them as dumb, untrusted pipes. Their job is just to relay information. They don’t own the keys—you own your keys. You sign messages with your key, then post them to one or more relays. Other users follow one or more relays. When they get a message, they use your key to verify you sent it. That’s it!

This is the Nostr protocol. I want to claim that Nostr has discovered a new fundamental architecture for distributed protocols. Not federated, not P2P… Relay.
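
The core flow is small enough to sketch. Real Nostr events are JSON objects signed with secp256k1 Schnorr keys; the toy below uses Ed25519 from the `cryptography` package instead, purely to show the shape: the user signs, the relay stores and forwards bytes it cannot forge, and any reader verifies against the author's public key.

```python
# Sketch of the relay pattern (not the actual Nostr wire format; Nostr uses secp256k1
# Schnorr signatures -- Ed25519 is used here only to illustrate the flow).
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

author_key = Ed25519PrivateKey.generate()      # the user owns this key, not the relay
author_pub = author_key.public_key()

event = b'{"content": "hello nostr", "created_at": 1700000000}'
signature = author_key.sign(event)

# A "relay" is just dumb storage and forwarding: it never needs the private key.
relay = [(event, signature)]

# Any follower, on any relay that carries the event, can check authorship.
for ev, sig in relay:
    try:
        author_pub.verify(sig, ev)
        print("verified:", ev.decode())
    except InvalidSignature:
        print("forged event, dropping")
```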

Relays cut to the chase:

  • Relays are simple. They use boring technology, like plain old servers. You benefit from all of the tailwinds of traditional app development.

  • Relays take advantage of economies of scale. Big dumb servers in the cloud have high availability and high uptime, and they’re commodity infrastructure.

  • Relays sidestep the N^2 scaling problem: Relays don’t talk to each other, and users only need to join a small number of relays to gain autonomy—at least two, and certainly fewer than a dozen. We never really hit the scale where the n^2 scaling problem matters.

  • Relays support user-ownership. You own your data, your account, and most importantly, your keys. Relays are large, but they aren’t in charge. If a relay goes down or shuts you down, no problem! Your account doesn’t change, and your data is already mirrored to other relays. Credible exit!

…Most importantly, relays are what you would get in the end anyway . It’s fewer steps for the same result.

Google ads for shared ChatGPT, Grok guides push macOS infostealer malware

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 23:50:56
A new AMOS infostealer campaign is abusing Google search ads to lure users into Grok and ChatGPT conversations that appear to offer "helpful" instructions but ultimately lead to installing the AMOS info-stealing malware on macOS. [...]...
Original Article

Google ads for shared ChatGPT, Grok guides push macOS infostealer malware

A new AMOS infostealer campaign is abusing Google search ads to lure users into Grok and ChatGPT conversations that appear to offer “helpful” instructions but ultimately lead to installing the AMOS info-stealing malware on macOS.

The campaign was first spotted by researchers at cybersecurity company Kaspersky yesterday, while Huntress managed security platform published a more detailed report earlier today.

The ClickFix attack begins with victims searching for macOS-related terms, such as maintenance questions, problem-solving, or for Atlas - OpenAI's AI-powered web browser for macOS.

The Google ads link directly to ChatGPT and Grok conversations that had been publicly shared in preparation for the attack. The chats are hosted on the legitimate LLM platforms and contain the malicious instructions used to install the malware.

Malicious ChatGPT (left) and Grok (right) conversations
Source: Huntress

"During our investigation, the Huntress team reproduced these poisoned results across multiple variations of the same question, 'how to clear data on iMac,' 'clear system data on iMac,' 'free up storage on Mac,' confirming this isn't an isolated result but a deliberate, widespread poisoning campaign targeting common troubleshooting queries," Huntress researchers explain .

If users fall for the trick and execute the commands from the AI chat in macOS Terminal, a base64-encoded URL decodes into a bash script (update) that loads a fake password prompt dialog.

The bash script
Source: Huntress

When the password is provided, the script validates, stores, and uses it to execute privileged commands, such as downloading the AMOS infostealer and executing the malware with root-level privileges.

AMOS was first documented in April 2023. It is a malware-as-a-service (MaaS) operation that rents out the infostealer for $1,000/month, targeting macOS systems exclusively.

Earlier this year, AMOS added a backdoor module that lets operators execute commands on infected hosts, log keystrokes, and drop additional payloads.

AMOS is dropped in /Users/$USER/ as a hidden file (.helper). When launched, it scans the Applications folder for Ledger Wallet and Trezor Suite. If found, it overwrites them with trojanized versions that prompt the victim to enter their seed phrase "for security reasons".

Looking for crypto wallet apps to overwrite
Replacing crypto wallet apps with trojanized versions
Source: Huntress

AMOS also targets cryptocurrency wallets from Electrum, Exodus, MetaMask, Ledger Live, Coinbase Wallet, and others; browser data such as cookies, saved passwords, autofill data, and session tokens; macOS Keychain data such as app passwords and Wi-Fi credentials; and files on the filesystem.

Persistence is achieved via a LaunchDaemon (com.finder.helper.plist) running a hidden AppleScript which acts as a watchdog loop, restarting the malware within one second if terminated.

These latest ClickFix attacks are yet another example of threat actors experimenting with new ways to exploit legitimate, popular platforms like OpenAI and X.

Users need to be vigilant and avoid executing commands they found online, especially if they don't fully understand what they do.

Kaspersky noted that, even after reaching these manipulated LLM conversations, a simple follow-up question asking ChatGPT if the provided instructions are safe to execute reveals that they aren't.


Deconstructing the `CAP theorem' for CM and DevOps

Lobsters
markburgess.org
2025-12-10 23:34:53
Comments...
Original Article

Deconstructing the `CAP theorem' for CM and DevOps

Part 1: The Special Theory of Relativity for distributed systems

As software engineering and operations forge a new cultural bond around continuous improvement of applications and infrastructure, the database is something "dev" and "ops" have in common -- and there are things to learn from both perspectives on distributed data.

In computing, the so-called CAP theorem (1999-2002) has become both an icon and a bone of contention in the world of databases -- a supposed truth about distributed systems. A lot has been written about it since it was formulated, especially around the recent debates on `SQL/noSQL', but its many hearsay formulations are beset with a number of problems.

In this essay (in two parts), I want to explain the issues around CAP more carefully and add a broader perspective from the viewpoint of time. The aim is to talk about its potential impact on the world of infrastructure operations. I shall try to pick apart what CAP is about, what it does and doesn't mean, and what is important to understand in the discussion for DevOps.

*

[PART 1] [PART 2]

Synopsis

"I wrote you a long letter, as I didn't have time to write a short one."
--Mark Twain/B. Pascal??

  • The CAP story is about "consistency" or uniformity of information, i.e. what different parts of a system can agree upon at a given moment, and whether that information is available to others or not, given the effect of system boundaries (or "partitions") that prevent knowledge from spreading.
  • In relativity, information, location and time interact in sometimes unexpected ways.
  • The definition of consistency in distributed systems does not usually include the end user, which itself seems inconsistent. Thus a database can be defined as consistent and users can still end up with different answers beyond the system boundary. Why don't we care about this?
  • The notion of bringing about consistency hinges on the concept of availability, even in the trivial case where data are consistently unavailable, so these properties are inseparable. To define availability, we need an independent measure of time. Without availability, we cannot define "simultaneous" or "consistency".
  • All consistency is eventual in real time, i.e. the user has to wait for it (Re: ACID versus BASE in databases). Distributed consistency of information is a form of equilibration of the total system. This is the same concept of equilibrium as in thermodynamics.
  • Atomic transaction locks use control of availability as a throttle to allow equilibration to happen out of sight: "Don't peek until I'm ready!"
  • Systems that are changing so fast that information cannot travel to all parts of the system before another change enters, cannot be globally consistent, as equilibration takes longer than this. This is the trade-off between availability and consistency.
  • As systems get faster and more distributed, ultimately the effects of Einsteinian relativity will come into play and make the notion of consistency even more complex, as each users' experience of time and distance becomes unique.

Relativity

A plague upon us! Distributed multi-threading! Parallelism! Frames of reference!

Wrapping our heads around behaviours that happen in parallel, at multiple locations, and from different available views, is hard. In science, the analysis of this is called relativity theory, and many of the great minds of philosophy have struggled with versions of it, including Galileo, Einstein and Kripke. The problem is no easier in computing, but in distributed systems we have to do it all the time.

The so-called CAP theorem was a trigger that unleashed a public discussion about this in the distributed database world, and today its name is brandished both as a badge of honour, and even as a weapon, in debates about consistency of viewpoint in distributed systems -- what Einstein called simultaneity , and what databasers call distributed consistency .

Stories about simultaneity generally end up with the same answer in these theories: simultaneity is not a helpful concept in the world, and multiple agents, players, actors, or whatever we would call them, are doomed (indeed encouraged) to see the world from unique perspectives, and be contented to disagree about its consistency. Why? Because ultimately it is the responsibility of each observer of the world to find their own consistency from their own perspective, and based on what they can observe impartially about it. Sometimes, we will be able to calculate others' perspectives from our own, but only within the limits of communication, and no faster than the rate at which messages can be sent.

There is much to be learned from thinking about relativity (and I don't mean just Einstein's famous kind), and with all the attention given to CAP lately, I think that this point of view can help us to understand what CAP does and doesn't mean in a wider range of contexts than just databases. My plan then is to visit the issues, as originally presented for databases, and then see what this means in general for distributed infrastructure.

Brewer's conjecture

Originally a conjecture made during a keynote presentation in 2000 [1,10] (see numbered references at the end), what we refer to today as the "CAP theorem" was a reaction to the developments in database clustering in the 1990s. "Brewer's conjecture", as Lynch and Gilbert called it [3], was a remark about how certain trade-offs must be made between a user's expectations of consistency and availability of information (see his later perspective on this in [2]). Lynch and Gilbert formulated a theorem inspired by it in a 2002 paper [3].

Brewer's simple observation was that hoping for a perfect world with global and consistent viewpoints on knowledge was subject to certain limitations, especially if the parts of that world were distributed in a network with potentially imperfect connectivity. This much is indisputable.

The reading list at the end of this article is well worth digesting to get a full account from a variety of perspectives. I will not add much new to these articles, but hope for a slightly more plausible formulation of the issues, using Promise Theory as my guide.

Is it a theorem?

The first question to ask is: is the CAP theorem a real theorem? Alas it is not; it remains a "truism" or a hypothesis that is unprovable. Brewer's original conjecture has not been proven with mathematical rigour -- indeed, the formulation in terms of C, A and P is too imprecise for that to happen. So the status of CAP as a theorem is something of an urban myth, but read on.

The CAP conjecture motivated something following the form of a theorem in [3], and by that we mean that -- for a given set of assumptions and definitions -- a partial proof was demonstrated supporting about 1/6th of the asserted conjecture. However, if you read [3], you will see that what is proven is not really what is stated as being the CAP conjecture. Also, as a piece of scientific work, the paper is somewhat difficult to understand today, because it builds on a very particular set of unstated assumptions, jargon and mode of thinking that was rooted in the time of its writing. Moreover, it provides no precise definitions of any of the quantities referred to, particularly the much-discussed "P" ("tolerance" of packet loss). Part of the problem is that the conjecture is about network fault tolerance, but it never models the network at all.

So, as theorems go, [3] is not high art, either in its formulations or its proof technique, and what it proves is only implicitly related to the CAP conjecture, making the paper of limited value for posterity. But all is not lost.

In spite of this, CAP itself is not useless, as it has served the ultimate purpose of science: to stimulate thought. Various authors (see reading list) have proposed modifications or simplifications of CAP that bring different insights. Many of these viewpoints reflect the search for clarity in formulating the question itself -- which is what science is for.

Perhaps more seriously, "The CAP Theorem" is often touted as a reason for making certain technology trade-offs in actual systems, and thus we expect statements about it to not misrepresent reality. It is particularly cited in current discussions about "noSQL" databases, who have challenged traditional database transaction lore, even stirring up great offense amongst unilateralists. But let's be clear, even a fully proven theorem is not necessarily a justification for doing anything, it is simply a hopefully clear statement about something that tries to establish the boundaries of truth, within a framework of assumptions. What we do with that truth, and the extent to which we hold it as an authority, is our own business. So, with all that said, let's try to figure out what it all means, and how we can respond to it.

What is it really about?

The clearest exposition of what was intended by vintage CAP seems to be in [10], and we can retrofit the idea, without worrying about proof. Suppose we hand-wave the following ideas about data across a distributed system (imagine a number of distributed agents sharing data with one another, and with different parts of the system getting data updates randomly):

  • C: consistency/uniformity of data at all locations (no one lags behind updates once they have been made somewhere within the system - there is only one value for each key globally).
  • A: availability of the data service (we can access data in a reasonable time)
  • P: partition tolerance (if we break up the system into pieces that can't communicate, it continues to work "correctly")

If you think these ideas are vague, you are not alone. They are certainly unsuitable for theorem proving. For instance, are we talking about a single observer (user) making requests and a back-end of many servers with different data, or are we talking about replicated data and multiple users? The distinction turns out to matter for other definitions. Is the user considered to be part of the system or outside of it? How is a user different from, say an aggregator (like a front-end web-tier to a distributed database). These questions actually matter if you are going to formulate something precise enough to prove. In fact, much confusion seems to arise from thinking too much in terms of interfaces (APIs perhaps) as if they were adequate representatives of the underlying system.

Availability. For instance, "A" means availability, but of what? The entire database, or just one particular part of it we need? How long should we wait for a response from a server before saying it is unavailable? What is the acceptable latency? If the whole thing goes away, then that seems to give a completely consistent view of the data (nothing). If only parts we don't need go away for a while, that would not affect an end-user, but it would be formally inconsistent. If a redundant replica (like a secondary DNS server) does not agree with the master source then it is not consistent, but it might be good enough for a while. When we think about what availability means, partial or global, we immediately wonder what consistency means.


Fig 1. Availability of all or part of the data, or agents.
What goes missing affects what we can say about consistency.

A similar problem comes up with "P", or partition tolerance. A partition is easy to define: a loss of connectivity, or some packet loss in the network. But tolerating that situation by behaving "correctly" is fraught with ambiguity -- again, the seriousness of the situation and the "correct behaviour" depend on the kind of system we are thinking of.

Horizontal partition tolerance. Suppose we have multiple data stores for load balancing, or backup (see fig 2), that are supposed to be synchronized. If an end user has access to both source and replica, but source and replica are partitioned (e.g. a firewall or bad network route), then the observer could see different data about the same keys/values. This kind of thing happens in DNS all the time, of course, and we live with it. Caching of data values anywhere leads to this issue. In a database about bank transactions, the problem is usually considered to be more serious.


Fig 2. Partition on internal consistency (can't back up or secure data redundancy).
We can still reach data, but if some systems suddenly failed, the backup would be out of sync.
The user might even be able to see both versions.

Transaction Locking or "atomicity" is the classical answer to consistency versus availability: copy-on-write (lock read/write access everywhere, update all copies, release the lock) is the usual approach to distributed data. This introduces a lot of forced unavailability of data, as distributed waiting grinds the system to a halt to brute-force global consistency.
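
As a toy illustration of that "lock everywhere, update all copies, release" pattern (an in-process sketch with threads standing in for replicas, not a real distributed lock manager), note how consistency is bought by making the data unavailable while the write propagates:

```python
# Toy "lock everywhere, update all copies, release" sketch; a single in-process Lock
# stands in for a distributed lock manager, and dicts stand in for replicas.
import threading, time

replicas = [{"balance": 100}, {"balance": 100}, {"balance": 100}]
global_lock = threading.Lock()

def write_all(key, value, propagation_delay=0.05):
    with global_lock:                      # "don't peek until I'm ready!"
        for replica in replicas:
            replica[key] = value
            time.sleep(propagation_delay)  # stand-in for network latency to each copy

def read(key):
    with global_lock:                      # readers block while a write is in flight
        return replicas[0][key]

write_all("balance", 250)
print(read("balance"), [r["balance"] for r in replicas])   # 250 everywhere, but only after waiting
```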

Vertical partition tolerance. If the partition actually prevents us from accessing any sub-part of a primary source (e.g. if we have horizontal scaling on the back end because a single server just can't contain all the data, or say multiple sources located in different countries), then we may not have access to certain keys/values. Then this kind of partition must lead to partial availability, and as long as there is no overlap between the source shards, there is also consistency within the partial system that a reader/observer/user can observe.


Fig 3. Partition on external availability (can't get complete access to data)
but the part we see can still be self-consistent.

Clearly partitioning and availability cannot be totally separate concepts either, and somewhere in here is the essence of CAP. So, at this point, let's simply cut to the chase and write down the intuitive formulation of CAP from [10], then we can retrofit meaning to C, A and P. CAP says that we can only have 2 out of 3 of these properties simultaneously, meaning there are 3 kinds of distributed system (or three phases) we can make:

  • CA without P (local consistency, no errors)

    If we can't be fault-tolerant of a partitioning (communication breakdown) in the system, then we can still have consistency and availability of the data shared by agents within each partition, by ignoring other partitions. e.g. if we put a firewall in the middle of a distributed system then we can't expect data to be consistent across this partition, but on each side agents can still share and harmonize their data values so that they are locally consistent by talking to each other. This is local rather than global consistency.

    In this configuration, we mean: local C for a partial system, 100% A for the partial system, and no P does not exclude several partitions from existing with their own internal CA. Partitioning thus means having multiple independent systems with 100% CA that do not need to interact (a trivial case).

  • CP without A (transaction locking)

    If a system is allowed to not answer requests at all (turn off "A"), we can have artificial consistency by a moratorium on change. We claim to tolerate partitioning/faults, because we simply block all responses if a partition occurs, assuming that we cannot continue to function correctly without the data on the other side of a partition. Once the partition is healed and consistency can once again be verified, we can restore availability and leave this mode.

    In this configuration, we mean: global C, and global correct behaviour in P is to block access to replica sets that are not in synch -- this requires forcibly blocking "A". CAP suggests that this "no A" affects the whole database, but it only need affect the data affected by the partition. In order to tolerate P at any time, we must be ready to dispense with A at any time for global consistency, so the use of terms P and A refer to 100% assurance, but the remedy can still be local. This is basically the transaction lock.

  • AP without C (best effort, fault tolerant)

    Finally, if we don't care about global consistency (i.e. simultaneity), then every part of the system can make available what it knows. If there is a partition, then each part might be able to answer someone, even though the system as a whole has been broken up into incommunicable regions. In that sense one can tolerate partitioning of the total system by caching and not caring about whether all data are entirely up to date.

    In this configuration, "without C" means without the assurance of global C at all times. The system might be consistent, but we cannot verify it -- we just don't know. When we claim A and P together, we have to mean partial availability within a partition. If we assume that agents can cache data values, then this might be equivalent to CA without P if agents are not completely isolated.

We see that the terms C, A and P are used to mean quite different things in each of the cases, making it very hard to use them in any proof. But the intended meaning is starting to become clearer.

What is wrong with this story?

On the surface, there seems little wrong with this at the gut level, modulo some sloppy terminology, but questions jump out at you from these 3 supposed cases. Does availability mean 100% availability at all times (to make it consistent with the idea of partitioning)? Does A take into account caching or redundancy, which further complicates the discussion of freshness or consistency of data?

At the root of the discussion of consistency is a naive view of time itself, and the absence of a model of the network infrastructure on which the whole thing depends. Availability is a question of time limits: when do we give up trying to get an answer? How long does one of the three CAP phases last? Can we go from one phase to another at some moment in time?

The entire formulation is also typically static, in the way that the world of theorems tends to be. It pretends that the system moves from one crystalline state to another, in such a way that the same perfect view can be available to everyone in the system simultaneously. This is a deeply flawed view, which is ultimately a physical impossibility because of the finite speed of information.

The definition of consistency is a problem. It makes no clear promise but has an implicit assumption of verification through continuous look-up. It is missing a model of user expectation: over what time-scale is this consistency required? A real distributed system is a messy melange of change happening at random locations at random times. Consistency requires a more sophisticated model that takes the change processes into account. Are we doing everything as fast as possible in case of sudden failure, or do we really need to look up data that fast?

The confusions arise from the lack of separation of concepts, and the absence of a model of time and space (where and when) and signal propagation. CAP is all about networks, but the one thing it doesn't have is a model of the network itself.

In other scientific theories of relativity, the punch line has always been that global consistency is impossible because of limitation on the rate of communication. The same will be true here. So at one level, the problem is that one wastes a lot of time talking about something that doesn't exist. Ultimately, in the most extreme distributed system we would literally have to deal with Einsteinian relativity, and we would find that the CAP model of consistency is literally impossible to achieve.

A promise theory view of the relativity of knowledge

Let's start again and try to get a little clarity around the concepts. My tool of choice is Promise Theory, which is a framework I developed originally for describing CFEngine (a system that has to deal with all of these issues), and which handles CAP-like questions simply and clearly. Promise Theory [12] was introduced to offer an atomic theory of distributed systems, making only the most elemental assumptions about agents. (For a more complete discussion of distributed data and consistency in promise theory, see [13].)

Without getting too technical, let's try to see if we can define C, A and P somewhat better, taking into account time and space. Whether or not a system is available or consistent, etc, is an assessment that any agent within a distributed system should be able to make based on the evidence it can observe. Consistency will be about whether agents (can) make the same assessments. Considering different observational viewpoints brings us back to relativity, and promise theory has a few principles to help us.

  • An agent can only make promises about its own behaviour.
  • It can make promises about its behaviour that rely on other agents' cooperation, conditional on them keeping their promises.
  • Each agent knows only its own state, and any information it promises to assimilate from other agents, which in turn promise to share their own state with it.
  • All promises are statements of best effort and represent the best outcome of the system, but do not offer any guarantees.

To use promise theory, we try to construct a promise for some desired property without breaking these rules. Availability is fundamental to being able to measure the state of data at different locations, so let's start with that.

  • A is for Availability

    To make a promise theory model, we need to include all the agents that play a role in providing access to data. We can keep this simple and take the following agents, each keeping `service' promises in this interaction.

    1. C: A client who will make a request for data/state and give up waiting for a response after a time interval Δt_max. This limit is the policy that distinguishes available from unavailable.
    2. S: A server that will answer that request at a time of its choosing.
    3. N: The network that promises to transport the data in between, at a time and rate of its choosing.
    4. O: An observer to whom the answer would be reported by the client (as a final arbiter).

    Each of these agents can only make promises about their own behaviour. See my notes in the figure below about how to construct this interaction.


    From my notebook: How to define availability with promise theory.
    What independent agents can offer and channel for others.
    Availability can, at best, be promised with a time delay.

    What we get out of this analysis is the following conclusion. An agent S is said to be available to another agent C, iff C receives a reply to its request from S within a finite amount of time Δt < Δt_max. Notice that, in accordance with promise theory, this definition is purely from the observable viewpoint of the agent C and so makes no assumptions about what happens outside of C. It takes the promises made by other agents under advisement and makes its own promises conditionally on the assumption that they will try to keep their promises. The definition has to refer to time, as this is the only way to resolve whether a reply has been received or not. At what point do we stop waiting to know the answer? C must have some kind of internal clock to make this assessment, else we have no way of measuring availability. (A small code sketch of this timeout-based assessment follows the A/C/P discussion below.)

    In this example, I added the observer as a separate agent to show the possibility of caching information from S locally at C. The observer sees cached information as available, because it replies quickly, but we then have to worry about whether the cache is consistent with the value of the remote data. We see easily that availability and consistency need each other in different ways at different places and times -- it does not make sense to talk about the availability of the whole system, but only of the agents individually. Availability of S is needed to establish consistency of data at C, and availability of C has to be curtailed to guarantee that the observer cannot see inconsistent values between the cache and the value retrieved from S.

    According to the rules of promise theory, we would model the smallest part of the system that can change independently as a separate agent that makes individual promises. Thus, for a database, this tells us that the availability promise applies to each data value represented here by S independently, not the `whole database' interface, represented by C . So PT tells us to lock individual data -- we just rediscovered transaction locking.

    In conclusion: yes, we can define availability, in a way conducive to talking about distributed data. To do so, we need to refer to some form of independent real time, distinct from transaction counting.

  • C is for Consistency

    The intended meaning of consistency in CAP seems to be the following. If information or knowledge K is recorded as having some value K → V, at some global time t_1, by any agent in a distributed system, then at any later time t_2 > t_1, all agents will see K → V (both the fact that it exists and has a fixed value).

    Notice that to promise consistency, any agent needs to agree about the global time at which changes occur on another agent, i.e. they need to be able to agree about arbitrary points in time, or effectively have a common clock.

    If there is some number of users (consumers) of data C, and a single provider S, as in the availability discussion above, this seems easy (but read on). The provider S promises to provide the latest value, or nothing. Then consistency is trivially achieved; there is only one value at a time, which is V or "failed" in the case of a fault. If multiple processes are changing the data in parallel, transaction locking at S serializes them. All the promises we need to make this happen were made in the analysis above.

    But now, suppose that we have several users looking at the same information, but with different latencies. Suddenly information is propagated to different locations at different rates, and the global view is lost. Global time does not help us at all because the finite speed of information-transfer leads to inevitable delays. No one can compare their results at any theoretical "same time". At best they could agree to compare results averaged over a coarse grain of time, or a time interval "bucket".

    To regain a global view, we can, of course, wait again until all the parts catch up. This is a process we can call equilibration, as we are most familiar with in thermodynamics. To reach equilibrium, all N agents in a system need to coordinate and agree on the same data values. The final state is then called a state of maximum entropy, meaning that the same average state is maximally distributed -- and it is this that CAP wants to have (see my notes in the figure below). This takes time during which the data have to converge to a static value. If any other outside data are coming in causing confusion, this would interfere with the harmonization and make the result non-deterministic.


    From my notebook: Consistency is not automatic - it depends on where/when.
    Distributed consistency takes time, it is a thermodynamics equilibrium.

    However, this determinism comes at a price. By forcing any other change to wait for equilibration of the existing store, we distort the system's record of the time at which data are allowed to enter, because we are blocking and forcing things that want to be parallel into a serial stream. If the data represent a quickly changing source, then the meaning of the data is utterly compromised by that waiting.

    Just as availability could only be promised by a recipient to an observer, conditionally on matters beyond its control, so consistency (being fundamentally a stateful function of availability) can only be promised to an observer -- in an aggregate view -- subject to a trust in other agents' promises to correctly report on all the copies of the data around the network; and this is further subject to a minimum delay inherent in the finite speed of information transfer. In other words, the best promise we can make is: if you wait t seconds, I promise the data can be uniform, assuming that all agents honoured their promises to block further changes. In this sense, all consistency has to be "eventual" (Re: ACID/BASE) in the sense that a delay must be introduced somewhere.

    Banks do this deliberately, of course, when they cease trading at the end of each day and force customers to wait three working days for transaction verification, in spite of modern communications. In some circumstances, one might place this determinism ahead of accuracy of timing, but there is a trade-off here too. However we choose to define consistency, we have to deal with distortion of the history of data and/or artificial slow-down in capture performance.

    There is one final point that I have never seen discussed: why are the end users not expected to have consistent knowledge too? If we really care about consistency, then what about the consumers of information from the distributed data? Do we put a lot of effort into making all the data in a database be the same at the same moment in time and then not control the time at which users look for answers?

    If one user looks up data earlier than another, or from a greater distance than another user, the latency can easily lead to them reading different values. The technology does nothing to help them resolve this issue (if they actually care). They are on their own. Is this itself not an inconsistency in the insistence on consistency?

    If your head is now spinning, then you are not alone. Relativity is hard. Even the Nobel committee refused to award Einstein for his work on this subject as it was too controversial at the time.

    In summary:

    • Consistency does not happen by itself, because the finite speed of information makes different values current at different times and places.
    • Data values must come to equilibrium by cross checking the values at different locations.
    • We have to deny access to the entire system until all values are trusted or verified to be the same. This requires us to choose an arbitrary boundary at which we stop caring.
    • What is the entire system? Does it include the end users?
    Git
    An example that illustrates the issues nicely is the distributed versioning system git where the full distributed system is actually intended to be inconsistent . Different branches or "partitions" have very different data. There is no sense in which a user could look at an arbitrary user's repository and expect to see a consistent view of the whole. But one could imagine that, if changes ever stopped happening, all the different versions could be merged into a single state of equilibrium. Whether or not this happens, it is perfectly possible for users to have a useful and locally consistent data experience without being forced into global consistency.
    CFEngine
    CFEngine promises to provide samples of distributed systems with 5 minute resolution at all times. Since the latency of the network cannot be determined uniformly, all data values are stored with their "last seen" time and location, indicating when they arrived within the local view and from which IP address/context.

    The conclusion is that consistent use of the data is entirely up to the end-user no matter what happens inside the distributed system. Since the end-user has to take responsibility for the data anyway, this surely means that actual use of the data cannot be strongly affected by whether there is complete distributed consistency within the distributed database, because it is only what happens inside the end-user that matters. The end user's experience depends only on the specific history of observations it has experienced, not on the global state. This is basically the argument made by Rich Hickey in [11] and Nathan Marz [8]. As long as we label data with their spacetime coordinates (location and sample time), a consistent computation can always be carried out. This is a move from single values to tuples that label values with space-time context.

    V → (t, x, V)

    This is exactly what Einstein had to do to make sense out of high speed travel. This is called a frame of reference . The strategies for maintaining consistency (copy on write, read/write locking, atomicity, etc) are all about setting a boundary between The System and The User, i.e. about handing over responsibility. But The User cannot be separated from The System if we are serious about data consistency in the system, and if we do not need strong consistency then these locking mechanisms merely slow down a dynamic system pointlessly in order to equilibrate it.
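
    A small sketch of what this labelling buys the consumer (names illustrative): once every value carries its own (t, x) label, an end user can compute a deterministic answer purely from the history of observations it has actually seen, without appealing to a global notion of "now".

```typescript
// An observation labelled with its spacetime context: V -> (t, x, V).
type Observation<T> = { t: number; x: string; value: T };

// Each end user keeps only the history of observations it has experienced.
class Frame<T> {
  private log: Observation<T>[] = [];

  observe(obs: Observation<T>): void {
    this.log.push(obs);
  }

  // A deterministic answer *within this frame*: the newest value it has seen,
  // judged by the sample times on the labels, not by any global clock.
  latest(): Observation<T> | undefined {
    return this.log.reduce<Observation<T> | undefined>(
      (best, o) => (best === undefined || o.t > best.t ? o : best),
      undefined
    );
  }
}
```

    Two frames that have seen different histories will give different answers, and both are internally consistent -- which is exactly the point.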

    This suggests that the ACID versus BASE debate is no debate at all. It is a question of the arbitrary placement of a line or, dare I say, a partition. It is just a question of where the limits of the system are defined -- it all depends on the user.

  • P is for Partition-Tolerance

    "P" is the vaguest of the properties in the CAP literature. The meaning of a partition is clear enough, but what is partition tolerance? Well, it refers to what agents promise to do in the event that they need to rely on information that is located in a different partition (i.e. someone they can't talk to). An agent may be said to be partition tolerant if it promises to deliver a "correct" response to a query within a finite amount of time Δt < Δt_max, even if its response depends on information from another agent in a different partition.

    Although this sounds very ad hoc, it is theoretically possible to define a correct response for chosen regions of a distributed system, for every conceivable scenario, as a piecewise function, and get agents to keep such promises. However, these are of course dependent on the definitions of "C" and "A", and the work required to do this is not useful. If we follow a promise theory perspective, or equivalently that of [11], then we have a description where the concept of partitions is irrelevant. Then the only useful thing we can say is that partition tolerance cannot be defined for push-based systems at all, because there is no way to know whether equilibration happened.
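
    One way to picture the bounded-time promise Δt < Δt_max described above is an agent that races a cross-partition call against a deadline and falls back to its last locally seen value when the other partition cannot be reached in time. A sketch, where fetchRemote stands in for any cross-partition query:

```typescript
// Illustrative sketch: keep the availability promise within maxDelayMs,
// at the cost of possibly returning a stale (cached) value.
async function answerWithinDeadline<T>(
  fetchRemote: () => Promise<T>,
  cached: T | undefined,
  maxDelayMs: number
): Promise<{ value: T | undefined; authoritative: boolean }> {
  const deadline = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error("deadline exceeded")), maxDelayMs)
  );
  try {
    const value = await Promise.race([fetchRemote(), deadline]);
    return { value, authoritative: true };
  } catch {
    // The promise to answer within the deadline is kept; whether the answer
    // is "correct" is exactly the definitional problem discussed above.
    return { value: cached, authoritative: false };
  }
}
```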

Where are we?

The CAP theorem claims that a distributed system can have two out of three of these properties, but not all of them at the same time. In other words, according to the theorem, we can have CA, CP, and AP systems.

We can define "A" and, to some extent, "C", but we also see that the concepts as discussed for databases are somewhat ad hoc; what is more interesting is how one deals with the issue of end-user viewpoints. Hickey [11] has pointed out that, if you want to model causation from a consistent viewpoint, without needing to stop the world to do it, a simple model is to build a private spanning tree of the causal process [8]. This was recently used as an approach to `beating the CAP theorem' [8]. And, of course, it is exactly what git does.

What these examples say is that the premise on which CAP is built is itself the problem.

Time and the many worlds interpretation

"If you have one clock, you know what the time is. If you have two, you are not sure."

Anonymous

I used the mundane view of time in these definitions to make them workable. As pointed out in [3], agents in a distributed system do not necessarily have a knowledge of a global time; and [3] also discussed whether local time is available. In practice there are no systems without an internal clock (Ethernet, MPLS, ATM, etc need a clock, for example), so one can always measure time intervals Δt needed for "A".

To achieve "C", we said that agents need to be able to coordinate global state by equilibration. This does not require a notion of global time. Each agent only needs its own local time to make consistent computations -- but once users are involved, the collaboration of multiple users around the distributed system requires time labels to make sense to everyone and be comparable.

Agents cannot use the arrival of transactions themselves as the measure of time, because this is not independently predictable, so there must be a separate clock.

What information systems need to capture is the law of cause and effect: causes precede effects in a given time-line. The idea here is that the future is a function of the past. The laws of physics are actually formulated in this way (there is a subtlety about functional recursion here, which I will not go into). This does not imply that there must be only one time-line, however.

The End User is responsible

Now you might ask; why do we need to equilibrate all the time? Not all changes to data affect all users and processes at the same time, as different users are interested in different parts of the distributed data -- and this is absolutely right. Version control systems like git , as well as techniques like A/B testing for continuous deployment, make explicit use of this, and create private copies of reality for each user so that it doesn't matter that different users will see different versions. So why all the fuss about consistency at all? Since the end user has ultimate responsibility for using the data ("at their own risk") surely what is important is to make a predictable environment for each user to enable him or her to deal with the information available in whatever way suits them best?

This is essentially the argument in [11,8], and this mode of thinking has a long history (no pun intended) in science. By assigning each computation its own private context, the world diverges into a so-called Many Worlds interpretation, as championed by Kripke in logic and by Everett in physics. These worlds need never meet, but, when needed, they can be re-merged into one another by the process of equilibration.

Maintenance is equilibration

If our distributed data actually refers to our IT infrastructure, which is the case with configuration management, then all the same things apply. It would be a mistake to think that CAP is only about SQL/noSQL databases. The difference between a data store and an infrastructure is that infrastructure is designed to have certain characteristics over time. Random accumulations of inputs are not. That means we can formulate a policy for state, which is part of the design.

What the CAP discussion tells us here boils down to what I would call Maintenance and Convergence. These are the properties of configuration management that lead to equilibration of relative state.

Policy is also data about the system, but at a meta-level. It can be used to maintain the stability of raw data files that govern the behaviour of the infrastructure, thus in turn equilibrating the behaviour. Its purpose is to represent consistency from a slowly varying perspective. We can separate out a predictable part that we can rely on, and then automatic processes can run in real time to counteract noise in the system. This is what CFEngine was designed to do, and the speed of operation is growing more mission-critical by the day.

Somewhere along the way, as pushing out releases became more frequent, we lost sight of this need for continuous maintenance. But pushing releases alone is not enough to maintain host equilibrium or consistency over time.

Promise theory tells us that it is not real or virtual hosts that we should think of as the smallest unit of infrastructure but each data and program resource individually. Thus to make several of these work together on a host, they need to equilibrate. That is why regular maintenance is needed to cope with change.

Mission criticality, even over short intervals, is becoming increasingly desirable for success in a high-speed world. Then CAP says that the maintenance or local correctness of your infrastructure depends on the configuration repair system being online and available within t_max. This, for instance, is why CFEngine went from examining its repair schedule every day in 1993, to every hour, to every 15 minutes and today every five minutes by default. Some users are known to execute configuration repairs every minute.
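
A toy version of this maintenance loop (the state model and the five-minute default are illustrative) makes the idea of convergence concrete: on every run, compare the actual state against the desired policy state and repair any drift.

```typescript
type State = Record<string, string>;

// One convergence pass: pull every drifted value back to the policy value.
function converge(actual: State, desired: State): string[] {
  const repairs: string[] = [];
  for (const [key, want] of Object.entries(desired)) {
    if (actual[key] !== want) {
      actual[key] = want;
      repairs.push(`repaired ${key} -> ${want}`);
    }
  }
  return repairs;
}

// Re-examine the system on a fixed schedule, e.g. every five minutes.
function startMaintenance(actual: State, desired: State, intervalMs = 5 * 60 * 1000) {
  return setInterval(() => {
    const repairs = converge(actual, desired);
    if (repairs.length > 0) console.log(repairs.join("\n"));
  }, intervalMs);
}
```

The shorter the interval, the smaller the window during which drift can accumulate unnoticed.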

Continuous deployment

In continuous deployments, either in web operations or software releases, rapid change is most naturally handled by allowing different branches of reality to fork off and co-exist until the right time to equilibrate them. If these changes did not split off into their own reality, if we tried to bring everything into a single time-line immediately, that single time line would be completely unpredictable and unusable for everyone. A globally consistent view had better not change too quickly. This is the thinking behind git branches, A/B testing, staging environments, and so on.

How quickly does infrastructure need to respond? Deployments or changes are not truly continuous in the mathematical sense of "functions over the reals", of course. They happen over intervals of time that are quite long compared to the rate at which data can change in our information systems. From 2000 to 2003 I spent some time examining this idea of data resolution in system administration and developed the models, using CFEngine 2 as the proof of concept, to balance stability (equilibrium) and change. The result is CFEngine's 5-minute model of locally consistent viewpoints that are then reevaluated on each run to adapt to changes in the environment. Users who don't run configuration checks every five minutes risk losing consistency due to random changes from users and other systems like package managers. Again, we have to acknowledge that infrastructure is not a transactionally safe system that maintains integrity by itself [15].

Probability of consistency: Push and pull

When engineers still believed that computers were transactional and deterministic, the traditional way of updating them was to use push-based methods like rdist or some kind of package update from a central source. Thanks to CFEngine's investigations in the 1990s, most configuration management systems today use pull-based approaches to management so that availability and consistency can be measured.

If we only "throw things over the wall", then every push creates a system partition -- where we cannot observe what goes on on the other side. This is the ultimate parallel universe scenario where the worlds never come back together.

If CAP says anything about operations, it is this: don't just push data to hosts from a central source: no network shells or distribution mechanisms. The end user client needs to control its own reality in order to have a consistent view of the world even for itself. For there to be global consistency requires a lot more than that.

What did we learn?

The other terror that scares us from self-trust is our consistency; a reverence for our past act or word because the eyes of others have no other data for computing our orbit than our past acts, and we are loath to disappoint them ... A foolish consistency is the hobgoblin of little minds...
--Ralph Waldo Emerson

In the first part of this essay, I have tried to pick apart the original intention of the CAP conjecture and clarify what it can and cannot mean for distributed systems. I have also tried to point out some of the inconsistencies in the idea of consistency itself, and challenge you with the question: how important is consistency to you and over what time scale?

I am keen to underline that CAP is not a theorem, because there is a real danger when rumours and truisms assume the authority of truths about the world, and are used uncritically, often for `techno-political' purposes. The authority in the term "theorem" should not be abused to justify certain viewpoints that can easily be justified by more careful thinking. And a good theorem should be inescapable and basically self-evident. There are indeed truths about systems that can be proven, but CAP is too vague for that.

The original conjecture pointed to three modes of operation which were labelled:

  • CA without P,
  • CP without A,
  • AP without C.

While I showed that these cannot cover the full story of distributed systems, in the vague meanings that were originally intended, they do still represent three identifiable modes of operation for distributed systems that illuminate valid trade-offs. Thus we can still use them as points of reference to discuss systems, even beyond simple databases (which is my original intention with this essay). In addition, there is the deeper question of user relativity that is the root cause of inconsistency.

Consistency is often presented as a property for databases or services, but it also applies as an emergent property for other distributed systems. Database traditionalists, from the privileged pedestal of a tightly controlled environment, will probably roll their eyes at the idea of emergent properties. There is still a hard core in computing that believes in the possibility of absolute deterministic control [14]. But as distributed systems and their infrastructure become larger and faster, we need to get out of that delusion as quickly as possible. Relativity -- i.e. the difference in viewpoints resulting from separation in time and space between user and data -- makes consistency more complicated than the simple stories of the past, and as the rate of change in large IT increases, the effects of relativity become more of a headache.

This first part of my essay only scratches the surface of what consistency and availability mean to distributed systems, but now that we have some of the background on record, for the second part I want to think about a larger issue of what this means for the greatest distributed system of them all: our IT infrastructure.

[PART 2]

Acknowledgment

I am grateful to John Allspaw, John Willis and Diego Zamboni for helpful remarks on a draft version of this article. They are, of course, absolved of any blame regarding the content. I would also like to thank Ted Zlatanov for suggesting the topic.

Valuable reading on CAP theorem

  1. Eric Brewer, Towards Robust Distributed Systems
  2. Eric Brewer, CAP Twelve Years Later: How the "Rules" Have Changed
  3. Nancy Lynch and Seth Gilbert, Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services, ACM SIGACT News, Volume 33, Issue 2 (2002), pp. 51-59.
  4. Daniel Abadi, Problems with CAP, and Yahoo's little known NoSQL system
  5. Michael Stonebraker, Errors in Database Systems, Eventual Consistency, and the CAP Theorem
  6. Henry Robinson, CAP Confusion: Problems with "partition tolerance"
  7. Coda Hale, You Can't Sacrifice Partition Tolerance
  8. Nathan Marz, How to beat the CAP theorem
  9. NoCAP - Part II: Availability and Partition tolerance
  10. Armando Fox and Eric Brewer, Harvest, Yield and Scalable Tolerant Systems (1999)
  11. Rich Hickey, Are we there yet? (2009)
  12. M. Burgess and J. Bergstra, A Foundational Theory of Promises (2004-)
  13. D. Aredo, M. Burgess and S. Hagen, A Promise Theory View on the Policies of Object Orientation and the Service Oriented Architecture (2005-2006)
  14. M. Burgess, Three myths holding system administration back...
  15. M. Burgess and A. Couch, Proof that rollback is a "total" fiction.

Lessons on Scale Organizing from Siemens Workers United

OrganizingUp
convergencemag.com
2025-12-10 23:15:30
Workers at the Siemens Mobility manufacturing plant in Sacramento, California lost their March 2025 bid to form a union by a vote of 838 to 538. While the results dashed our hopes to transform what it is like to work at Siemens, this joint campaign between International Brotherhood of Electrical Wor...

Using edge detection to preserve significant features while downsampling

Lobsters
yogthos.net
2025-12-10 23:10:02
Most pixelation libraries take the lazy route where they just downscale the image and then upscale it back with nearest neighbor interpolation. It's fast but the results are usually pretty messy because the grid just cuts right through important features like faces or distinct edges. You lose a lot ...
Original Article

[Demo images: Original Image, Pixelated Image, With Projection]

How to use: Upload an image, adjust the pixel size (up to 128) and color limit sliders, then click "Pixelate Image" to see the result. Enable "Edge-Aware Pixelation" for adaptive grid optimization - you can control how many optimization iterations the algorithm performs and adjust edge sharpness (0-1, default: 0.8 for strong edges). Enable "Apply Rotation" to see a projective transformation example.

Rubio orders return to Times New Roman font over 'wasteful' Calibri

Hacker News
www.bbc.com
2025-12-10 22:10:25
Comments...
Original Article

US Secretary of State Marco Rubio has ordered diplomats to return to using Times New Roman font instead of Calibri, reversing a change made under the Biden administration.

Rubio's predecessor Antony Blinken had adopted Calibri in 2023, saying it was more accessible for people with visual disabilities. But Rubio said this was a "wasteful" diversity move and that Times New Roman was "more formal and professional".

The new changes go into effect on 10 December, and apply to both external and internal documents.

Lucas de Groot, the Dutch designer who created the Calibri typeface, told BBC Newshour the change was both "sad and hilarious".

"Calibri was designed to facilitate reading on modern computer screens - it was chosen to replace TNR - the typeface that Rubio wants to go back to now," Mr de Groot said.

A state department spokesperson told the BBC the change to Times New Roman aligns with President Donald Trump's mission to "present a unified, professional voice in all communications".

"Aligning the (state) department's practice with this standard ensures our communications reflect the same dignity, consistency, and formality expected in official government correspondence," the spokesperson said.

Times New Roman is a serif font, which means it has small lines that stem from the ends of the letters. Courts, legislatures, and other agencies typically use the more formal-appearing font. Calibri is a sans serif font, without those lines, and is considered easier to read on screens, especially for those with vision or reading impairments.

In his order on Tuesday requiring diplomats to return to Times New Roman, Rubio called Blinken's decision to use Calibri a "wasteful" diversity initiative, according to an internal department cable seen by Reuters.

The Trump administration has made several changes to how government works in the last 11 months, aiming to eliminate diversity, equity and inclusivity initiatives.

Most recently, the Trump administration announced it would drop Martin Luther King Jr's birthday and Juneteenth - two federal holidays honouring black history - as free admission days to national parks. Instead, visitors will be given free entry on President Donald Trump's birthday, which coincides with Flag Day.

New DroidLock malware locks Android devices and demands a ransom

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 21:53:12
A new Android malware called DroidLock has emerged with capabilities to lock screens for ransom payments, erase data, access text messages, call logs, contacts, and audio data. [...]...
Original Article

New DroidLock malware locks Android devices and demands a ransom

A newly discovered Android malware dubbed DroidLock can lock victims’ screens for ransom and access text messages, call logs, contacts, audio recordings, or even erase data.

DroidLock allows its operator to take complete control of the device via the VNC sharing system and can steal the device lock pattern by placing an overlay on the screen.

According to researchers at mobile security company Zimperium, the malware targets Spanish-speaking users and is distributed through malicious websites promoting fake applications that impersonate legitimate packages.

In a report today, Zimperium says that the "infection starts with a dropper that deceives the user into installing the secondary payload that contains the actual malware."

Loader app (top) and DroidLock app (bottom)
Source: Zimperium

The malicious apps introduce the main payload via an update request and then ask for Device Admin and Accessibility Services permissions, which let the malware perform fraudulent activities.

Some of the actions it can take are wiping the device, locking it, changing the PIN, password, or biometric data to prevent the user from accessing the device.

Zimperium's analysis discovered that DroidLock supports 15 commands that let it send notifications, place an overlay on the screen, mute the device, reset it to factory settings, start the camera, or uninstall apps.

Commands supported by DroidLock
Source: Zimperium

The ransomware overlay is served via WebView immediately after the corresponding command is received, instructing the victim to contact the threat actor at a Proton email address. If the user does not pay a ransom in 24 hours, the actor threatens to permanently destroy the files.

DroidLock's ransom overlay
Source: Zimperium

Zimperium clarifies that DroidLock does not encrypt files, but by threatening to destroy them unless a ransom is paid, the same purpose is achieved. Additionally, the threat actor can deny access to the device by changing the lock code.

DroidLock can steal the lock pattern through another overlay loaded from the malicious APK's assets. When the user draws the pattern on the cloned interface, it is sent directly to the attacker. The purpose of this feature is to allow remote access to the device through VNC at idle times.

Being a member of Google’s App Defense Alliance, Zimperium shares new malware findings with the Android security team, so Play Protect detects and blocks this threat from up-to-date devices.

Android users are advised not to side-load APKs from outside Google Play unless the publisher is a trusted source. They should always check if the permissions required by an app serve its purposes, and periodically scan their device with Play Protect.


New Casino, Same Mets

hellgate
hellgatenyc.com
2025-12-10 21:48:06
You really thought this was all about a World Series, didn't you, you absolute FOOL?...
Original Article

Some curious things are happening at the New York Mets.

Earlier this week, star closer and beloved man of the trumpet Edwin Diaz, announced he was signing a $69 million contract with the Los Angeles Dodgers.

On Wednesday, Pete Alonso, the longest-tenured Met, arguably the heart and soul of the team, announced he would be taking his talents to the Orioles (THE 75-87 BALTIMORE ORIOLES), instead of staying a Met. It's reported the Mets didn't even make him an offer . For Diaz, the Mets had apparently bid $3 million less than the Dodgers.

This kind of pointless penny-pinching combined with a heart-breaking gut renovation might be expected of a team not owned by a man worth $20 billion who, almost one year ago to the day, signed Juan Soto to an MLB-record $765 million deal.

Is this the same Steve Cohen who said in 2020 after he bought the team, "If I don't win a World Series in the next three to five years, I would consider that slightly disappointing"? That Steve Cohen? What has changed between last December, and right now?


When Would You Ever Want Bubblesort?

Hacker News
buttondown.com
2025-12-10 21:45:11
Comments...
Original Article

There are very few universal rules in software engineering, but there are a lot of near-universal principles. A principle like "prefer composition to inheritance" is near-universal. I love finding the rare situations where these principles don't hold, like where you do want inheritance over composition. A similar near-universal principle is "don't use bubblesort". Some would even say it's a universal rule, with Donald Knuth writing "bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems".1 But Knuth's been wrong before, so let's see if this universal rule is only near-universal.

Theoretically, bubblesort is faster than quick or mergesort for small arrays. This makes it useful as part of a larger sorting strategy: most of the fast-in-principle sorting algorithms work by recursively sorting subpartitions of an array, i.e. if you apply quicksort to 2^20 random integers, at some point you're sorting 2^17 8-integer subpartitions. Switching over to bubblesort for those subpartitions would be a nice optimization.

Many production sorting algorithms do use a hybrid approach, but they overwhelmingly use insertion sort instead. Insertion sort is very fast for small arrays and it's also better at using the hardware. On some very particular hardware bubblesort still ends up better, like in this NVIDIA study, but you probably don't have that hardware.

So that's one use-case, albeit one still dominated by a different algorithm. It's interesting that NVIDIA used it here, because gamedev has a situation to which bubblesort is uniquely suited, based on two of its properties:

  1. While the algorithm is very slow overall, each individual step is very fast and easily suspendable.
  2. Each swap leaves the array more ordered than it was before. Other sorts can move values away from their final positions in intermediate stages.

This makes it really good when you want to do a fixed amount of sorting work per frame. Say you have a bunch of objects on a screen, where some objects can occlude others. You want to render the objects closest to the camera first because then you can determine which objects they hide, and then save time rendering those objects. There's no correctness cost for rendering objects out of order, just a potential performance cost. So while your array doesn't need to be ordered, the more ordered it is the happier you are. But you also can't spend too much time running a sorting algorithm, because you have a pretty strict realtime constraint. Bubble sort works pretty well here. You can run it a little bit at a time each frame and get a better ordering than when you started.
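
Here's a rough sketch of that per-frame idea (the depth field and the render-loop call are hypothetical stand-ins):

```typescript
// One bubblesort pass: bounded work, and the array is never less ordered afterwards.
function bubblePass<T>(items: T[], before: (a: T, b: T) => boolean): boolean {
  let swapped = false;
  for (let i = 0; i + 1 < items.length; i++) {
    if (before(items[i + 1], items[i])) {
      [items[i], items[i + 1]] = [items[i + 1], items[i]];
      swapped = true;
    }
  }
  return swapped; // false once the draw order has fully settled
}

// In a render loop: one pass per frame, then draw in (approximately) front-to-back order.
// bubblePass(objects, (a, b) => a.depth < b.depth);
```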

That reminds me of one last use-case I've heard, apocryphally. Let's say you have a random collection of randomly-colored particles, and you want to animate them sorting into a rainbow spectrum. If you make each frame of the animation one pass of bubblesort, the particles will all move smoothly into the right positions. I couldn't find any examples in the wild, so with the help of GPT4 I hammered out a crappy visualization. Code is here , put it here .

(After doing that I suspect this isn't actually done in practice, in favor of running a better sort to calculate each particle's final displacement and then animating each particle's movement directly, instead of waiting to move for each bubblesort pass. I haven't mocked out an example but I think that'd look a lot smoother.)

So there you go, three niche use cases for bubblesort. You'll probably never need it.


New Quanta Article!

Okay so I didn't actually write this one, but I played a role in it happening! A while back a friend visited, and we were chatting about his job at Quanta. At the time he was working on this mammoth article on metacomplexity theory, so naturally the topic of problems harder than NP-complete came up and I recommended he check out Petri net reachability. So he did, and then he wrote An Easy-Sounding Problem Yields Numbers Too Big for Our Universe. Gosh this is so exciting!

U.S. Realizes It Can Seize Boats After All

Intercept
theintercept.com
2025-12-10 21:19:25
After months of extrajudicial killings in the waters off Venezuela, the Trump administration opted instead to capture an oil tanker. The post U.S. Realizes It Can Seize Boats After All appeared first on The Intercept....
Original Article

U.S. forces seized an oil tanker off the coast of Venezuela on Wednesday, two government sources familiar with the matter told The Intercept. President Donald Trump called the boat the “ largest one ever seized .”

The capture comes after the U.S. military spent the past three months conducting air strikes in the region that have destroyed at least 22 boats, killing at least 87 civilians .

The U.S. government has not yet explained its justification for capturing the Venezuelan vessel.

The two government sources said the operation was led by the U.S. Coast Guard. “We would refer you to the White House for questions,” Lt. Krystal Wolfe, a Coast Guard spokesperson, told The Intercept in response to questions.

“We don’t have a comment,” said a Pentagon spokeswoman, who also referred questions to the White House.

The White House did not immediately respond to a request for comment.

“It appears they’re now aiming to further tighten the economic noose, regardless of its impact on civilians, in pursuit of their regime change goal.”

While the U.S. once bought much of Venezuela’s oil , that trade was halted in 2019 when the first Trump administration imposed sanctions on the country’s state-owned oil company. While shipments to the United States resumed in 2023 , most of Venezuela’s oil is now exported to China . The U.S. has also imposed financial sanctions on the Venezuelan government.

“Congress and the international community should consider this as an illegal act of war, in the legal sense as well as for the surge in poverty and violence it could cause,” Erik Sperling of Just Foreign Policy, an advocacy group critical of mainstream Washington foreign policy, told The Intercept. “The Trump administration’s indiscriminate sanctions have increased hunger across the population but have failed to topple the government. It appears they’re now aiming to further tighten the economic noose, regardless of its impact on civilians, in pursuit of their regime change goal.”

The capture comes as the Pentagon has built up a force of more than 15,000 troops in the Caribbean since the summer — the largest naval flotilla in the region since the Cold War. That contingent now includes 5,000 sailors aboard the USS Gerald R. Ford, the Navy’s newest and most powerful aircraft carrier, which has more than 75 attack, surveillance, and support aircraft.

As part of a campaign of air strikes on boats, the Trump administration has secretly declared that it is engaged in a “non-international armed conflict” with 24 cartels, gangs, and armed groups including Cártel de los Soles, which the U.S. claims is “headed by Nicolás Maduro and other high-ranking Venezuelan individuals,” despite little evidence that such a group exists . Experts and insiders see this as part of a plan for regime change in Venezuela that stretches back to Trump’s first term . Maduro, the president of Venezuela, denies that he heads a cartel.

Since the attacks began, experts in the laws of war and members of Congress, from both parties , have said the strikes are illegal extrajudicial killings because the military is not permitted to deliberately target civilians — even suspected criminals — who do not pose an imminent threat of violence.

Trump has pursued an abrasive and interventionist foreign policy in the Western Hemisphere during his second term. "[W]e will assert and enforce a 'Trump Corollary' to the Monroe Doctrine," reads the recently released U.S. National Security Strategy. It harkens back to President Theodore Roosevelt's turn-of-the-20th-century "Big Stick" corollary to the Monroe Doctrine.

President James Monroe’s 1823 announcement warned the nations of Europe that the United States would not permit the establishment of new colonies in the Americas. Roosevelt’s more muscular decree held that Washington had the right to interfere in the internal affairs of countries across the Americas. In the first quarter of the 20th century, that Roosevelt corollary would be used to justify U.S. occupations of Cuba, the Dominican Republic, Haiti, Honduras, and Nicaragua .

What’s been called the “ Donroe Doctrine ” began to take shape with threats to seize the Panama Canal, acquire Greenland, and rename the Gulf of Mexico as the Gulf of America. The Trump administration also claimed the Venezuelan gang Tren de Aragua had invaded the United States, allowing the government to use the 1798 Alien Enemies Act to fast-track deportation of people it says belong to the gang. The 5th U.S. Circuit Court of Appeals eventually blocked the government from using the war-time law. “We conclude that the findings do not support that an invasion or a predatory incursion has occurred,” wrote Judge Leslie Southwick.

More recently, Trump even claimed that U.S. troops engaged in combat with members of the gang on the streets of Washington, D.C. during the summer or early fall – an apparent fiction that the White House press office refuses to address.

While the Trump administration claims that Tren de Aragua is acting as “a de facto arm of” Maduro’s government, the Office of the Director of National Intelligence determined earlier this year that the “Maduro regime probably does not have a policy of cooperating with TDA and is not directing TDA movement to and operations in the United States.”

The U.S. also maintains that Tren de Aragua is both engaging in irregular warfare against the United States and that it is in a non-international armed conflict with the United States. These are, however, mutually exclusive designations which cannot occur simultaneously.

Trump also renewed long-running efforts , which failed during his first term , to topple Maduro’s government. Maduro and several close allies were indicted in a New York federal court in 2020 on federal charges of narco-terrorism and conspiracy to import cocaine. Earlier this year, the U.S. doubled its reward for information leading to Maduro’s arrest to $50 million. Meanwhile, Trump pardoned Juan Orlando Hernández, the right-wing former president of Honduras who had been convicted of drug trafficking.

Trump recently told Politico that Maduro’s “days are numbered.” When asked if he might order an invasion of Venezuela, Trump replied “I wouldn’t say that one way or the other,” before launching into a confusing ramble that devolved into insults about former President Joe Biden’s IQ, a tirade about Politico, and, in response to a follow-up question about his goals regarding Venezuela, his ownership of the Doral Country Club in Miami, Florida.

Useful patterns for building HTML tools

Lobsters
simonwillison.net
2025-12-10 21:09:09
Comments...
Original Article

10th December 2025

I’ve started using the term HTML tools to refer to HTML applications that I’ve been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past year, almost all of them written by LLMs. This article presents a collection of useful patterns I’ve discovered along the way.

First, some examples to show the kind of thing I’m talking about:

  • svg-render renders SVG code to downloadable JPEGs or PNGs
  • pypi-changelog lets you generate (and copy to clipboard) diffs between different PyPI package releases.
  • bluesky-thread provides a nested view of a discussion thread on Bluesky.

These are some of my recent favorites. I have dozens more like this that I use on a regular basis.

You can explore my collection on tools.simonwillison.net —the by month view is useful for browsing the entire collection.

If you want to see the code and prompts, almost all of the examples in this post include a link in their footer to “view source” on GitHub. The GitHub commits usually contain either the prompt itself or a link to the transcript used to create the tool.

The anatomy of an HTML tool

These are the characteristics I have found to be most productive in building tools of this nature:

  1. A single file: inline JavaScript and CSS in a single HTML file means the least hassle in hosting or distributing them, and crucially means you can copy and paste them out of an LLM response.
  2. Avoid React, or anything with a build step. The problem with React is that JSX requires a build step, which makes everything massively less convenient. I prompt “no react” and skip that whole rabbit hole entirely.
  3. Load dependencies from a CDN. The fewer dependencies the better, but if there’s a well known library that helps solve a problem I’m happy to load it from CDNjs or jsdelivr or similar.
  4. Keep them small. A few hundred lines means the maintainability of the code doesn’t matter too much: any good LLM can read them and understand what they’re doing, and rewriting them from scratch with help from an LLM takes just a few minutes.

The end result is a few hundred lines of code that can be cleanly copied and pasted into a GitHub repository.

Prototype with Artifacts or Canvas

The easiest way to build one of these tools is to start in ChatGPT or Claude or Gemini. All three have features where they can write a simple HTML+JavaScript application and show it to you directly.

Claude calls this “Artifacts”, ChatGPT and Gemini both call it “Canvas”. Claude has the feature enabled by default, ChatGPT and Gemini may require you to toggle it on in their “tools” menus.

Try this prompt in Gemini or ChatGPT:

Build a canvas that lets me paste in JSON and converts it to YAML. No React.

Or this prompt in Claude:

Build an artifact that lets me paste in JSON and converts it to YAML. No React.

I always add “No React” to these prompts, because otherwise they tend to build with React, resulting in a file that is harder to copy and paste out of the LLM and use elsewhere. I find that attempts which use React take longer to display (since they need to run a build step) and are more likely to contain crashing bugs for some reason, especially in ChatGPT.

All three tools have “share” links that provide a URL to the finished application. Examples:

Switch to a coding agent for more complex projects

Coding agents such as Claude Code and Codex CLI have the advantage that they can test the code themselves while they work on it using tools like Playwright. I often upgrade to one of those when I’m working on something more complicated, like my Bluesky thread viewer tool shown above.

I also frequently use asynchronous coding agents like Claude Code for web to make changes to existing tools. I shared a video about that in Building a tool to copy-paste share terminal sessions using Claude Code for web .

Claude Code for web and Codex Cloud run directly against my simonw/tools repo, which means they can publish or upgrade tools via Pull Requests (here are dozens of examples ) without me needing to copy and paste anything myself.

Load dependencies from CDNs

Any time I use an additional JavaScript library as part of my tool I like to load it from a CDN.

The three major LLM platforms support specific CDNs as part of their Artifacts or Canvas features, so often if you tell them “Use PDF.js” or similar they’ll be able to compose a URL to a CDN that’s on their allow-list.

Sometimes you’ll need to go and look up the URL on cdnjs or jsDelivr and paste it into the chat.

CDNs like these have been around for long enough that I’ve grown to trust them, especially for URLs that include the package version.

The alternative to CDNs is to use npm and have a build step for your projects. I find this reduces my productivity at hacking on individual tools and makes it harder to self-host them.

Host them somewhere else

I don’t like leaving my HTML tools hosted by the LLM platforms themselves for a couple of reasons. First, LLM platforms tend to run the tools inside a tight sandbox with a lot of restrictions. They’re often unable to load data or images from external URLs, and sometimes even features like linking out to other sites are disabled.

The end-user experience often isn’t great either. They show warning messages to new users, often take additional time to load and delight in showing promotions for the platform that was used to create the tool.

They’re also not as reliable as other forms of static hosting. If ChatGPT or Claude are having an outage I’d like to still be able to access the tools I’ve created in the past.

Being able to easily self-host is the main reason I like insisting on “no React” and using CDNs for dependencies—the absence of a build step makes hosting tools elsewhere a simple case of copying and pasting them out to some other provider.

My preferred provider here is GitHub Pages because I can paste a block of HTML into a file on github.com and have it hosted on a permanent URL a few seconds later. Most of my tools end up in my simonw/tools repository which is configured to serve static files at tools.simonwillison.net .

Take advantage of copy and paste

One of the most useful input/output mechanisms for HTML tools comes in the form of copy and paste .

I frequently build tools that accept pasted content, transform it in some way and let the user copy it back to their clipboard to paste somewhere else.

Copy and paste on mobile phones is fiddly, so I frequently include “Copy to clipboard” buttons that populate the clipboard with a single touch.

Most operating system clipboards can carry multiple formats of the same copied data. That’s why you can paste content from a word processor in a way that preserves formatting, but if you paste the same thing into a text editor you’ll get the content with formatting stripped.

These rich copy operations are available in JavaScript paste events as well, which opens up all sorts of opportunities for HTML tools.
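
Here's a rough sketch of both halves of that pattern, written as TypeScript (drop the type annotations to paste it into a plain <script> tag):

```typescript
// Inspect every format carried by a paste event.
document.addEventListener("paste", (event: ClipboardEvent) => {
  const data = event.clipboardData;
  if (!data) return;
  for (const type of data.types) {             // e.g. "text/plain", "text/html"
    console.log(type, data.getData(type));
  }
});

// A one-touch "Copy to clipboard" handler.
async function copyToClipboard(text: string): Promise<void> {
  await navigator.clipboard.writeText(text);
}
```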

  • hacker-news-thread-export lets you paste in a URL to a Hacker News thread and gives you a copyable condensed version of the entire thread, suitable for pasting into an LLM to get a useful summary.
  • paste-rich-text lets you copy from a page and paste to get the HTML—particularly useful on mobile where view-source isn’t available.
  • alt-text-extractor lets you paste in images and then copy out their alt text.

Build debugging tools

The key to building interesting HTML tools is understanding what’s possible. Building custom debugging tools is a great way to explore these options.

clipboard-viewer is one of my most useful. You can paste anything into it (text, rich text, images, files) and it will loop through and show you every type of paste data that’s available on the clipboard.

Clipboard Format Viewer. Paste anywhere on the page (Ctrl+V or Cmd+V). This shows text/rtf with a bunch of weird code, text/plain with some pasted HTML diff and a Clipboard Event Information panel that says Event type: paste, Formats available: text/rtf, text/plain, 0 files reported and 2 clipboard items reported.

This was key to building many of my other tools, because it showed me the invisible data that I could use to bootstrap other interesting pieces of functionality.

More debugging examples:

  • keyboard-debug shows the keys (and KeyCode values) currently being held down.
  • cors-fetch reveals if a URL can be accessed via CORS.
  • exif displays EXIF data for a selected photo.

Persist state in the URL

HTML tools may not have access to server-side databases for storage but it turns out you can store a lot of state directly in the URL.

I like this for tools I may want to bookmark or share with other people.
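
A minimal sketch of the pattern (the shape of the state object is up to each tool):

```typescript
// Write small pieces of state into the query string without reloading the page.
function saveStateToUrl(state: Record<string, string>): void {
  const params = new URLSearchParams(state);
  history.replaceState(null, "", `${location.pathname}?${params}`);
}

// Read it back on load so a bookmarked or shared URL restores the tool's setup.
function loadStateFromUrl(): Record<string, string> {
  return Object.fromEntries(new URLSearchParams(location.search));
}
```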

Use localStorage for secrets or larger state

The localStorage browser API lets HTML tools store data persistently on the user’s device, without exposing that data to the server.

I use this for larger pieces of state that don’t fit comfortably in a URL, or for secrets like API keys which I really don’t want anywhere near my server —even static hosts might have server logs that are outside of my influence.
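
A sketch of that pattern (the storage key name here is purely illustrative): ask once, cache in localStorage, and never let the secret travel anywhere except the API it belongs to.

```typescript
function getApiKey(storageKey = "example-api-key"): string | null {
  let key = localStorage.getItem(storageKey);
  if (!key) {
    key = prompt("Paste your API key (it is stored only in this browser):");
    if (key) localStorage.setItem(storageKey, key);
  }
  return key;
}
```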

  • word-counter is a simple tool I built to help me write to specific word counts, for things like conference abstract submissions. It uses localStorage to save as you type, so your work isn’t lost if you accidentally close the tab.
  • render-markdown uses the same trick—I sometimes use this one to craft blog posts and I don’t want to lose them.
  • haiku is one of a number of LLM demos I’ve built that request an API key from the user (via the prompt() function) and then store that in localStorage . This one uses Claude Haiku to write haikus about what it can see through the user’s webcam.

Collect CORS-enabled APIs

CORS stands for Cross-origin resource sharing . It’s a relatively low-level detail which controls if JavaScript running on one site is able to fetch data from APIs hosted on other domains.

APIs that provide open CORS headers are a goldmine for HTML tools. It’s worth building a collection of these over time.

Here are some I like:

  • iNaturalist for fetching sightings of animals, including URLs to photos
  • PyPI for fetching details of Python packages
  • GitHub because anything in a public repository in GitHub has a CORS-enabled anonymous API for fetching that content from the raw.githubusercontent.com domain, which is behind a caching CDN so you don’t need to worry too much about rate limits or feel guilty about adding load to their infrastructure.
  • Bluesky for all sorts of operations
  • Mastodon has generous CORS policies too, as used by applications like phanpy.social

GitHub Gists are a personal favorite here, because they let you build apps that can persist state to a permanent Gist through making a cross-origin API call.
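
As a small example of the pattern, here is roughly what fetching package metadata from PyPI's CORS-enabled JSON API looks like from a static page (field names follow PyPI's JSON response):

```typescript
async function fetchPackageInfo(name: string): Promise<{ version: string; summary: string }> {
  const response = await fetch(`https://pypi.org/pypi/${name}/json`);
  if (!response.ok) throw new Error(`PyPI returned ${response.status}`);
  const data = await response.json();
  return { version: data.info.version, summary: data.info.summary };
}
```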

  • species-observation-map uses iNaturalist to show a map of recent sightings of a particular species.
  • zip-wheel-explorer fetches a .whl file for a Python package from PyPI, unzips it (in browser memory) and lets you navigate the files.
  • github-issue-to-markdown fetches issue details and comments from the GitHub API (including expanding any permanent code links) and turns them into copyable Markdown.
  • terminal-to-html can optionally save the user’s converted terminal session to a Gist.
  • bluesky-quote-finder displays quotes of a specified Bluesky post, which can then be sorted by likes or by time.

LLMs can be called directly via CORS

All three of OpenAI, Anthropic and Gemini offer JSON APIs that can be accessed via CORS directly from HTML tools.

Unfortunately you still need an API key, and if you bake that key into your visible HTML anyone can steal it and use to rack up charges on your account.

I use the localStorage secrets pattern to store API keys for these services. This sucks from a user experience perspective—telling users to go and create an API key and paste it into a tool is a lot of friction—but it does work.

Some examples:

Don’t be afraid of opening files

You don’t need to upload a file to a server in order to make use of the <input type="file"> element. JavaScript can access the content of that file directly, which opens up a wealth of opportunities for useful functionality.
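
A minimal sketch of reading a user-selected file entirely in the browser, with nothing uploaded anywhere:

```typescript
const input = document.querySelector<HTMLInputElement>('input[type="file"]');
if (input) {
  input.addEventListener("change", async () => {
    const file = input.files?.[0];
    if (!file) return;
    const text = await file.text();   // or file.arrayBuffer() for binary formats
    console.log(`${file.name}: ${text.length} characters`);
  });
}
```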

Some examples:

  • ocr is the first tool I built for my collection, described in Running OCR against PDFs and images directly in your browser . It uses PDF.js and Tesseract.js to allow users to open a PDF in their browser which it then converts to an image-per-page and runs through OCR.
  • social-media-cropper lets you open (or paste in) an existing image and then crop it to common dimensions needed for different social media platforms—2:1 for Twitter and LinkedIn, 1.4:1 for Substack etc.
  • ffmpeg-crop lets you open and preview a video file in your browser, drag a crop box within it and then copy out the ffmpeg command needed to produce a cropped copy on your own machine.

You can offer downloadable files too

An HTML tool can generate a file for download without needing help from a server.

The JavaScript library ecosystem has a huge range of packages for generating files in all kinds of useful formats.
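
A sketch of the no-server download trick:

```typescript
// Build a file in memory and hand it to the browser as a download.
function downloadText(filename: string, contents: string): void {
  const blob = new Blob([contents], { type: "text/plain" });
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url);
}
```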

Pyodide can run Python code in the browser

Pyodide is a distribution of Python that’s compiled to WebAssembly and designed to run directly in browsers. It’s an engineering marvel and one of the most underrated corners of the Python world.

It also cleanly loads from a CDN, which means there’s no reason not to use it in HTML tools!

Even better, the Pyodide project includes micropip —a mechanism that can load extra pure-Python packages from PyPI via CORS.
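
Here's roughly what the bootstrap looks like (the CDN version in the comment is an assumption; check the Pyodide documentation for a current URL):

```typescript
// Assumes a <script src="https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.js"></script>
// tag has already exposed a global loadPyodide() function.
declare function loadPyodide(): Promise<any>;

async function runSomePython(): Promise<void> {
  const pyodide = await loadPyodide();
  const result = pyodide.runPython("sum(range(10))");
  console.log(result); // 45
}
```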

WebAssembly opens more possibilities

Pyodide is possible thanks to WebAssembly. WebAssembly means that a vast collection of software originally written in other languages can now be loaded in HTML tools as well.

Squoosh.app was the first example I saw that convinced me of the power of this pattern—it makes several best-in-class image compression libraries available directly in the browser.

I’ve used WebAssembly for a few of my own tools:

Remix your previous tools

The biggest advantage of having a single public collection of 100+ tools is that it’s easy for my LLM assistants to recombine them in interesting ways.

Sometimes I’ll copy and paste a previous tool into the context, but when I’m working with a coding agent I can reference them by name—or tell the agent to search for relevant examples before it starts work.

The source code of any working tool doubles as clear documentation of how something can be done, including patterns for using editing libraries. An LLM with one or two existing tools in their context is much more likely to produce working code.

I built pypi-changelog by telling Claude Code:

Look at the pypi package explorer tool

And then, after it had found and read the source code for zip-wheel-explorer :

Build a new tool pypi-changelog.html which uses the PyPI API to get the wheel URLs of all available versions of a package, then it displays them in a list where each pair has a "Show changes" clickable in between them - clicking on that fetches the full contents of the wheels and displays a nicely rendered diff representing the difference between the two, as close to a standard diff format as you can get with JS libraries from CDNs, and when that is displayed there is a "Copy" button which copies that diff to the clipboard

Here’s the full transcript .

See also Running OCR against PDFs and images directly in your browser

Record the prompt and transcript

I like keeping (and publishing) records of everything I do with LLMs, to help me grow my skills at using them over time.

For HTML tools I built by chatting with an LLM platform directly I use the “share” feature for those platforms.

For Claude Code or Codex CLI or other coding agents I copy and paste the full transcript from the terminal into my terminal-to-html tool and share that using a Gist.

In either case I include links to those transcripts in the commit message when I save the finished tool to my repository. You can see those in my tools.simonwillison.net colophon .

Go forth and build

I’ve had so much fun exploring the capabilities of LLMs in this way over the past year and a half, and building tools in this way has been invaluable in helping me understand both the potential for building tools with HTML and the capabilities of the LLMs that I’m building them with.

If you’re interested in starting your own collection I highly recommend it! All you need to get started is a free GitHub repository with GitHub Pages enabled (Settings -> Pages -> Source -> Deploy from a branch -> main) and you can start copying in .html pages generated in whatever manner you like.

Bonus transcript : Here’s how I used Claude Code and shot-scraper to add the screenshots to this post.

Useful patterns for building HTML tools

Simon Willison
simonwillison.net
2025-12-10 21:00:59
I've started using the term HTML tools to refer to HTML applications that I've been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past year, almost all of them written by LLMs. This article presen...
Original Article

10th December 2025

I’ve started using the term HTML tools to refer to HTML applications that I’ve been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past year, almost all of them written by LLMs. This article presents a collection of useful patterns I’ve discovered along the way.

First, some examples to show the kind of thing I’m talking about:

  • svg-render renders SVG code to downloadable JPEGs or PNGs
  • pypi-changelog lets you generate (and copy to clipboard) diffs between different PyPI package releases.
  • bluesky-thread provides a nested view of a discussion thread on Bluesky.

These are some of my recent favorites. I have dozens more like this that I use on a regular basis.

You can explore my collection on tools.simonwillison.net —the by month view is useful for browsing the entire collection.

If you want to see the code and prompts, almost all of the examples in this post include a link in their footer to “view source” on GitHub. The GitHub commits usually contain either the prompt itself or a link to the transcript used to create the tool.

The anatomy of an HTML tool

These are the characteristics I have found to be most productive in building tools of this nature:

  1. A single file: inline JavaScript and CSS in a single HTML file means the least hassle in hosting or distributing them, and crucially means you can copy and paste them out of an LLM response.
  2. Avoid React, or anything with a build step. The problem with React is that JSX requires a build step, which makes everything massively less convenient. I prompt “no react” and skip that whole rabbit hole entirely.
  3. Load dependencies from a CDN. The fewer dependencies the better, but if there’s a well known library that helps solve a problem I’m happy to load it from CDNjs or jsdelivr or similar.
  4. Keep them small. A few hundred lines means the maintainability of the code doesn’t matter too much: any good LLM can read them and understand what they’re doing, and rewriting them from scratch with help from an LLM takes just a few minutes.

The end result is a few hundred lines of code that can be cleanly copied and pasted into a GitHub repository.

Prototype with Artifacts or Canvas

The easiest way to build one of these tools is to start in ChatGPT or Claude or Gemini. All three have features where they can write a simple HTML+JavaScript application and show it to you directly.

Claude calls this “Artifacts”, ChatGPT and Gemini both call it “Canvas”. Claude has the feature enabled by default, ChatGPT and Gemini may require you to toggle it on in their “tools” menus.

Try this prompt in Gemini or ChatGPT:

Build a canvas that lets me paste in JSON and converts it to YAML. No React.

Or this prompt in Claude:

Build an artifact that lets me paste in JSON and converts it to YAML. No React.

I always add “No React” to these prompts, because otherwise they tend to build with React, resulting in a file that is harder to copy and paste out of the LLM and use elsewhere. I find that attempts which use React take longer to display (since they need to run a build step) and are more likely to contain crashing bugs for some reason, especially in ChatGPT.

All three tools have “share” links that provide a URL to the finished application. Examples:

Switch to a coding agent for more complex projects

Coding agents such as Claude Code and Codex CLI have the advantage that they can test the code themselves while they work on it using tools like Playwright. I often upgrade to one of those when I’m working on something more complicated, like my Bluesky thread viewer tool shown above.

I also frequently use asynchronous coding agents like Claude Code for web to make changes to existing tools. I shared a video about that in Building a tool to copy-paste share terminal sessions using Claude Code for web .

Claude Code for web and Codex Cloud run directly against my simonw/tools repo, which means they can publish or upgrade tools via Pull Requests (here are dozens of examples ) without me needing to copy and paste anything myself.

Load dependencies from CDNs

Any time I use an additional JavaScript library as part of my tool I like to load it from a CDN.

The three major LLM platforms support specific CDNs as part of their Artifacts or Canvas features, so often if you tell them “Use PDF.js” or similar they’ll be able to compose a URL to a CDN that’s on their allow-list.

Sometimes you’ll need to go and look up the URL on cdnjs or jsDelivr and paste it into the chat.

CDNs like these have been around for long enough that I’ve grown to trust them, especially for URLs that include the package version.

The alternative to CDNs is to use npm and have a build step for your projects. I find this reduces my productivity at hacking on individual tools and makes it harder to self-host them.

Host them somewhere else

I don’t like leaving my HTML tools hosted by the LLM platforms themselves, for a few reasons. First, LLM platforms tend to run the tools inside a tight sandbox with a lot of restrictions. They’re often unable to load data or images from external URLs, and sometimes even features like linking out to other sites are disabled.

The end-user experience often isn’t great either. They show warning messages to new users, often take additional time to load and delight in showing promotions for the platform that was used to create the tool.

They’re also not as reliable as other forms of static hosting. If ChatGPT or Claude are having an outage I’d like to still be able to access the tools I’ve created in the past.

Being able to easily self-host is the main reason I like insisting on “no React” and using CDNs for dependencies—the absence of a build step makes hosting tools elsewhere a simple case of copying and pasting them out to some other provider.

My preferred provider here is GitHub Pages because I can paste a block of HTML into a file on github.com and have it hosted on a permanent URL a few seconds later. Most of my tools end up in my simonw/tools repository which is configured to serve static files at tools.simonwillison.net .

Take advantage of copy and paste

One of the most useful input/output mechanisms for HTML tools comes in the form of copy and paste .

I frequently build tools that accept pasted content, transform it in some way and let the user copy it back to their clipboard to paste somewhere else.

Copy and paste on mobile phones is fiddly, so I frequently include “Copy to clipboard” buttons that populate the clipboard with a single touch.

Most operating system clipboards can carry multiple formats of the same copied data. That’s why you can paste content from a word processor in a way that preserves formatting, but if you paste the same thing into a text editor you’ll get the content with formatting stripped.

These rich copy operations are available in JavaScript paste events as well, which opens up all sorts of opportunities for HTML tools.
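As a rough sketch (my illustration, not code lifted from the tools below), reading those multiple formats out of a paste event looks something like this, and a one-touch “Copy to clipboard” button is a single clipboard API call:

document.addEventListener("paste", (event) => {
  const data = event.clipboardData;
  // A single paste can carry several representations of the same content
  for (const type of data.types) {
    console.log(type, data.getData(type)); // e.g. "text/html", "text/plain"
  }
});

// One-touch "Copy to clipboard" button handler
async function copyResult(text) {
  await navigator.clipboard.writeText(text);
}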

  • hacker-news-thread-export lets you paste in a URL to a Hacker News thread and gives you a copyable condensed version of the entire thread, suitable for pasting into an LLM to get a useful summary.
  • paste-rich-text lets you copy from a page and paste to get the HTML—particularly useful on mobile where view-source isn’t available.
  • alt-text-extractor lets you paste in images and then copy out their alt text.

Build debugging tools

The key to building interesting HTML tools is understanding what’s possible. Building custom debugging tools is a great way to explore these options.

clipboard-viewer is one of my most useful. You can paste anything into it (text, rich text, images, files) and it will loop through and show you every type of paste data that’s available on the clipboard.

[Screenshot of the Clipboard Format Viewer: paste anywhere on the page (Ctrl+V or Cmd+V) and it lists every available format, here text/rtf and text/plain for a pasted HTML diff, plus a Clipboard Event Information panel showing the event type, the formats available, files reported and clipboard items reported.]

This was key to building many of my other tools, because it showed me the invisible data that I could use to bootstrap other interesting pieces of functionality.

More debugging examples:

  • keyboard-debug shows the keys (and KeyCode values) currently being held down.
  • cors-fetch reveals if a URL can be accessed via CORS.
  • exif displays EXIF data for a selected photo.

Persist state in the URL

HTML tools may not have access to server-side databases for storage but it turns out you can store a lot of state directly in the URL.

I like this for tools I may want to bookmark or share with other people.
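The mechanics only take a few lines using URLSearchParams and the fragment; here’s an illustrative sketch, not lifted from any particular tool:

// Mirror the interesting inputs into the URL without adding history entries
function saveState(state) {
  const params = new URLSearchParams(state);
  history.replaceState(null, "", "#" + params.toString());
}

// Restore that state when the page loads
function loadState() {
  return Object.fromEntries(new URLSearchParams(location.hash.slice(1)));
}

saveState({ q: "pelicans", page: "2" });
console.log(loadState()); // { q: "pelicans", page: "2" }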

Use localStorage for secrets or larger state

The localStorage browser API lets HTML tools store data persistently on the user’s device, without exposing that data to the server.

I use this for larger pieces of state that don’t fit comfortably in a URL, or for secrets like API keys which I really don’t want anywhere near my server—even static hosts might have server logs that are outside of my influence.
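The pattern itself is tiny; here’s a sketch, with a made-up storage key name:

// Ask for the API key once, then keep it in localStorage on this device only
function getApiKey() {
  let key = localStorage.getItem("my-tool-api-key");
  if (!key) {
    key = prompt("Paste your API key:");
    if (key) localStorage.setItem("my-tool-api-key", key);
  }
  return key;
}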

  • word-counter is a simple tool I built to help me write to specific word counts, for things like conference abstract submissions. It uses localStorage to save as you type, so your work isn’t lost if you accidentally close the tab.
  • render-markdown uses the same trick—I sometimes use this one to craft blog posts and I don’t want to lose them.
  • haiku is one of a number of LLM demos I’ve built that request an API key from the user (via the prompt() function) and then store that in localStorage . This one uses Claude Haiku to write haikus about what it can see through the user’s webcam.

Collect CORS-enabled APIs

CORS stands for Cross-origin resource sharing. It’s a relatively low-level detail which controls whether JavaScript running on one site is able to fetch data from APIs hosted on other domains.

APIs that provide open CORS headers are a goldmine for HTML tools. It’s worth building a collection of these over time.

Here are some I like:

  • iNaturalist for fetching sightings of animals, including URLs to photos
  • PyPI for fetching details of Python packages
  • GitHub because anything in a public repository in GitHub has a CORS-enabled anonymous API for fetching that content from the raw.githubusercontent.com domain, which is behind a caching CDN so you don’t need to worry too much about rate limits or feel guilty about adding load to their infrastructure.
  • Bluesky for all sorts of operations
  • Mastodon has generous CORS policies too, as used by applications like phanpy.social
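To sketch what using one of these looks like, here’s a fetch against PyPI’s JSON API from browser JavaScript (my illustration, assuming the /pypi/&lt;name&gt;/json endpoint):

// PyPI's JSON API sends permissive CORS headers, so this works from any static page
async function pypiInfo(name) {
  const response = await fetch(`https://pypi.org/pypi/${name}/json`);
  if (!response.ok) throw new Error(`PyPI returned ${response.status}`);
  const data = await response.json();
  return { version: data.info.version, summary: data.info.summary };
}

pypiInfo("datasette").then(console.log);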

GitHub Gists are a personal favorite here, because they let you build apps that can persist state to a permanent Gist through making a cross-origin API call.

  • species-observation-map uses iNaturalist to show a map of recent sightings of a particular species.
  • zip-wheel-explorer fetches a .whl file for a Python package from PyPI, unzips it (in browser memory) and lets you navigate the files.
  • github-issue-to-markdown fetches issue details and comments from the GitHub API (including expanding any permanent code links) and turns them into copyable Markdown.
  • terminal-to-html can optionally save the user’s converted terminal session to a Gist.
  • bluesky-quote-finder displays quotes of a specified Bluesky post, which can then be sorted by likes or by time.

LLMs can be called directly via CORS

All three of OpenAI, Anthropic and Gemini offer JSON APIs that can be accessed via CORS directly from HTML tools.

Unfortunately you still need an API key, and if you bake that key into your visible HTML anyone can steal it and use it to rack up charges on your account.

I use the localStorage secrets pattern to store API keys for these services. This sucks from a user experience perspective—telling users to go and create an API key and paste it into a tool is a lot of friction—but it does work.
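Here’s a rough sketch of that pattern against OpenAI’s chat completions endpoint (my own illustration; check the current API documentation for exact model names and parameters):

async function askModel(question) {
  // The key lives on the user's device, never baked into the published HTML
  let key = localStorage.getItem("openai-api-key");
  if (!key) {
    key = prompt("OpenAI API key:");
    localStorage.setItem("openai-api-key", key);
  }
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${key}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: question }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}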

Some examples:

Don’t be afraid of opening files

You don’t need to upload a file to a server in order to make use of the <input type="file"> element. JavaScript can access the content of that file directly, which opens up a wealth of opportunities for useful functionality.
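A minimal sketch of the pattern: the selected file is read by JavaScript in the page, and nothing gets uploaded anywhere.

<input type="file" id="picker">
<pre id="preview"></pre>
<script>
document.getElementById("picker").addEventListener("change", async (event) => {
  const file = event.target.files[0];
  if (!file) return;
  // The file never leaves the user's machine - JavaScript reads it directly
  const text = await file.text();
  document.getElementById("preview").textContent = text.slice(0, 1000);
});
</script>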

Some examples:

  • ocr is the first tool I built for my collection, described in Running OCR against PDFs and images directly in your browser . It uses PDF.js and Tesseract.js to allow users to open a PDF in their browser which it then converts to an image-per-page and runs through OCR.
  • social-media-cropper lets you open (or paste in) an existing image and then crop it to common dimensions needed for different social media platforms—2:1 for Twitter and LinkedIn, 1.4:1 for Substack etc.
  • ffmpeg-crop lets you open and preview a video file in your browser, drag a crop box within it and then copy out the ffmpeg command needed to produce a cropped copy on your own machine.

You can offer downloadable files too

An HTML tool can generate a file for download without needing help from a server.

The JavaScript library ecosystem has a huge range of packages for generating files in all kinds of useful formats.
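The underlying mechanism is just a Blob and a temporary object URL; a quick sketch:

// Generate a file in memory and hand it to the browser as a download
function downloadText(filename, text) {
  const blob = new Blob([text], { type: "text/plain" });
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url);
}

downloadText("notes.txt", "Generated entirely in the browser.");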

Pyodide can run Python code in the browser

Pyodide is a distribution of Python that’s compiled to WebAssembly and designed to run directly in browsers. It’s an engineering marvel and one of the most underrated corners of the Python world.

It also cleanly loads from a CDN, which means there’s no reason not to use it in HTML tools!

Even better, the Pyodide project includes micropip—a mechanism that can load extra pure-Python packages from PyPI via CORS.
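Here’s a sketch of the whole loop; the CDN path and version number are illustrative, so check the Pyodide documentation for the current release:

<script src="https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.js"></script>
<script>
async function main() {
  const pyodide = await loadPyodide(); // downloads the WebAssembly runtime
  console.log(pyodide.runPython("sum(range(10))")); // 45, computed by real CPython

  // micropip can pull pure-Python packages from PyPI over CORS
  await pyodide.loadPackage("micropip");
  const micropip = pyodide.pyimport("micropip");
  await micropip.install("snowballstemmer");
}
main();
</script>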

WebAssembly opens more possibilities

Pyodide is possible thanks to WebAssembly. WebAssembly means that a vast collection of software originally written in other languages can now be loaded in HTML tools as well.

Squoosh.app was the first example I saw that convinced me of the power of this pattern—it makes several best-in-class image compression libraries available directly in the browser.

I’ve used WebAssembly for a few of my own tools:

Remix your previous tools

The biggest advantage of having a single public collection of 100+ tools is that it’s easy for my LLM assistants to recombine them in interesting ways.

Sometimes I’ll copy and paste a previous tool into the context, but when I’m working with a coding agent I can reference them by name—or tell the agent to search for relevant examples before it starts work.

The source code of any working tool doubles as clear documentation of how something can be done, including patterns for using editing libraries. An LLM with one or two existing tools in their context is much more likely to produce working code.

I built pypi-changelog by telling Claude Code:

Look at the pypi package explorer tool

And then, after it had found and read the source code for zip-wheel-explorer :

Build a new tool pypi-changelog.html which uses the PyPI API to get the wheel URLs of all available versions of a package, then it displays them in a list where each pair has a "Show changes" clickable in between them - clicking on that fetches the full contents of the wheels and displays a nicely rendered diff representing the difference between the two, as close to a standard diff format as you can get with JS libraries from CDNs, and when that is displayed there is a "Copy" button which copies that diff to the clipboard

Here’s the full transcript .

See also Running OCR against PDFs and images directly in your browser

Record the prompt and transcript

I like keeping (and publishing) records of everything I do with LLMs, to help me grow my skills at using them over time.

For HTML tools I built by chatting with an LLM platform directly I use the “share” feature for those platforms.

For Claude Code or Codex CLI or other coding agents I copy and paste the full transcript from the terminal into my terminal-to-html tool and share that using a Gist.

In either case I include links to those transcripts in the commit message when I save the finished tool to my repository. You can see those in my tools.simonwillison.net colophon .

Go forth and build

I’ve had so much fun exploring the capabilities of LLMs like this over the past year and a half. Building these tools has been invaluable in helping me understand both the potential for building tools with HTML and the capabilities of the LLMs I’m building them with.

If you’re interested in starting your own collection I highly recommend it! All you need to get started is a free GitHub repository with GitHub Pages enabled (Settings -> Pages -> Source -> Deploy from a branch -> main) and you can start copying in .html pages generated in whatever manner you like.

Bonus transcript: Here’s how I used Claude Code and shot-scraper to add the screenshots to this post.

Apple Services Experiencing Outage

Hacker News
www.apple.com
2025-12-10 20:47:15
Comments...
Original Article

The Normalization of Deviance in AI

Simon Willison
simonwillison.net
2025-12-10 20:18:58
The Normalization of Deviance in AI This thought-provoking essay from Johann Rehberger directly addresses something that I’ve been worrying about for quite a while: in the absence of any headline-grabbing examples of prompt injection vulnerabilities causing real economic harm, is anyone going to car...
Original Article

The Normalization of Deviance in AI . This thought-provoking essay from Johann Rehberger directly addresses something that I’ve been worrying about for quite a while: in the absence of any headline-grabbing examples of prompt injection vulnerabilities causing real economic harm, is anyone going to care?

Johann describes the concept of the “Normalization of Deviance” as directly applying to this question.

Coined by Diane Vaughan , the key idea here is that organizations that get away with “deviance” - ignoring safety protocols or otherwise relaxing their standards - will start baking that unsafe attitude into their culture. This can work fine… until it doesn’t. The Space Shuttle Challenger disaster has been partially blamed on this class of organizational failure.

As Johann puts it:

In the world of AI, we observe companies treating probabilistic, non-deterministic, and sometimes adversarial model outputs as if they were reliable, predictable, and safe.

Vendors are normalizing trusting LLM output, but current understanding violates the assumption of reliability.

The model will not consistently follow instructions, stay aligned, or maintain context integrity. This is especially true if there is an attacker in the loop (e.g indirect prompt injection).

However, we see more and more systems allowing untrusted output to take consequential actions. Most of the time it goes well, and over time vendors and organizations lower their guard or skip human oversight entirely, because “it worked last time.”

This dangerous bias is the fuel for normalization: organizations confuse the absence of a successful attack with the presence of robust security.

14 unexpected US gifts to give the men in your life this holiday season

Guardian
www.theguardian.com
2025-12-10 20:15:46
From Crocs to indestructible wallets, we rounded up the best guy-approved gifts they won’t know how they lived without. The 15 US gifts to give the women in your life. Sign up for the Filter US newsletter, your weekly guide to buying fewer, better things. Whether you have been together for years, just mad...
Original Article

Whether you have been together for years, just made it official, or maybe you’re just shopping for your brother, one thing remains certain: he is going to claim he doesn’t need anything. The special guys in our lives often default to what they’ve always used and loved, from threadbare T-shirts to melted spatulas.

This season, give him the gift of novelty. Introduce him to a new gadget he hasn’t thought of for his man-cave (yes, our respondents still have those). Show him that it’s OK to cry with a personalized keepsake commemorating your shared history. Introduce a little color into his life if his year has been a bit bleak.

We tapped several guys to hear about the unexpected gifts they now couldn’t live without – and we heard from some of the givers, too. From headphones to keep him plugged into his favorite podcast to personalized artwork that tugs at his heartstrings, here is exactly what to get the lucky guy in your life this holiday season.


Plufl Human Dog Bed

$275 at Plufl

Whether he’s a family man or a dog dad , the Plufl human-sized dog bed is about to take center stage in his living space. Not only does it feature deep bolsters and orthopedic foam to support his joints and give his pup a soft place to rest, but it’s a breeze to store when not in use.

“I didn’t expect to love this human doggy bed as much as I do, but honestly, it’s become one of my favorite spots in the house,” says Amir A, a software developer. “My dog curls up in it, my kids join him, and half the time I end up climbing in there with them too. It’s so comfortable that I’ve actually fallen asleep in it a few times. This bed was supposed to be for the dog, but now we’re basically sharing it.”



Ridge Wallet

$68.99 at Amazon
$69 at Ridge

If his wallet is currently disintegrating in his pocket, with cash and cards peeking out from under its frayed edges, it’s time to upgrade him to the Ridge, made of near-indestructible materials and RFID-blocking technology that defies chip readers. “I had been complaining about my slim wallet for a while, so my wife got me the top-of-the-line one with an AirTag attachment so I never lose it,” says Jamie Gewurz, a financial planning associate.



Apple AirPods Pro 3 Wireless Earbuds

$219.99 at Amazon
$249 at Apple

He’s got to tune out the world somehow. Let him do it in style with Apple’s new AirPods Pro 3 earbuds , which pair effective noise-cancellation with crisp sound quality. “I listen to music and podcasts all the time. When I’m engaged in a project at work, they help me lock in,” says James Anderson, a structural engineer. They can also double as hearing aids and allow live translation when paired to newer iPhones.



Custom ‘Met, Engaged, Married, Live’ map print

From $90.38 at Etsy

Matthew K, an e-commerce manager, recently received this puzzle from his wife to commemorate his first-year anniversary. It depicted exact pinned locations of their first date, their engagement spot, wedding venue, and honeymoon. “I thought it was a cool way to remember our history. It was definitely an interesting way to present our relationship milestones on paper, which happens to be customary for the first anniversary,” says Matthew.



HexClad 4.5qt deep sauté pan with tempered glass lid

$189 at Amazon
$189 at HexClad

Robert Hartry, an architectural designer, received this HexClad pan from his partner for Christmas last year. “Its side walls are higher than a normal sauté pan, so it makes it very versatile,” says Hartry. “I also love that it’s fairly indestructible. Emily has tried, and this pan has yet to fail.”



Leatherman Wave Plus Multi-Tool
$119.95 at Global Industrial
$119.95 at Amazon

With 18 built-in tools from scissors to pliers, any guy with a knack for DIY projects will start to consider this Leatherman an extension of his hand. “I got it for my brother for Christmas,” says Elizabeth Tomaras, a quality manager. “He’s always outdoors for work and even for play, and I knew he had [a multitool] but definitely not as nice as this one. I don’t think he would necessarily spend that much on a tool for himself. He likes its sleek design and that there are so many options. It’s not just the basic model .”



Saxx Stretch Cotton Boxer Briefs

$68 for three-pack at Zappos

Noticing his underwear could use a little TLC? Run this thoughtful errand for him with this pack of three supportive cotton-blend pairs. “A classic gift Jamie has gotten for me, and something no guy actually buys for himself, is Saxx underwear,” says Lukas Snyder, a trader. “I get all of my underwear over the holidays, and only from the women in my life.”



Jumbo playing cards

$3.54 at Walmart
$3.99 at CVS

Snyder’s girlfriend got a standard deck of cards and kicked it up a notch for serious sentimental value. “She bedazzled a deck of cards into a ‘52 things I love about you’ gift. She cut out small squares of paper and glued them on to each card and wrote on it,” says Snyder. “She also drew on the cards, then put a keychain around the corner to bind it all together.”



Crocs clogs

$34.99 at Crocs
$38.87 at Amazon

Crocs are a lot cooler now than they once were, making them a practical gift for your brother who’s hard to shop for. “When they first came out, I thought they were ugly and I didn’t get the hype,” says Shawn Thicke, a children’s musician and performer. “As I performed for more and more kids, they started showing me their Crocs and some even told me I should get some. Funny enough, a parent let me try theirs and I was immediately sold. They were so comfy.

“My sister remembered how much I enjoyed them, so she surprised me with a pair. I now wear them all the time! I love them. So comfy! I even wear them to gigs.”



Amazfit Bip 6 smartwatch

$79 at Amazon
$79 at Walmart

For a guy who values fitness but doesn’t want all the distractions of an Apple Watch , this Amazfit smartwatch is a smart alternative. “It’s really good quality and has a lot of features you don’t need to pay monthly for, unlike other smartwatches,” says Reza T, an IT specialist. That includes GPS run tracking, more than 140 workout modes, AI fitness coaching and more.



Meater Pro Duo Wireless Smart Meat Thermometer

$199.95 at Amazon
$199.95 at Meater

Jennifer Orleans got her grill-enthusiast brother this heavy-duty wireless smart meat thermometer, and there is so much your favorite carnivore is going to love about it too. “He likes how it enables a perfect cook every time,” says Orleans, a project manager. “He can go inside and will get a notification on his phone when it’s ready.”



Loop Quiet 2 Ear Plugs

$20.95 at Amazon
$24.99 at Target

Chantal Stafford-Abbott, a model, considers these viral reusable earplugs the ultimate gift for a guy who might need a little extra shuteye in less-than-quiet circumstances. “They actually work, you don’t have to buy loads from the pharmacy, and they come in the cutest little case,” she says. Four tip sizes in a comfortable, flexible silicone material ensure he finds the perfect fit, and there are no batteries to charge or replace.



Target Darts Corona Vision Dartboard Lighting System

$89.84 at Amazon
$106.49 at Walmart

If you know a guy trying to recreate his favorite watering hole at home, there’s a gift for that. When they were just dating, Matthew’s then girlfriend gifted him this ringlight for his dartboard to increase visibility and up his game. “It gives you that professional at-home setup, illuminating your space if you’re playing in the dark,” says Matthew.



Custom family trading card

From $3.46 at Etsy

Jacqueline Hopmeyer, a marketing director, gifted her husband this custom trading card as “a super cute Father’s Day present that could double as a holiday gift”. You can customize the sport, number of kids, hair color, jersey and more. It’s set in transparent acrylic that stands upright on its own, ready to display proudly on his shelf. “He was so moved by this,” she says.


The Best Big Media Merger Is No Merger at All

Electronic Frontier Foundation
www.eff.org
2025-12-10 20:08:04
The state of streaming is... bad. It’s very bad. The first step in wanting to watch anything is a web search: “Where can I stream X?” Then you have to scroll past an AI summary with no answers, and then scroll past the sponsored links. After that, you find out that the thing you want to watch was ma...
Original Article

The state of streaming is... bad. It’s very bad. The first step in wanting to watch anything is a web search: “Where can I stream X?” Then you have to scroll past an AI summary with no answers, and then scroll past the sponsored links. After that, you find out that the thing you want to watch was made by a studio that doesn’t exist anymore or doesn’t have a streaming service. So, even though you subscribe to more streaming services than you could actually name, you will have to buy a digital copy to watch. A copy that, despite paying for it specifically, you do not actually own and might vanish in a few years.

Then, after you paid to see something multiple times in multiple ways (theater ticket, VHS tape, DVD, etc.), the mega-corporations behind this nightmare will try to get Congress to pass laws to ensure you keep paying them. In the end, this is easier than making a product that works. Or, as someone put it on social media , these companies have forgotten “that their entire existence relies on being slightly more convenient than piracy.”

It’s important to recognize this as we see more and more media mergers. These mergers are not about quality, they’re about control.

In the old days, studios made a TV show. If the show was a hit, they increased how much they charged companies to place ads during the show. And if the show was a hit for long enough, they sold syndication rights to another channel. Then people could discover the show again, and maybe come back to watch it air live. In that model, the goal was to spread access to a program as much as possible to increase viewership and the number of revenue streams.

Now, in the digital age, studios have picked up a Silicon Valley trait: putting all their eggs into the basket of “increasing the number of users.” To do that, they have to create scarcity. There has to be only one destination for the thing you’re looking for, and it has to be their own. And you shouldn’t be able to control the experience at all. They should.

They’ve also moved away from creating buzzy new exclusives to get you to pay them. That requires risk and also, you know, paying creative people to make them. Instead, they’re consolidating.

Media companies keep announcing mergers and acquisitions. They’ve been doing it for a long time, but it’s really ramped up in the last few years. And these mergers are bad for all the obvious reasons. There are the speech and censorship reasons that came to a head in, of all places, late night television. There are the labor issues. There are the concentration of power issues. There are the obvious problems that the fewer studios that exist the fewer chances good art gets to escape Hollywood and make it to our eyes and ears. But when it comes specifically to digital life there are these: consumer experience and ownership.

First, the more content that comes under a single corporation’s control, the more they expect you to come to them for it. And the more they want to charge. And because there is less competition, the less they need to work to make their streaming app usable. They then enforce their hegemony by using the draconian copyright restrictions they’ve lobbied for to cripple smaller competitors, critics, and fair use.

When everything is either Disney or NBCUniversal or Warner Brothers-Discovery-Paramount-CBS and everything is totally siloed, what need will they have to spend money improving any part of their product? Making things is hard, stopping others from proving how bad you are is easy, thanks to how broken copyright law is.

Furthermore, because every company is chasing increasing subscriber numbers instead of multiple revenue streams, they have an interest in preventing you from ever again “owning” a copy of a work. This was always sort of part of the business plan, but it was on a scale of a) once every couple of years,  b) at least it came, in theory, with some new features or enhanced quality and c) you actually owned the copy you paid for. Now they want you to pay them every month for access to same copy. And, hey, the price is going to keep going up the fewer options you have. Or you will see more ads. Or start seeing ads where there weren’t any before.

On the one hand, the increasing dependence on direct subscriber numbers does give users back some power. Jimmy Kimmel’s reinstatement by ABC was partly due to the fact that the company was about to announce a price hike for Disney+ and it couldn’t handle losing users due to the new price and due to popular outrage over Kimmel’s treatment.

On the other hand, well, there's everything else.

The latest kerfuffle is over the sale of Warner Brothers-Discovery, a company that was already the subject of a sale and merger resulting in the hyphen. Netflix was competing against another recently merged media megazord, Paramount Skydance.

Warner Brothers-Discovery accepted a bid from Netflix, enraging Paramount Skydance, which has now launched a hostile takeover .

Now the optimum outcome is for neither of these takeovers to happen. There are already too few players in Hollywood. It does nothing for the health of the industry to allow either merger. A functioning antitrust regime would stop both the sale and the hostile takeover attempt, full stop. But Hollywood and the federal government are frequent collaborators, and the feds have little incentive to stop Hollywood’s behemoths from growing even further, as long as they continue to play their role as propagandists for the American empire.

The promise of the digital era was in part convenience. You never again had to look at TV listings to find out when something would be airing. Virtually unlimited digital storage meant everything would be at your fingertips. But then the corporations went to work to make sure it never happened. And with each and every merger, that promise gets further and further away.

We Need to Talk About the 'Dystopian' PureGym Entry Tubes

hellgate
hellgatenyc.com
2025-12-10 19:50:46
Finally, a local gym that feels more like waiting on the TSA line....
Original Article

Blink, and the tubes are here.

To enter dozens of PureGym ( formerly Blink Fitness ) locations across New York City, members must now step into a tube immediately after entering the building and scanning their membership ID from their PureGym app. Once the PureGym technology has taken several seconds to affirm their true identity, they are then released from the tube, and into the fitness center. Finally, a local gym that feels more like waiting on the TSA line .

Outside the Gates Avenue PureGym in Bed-Stuy on Tuesday afternoon, a small crowd of gymgoers waited their turn to enter the two new "entry pods," which were installed by PureGym after the UK's largest gym company bought Blink.

"What the fuck? I literally don't want to do this," one man remarked, before putting his whole body into the tube. "It's like a bad sci-fi show, it's ridiculous!" another man exclaimed, while—you guessed it—he succumbed to the tube.


Microsoft Teams to warn of suspicious traffic with external domains

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 19:32:08
Microsoft is working on a new Teams security feature that will analyze suspicious traffic with external domains to help IT administrators tackle potential security threats. [...]...
Original Article


Microsoft is working on a new Teams security feature that will analyze suspicious traffic with external domains to help IT administrators tackle potential security threats.

As explained in a Microsoft 365 roadmap update this week, the "External Domains Anomalies Report" will help admins protect their organizations without disrupting legitimate business communications.

The new tool will do this by analyzing messaging trends to identify sharp spikes in activity, communications with new domains, or abnormal engagement patterns with entities outside their organizations.

It will provide admins with insights from monitoring communication patterns and flagging any unusual interactions that could indicate data sharing or security threats.

"This new report helps admins proactively spot unusual or risky interactions with external organizations. By analyzing communication trends and detecting sudden spikes, new domains, or abnormal engagement patterns, it provides early visibility into potential data-sharing or security risks," Microsoft said .

"As external collaboration grows, this report delivers actionable insights to safeguard your tenant while supporting productive cross-organization work."

The feature will begin rolling out worldwide in February 2026 to standard multi-tenant environments on the web platform. However, Microsoft has yet to share whether this new feature will require additional licensing or will be included with existing Teams subscriptions.

Since the start of the year, Microsoft has announced that Teams will warn users when they send or receive private messages containing links flagged as malicious, and has been working to enhance Teams' protection against malicious URLs and file types.

It is now also rolling out new Teams features that will let users report messages mistakenly flagged as security threats and automatically block screen-capture attempts during meetings.

Microsoft will also add a new call handler to speed up the Teams desktop client , improving launch times and performance on Windows 11 systems.


Getting a Gemini API key is an exercise in frustration

Hacker News
ankursethi.com
2025-12-10 20:29:12
Comments...
Original Article

Last week, I started working on a new side-project. It’s a standard React app partly made up of run-of-the-mill CRUD views—a perfect fit for LLM-assisted programming. I reasoned that if I could get an LLM to quickly write the boring code for me, I’d have more time to focus on the interesting problems I wanted to solve.

I’ve pretty much settled on Claude Code as my coding assistant of choice, but I’d been hearing great things about Google’s Gemini 3 Pro. Despite my aversion to Google products, I decided to try it out on my new codebase.

I already had Gemini CLI installed, but that only gave me access to Gemini 2.5 with rate limits. I wanted to try out Gemini 3 Pro, and I wanted to avoid being rate limited. I had some spare cash to burn on this experiment, so I went looking for ways to pay for a Gemini Pro plan, if such a thing existed.

Thus began my grand adventure in trying to give Google my money.

What is a Gemini, really?

The name “Gemini” is so overloaded that it barely means anything. Based on the context, Gemini could refer to:

  • The chatbot available at gemini.google.com .
  • The mobile app that lets you use the same Gemini chatbot on your iPhone or Android .
  • The voice assistant on Android phones.
  • The AI features built into Google Workspace , Firebase, Colab, BigQuery, and other Google products.
  • Gemini CLI, an agentic coding tool for your terminal that works the same way as Claude Code or OpenAI Codex.
  • The Gemini Code Assist suite of products, which includes extensions for various IDEs, a GitHub app, and Gemini CLI.
  • The underlying LLM powering all these products.
  • Probably three more products by the time I finish writing this blog post.

To make things even more confusing, Google has at least three different products just for agentic coding: Gemini Code Assist (Gemini CLI is a part of this suite of products), Jules , and Antigravity .

And then there’s a bunch of other GenAI stuff that is powered by Gemini but doesn’t have the word Gemini in the name: Vertex AI Platform , Google AI Studio , NotebookLM , and who knows what else.

I just wanted to plug my credit card information into a form and get access to a coding assistant. Instead, I was dunked into an alphabet soup of products that all seemed to do similar things and, crucially, didn’t have any giant “Buy Now!” buttons for me to click.

In contrast, both Anthropic and OpenAI have two primary ways you can access their products: via their consumer offerings at claude.ai and chatgpt.com respectively, or via API credits that you can buy through their respective developer consoles . In each case, there is a form field where you can plug in your credit card details, and a big, friendly “Buy Now!” button to click.

After half an hour of searching the web, I did the obvious thing and asked the free version of Gemini (the chatbot, not one of those other Geminis) what to do:

How do I pay for the pro version of Gemini so i can use it in the terminal for writing code? I specifically want to use the Gemini 3 Pro model.

It thought for a suspiciously long time and told me that Gemini 3 Pro required a developer API key to use. Since the new model is still in preview, it’s not yet available on any of the consumer plans. When I asked follow up questions about pricing, it told me that “Something went wrong”. Which translates to: we broke something, but we won’t tell you how to fix it.

So I asked Claude for help. Between the two LLMs, I was able to figure out how to create an API key for the Gemini I wanted.

Creating an API key is easy

Google AI Studio is supposed to be the all-in-one dashboard for Google’s generative AI models. This is where you can experiment with model parameters, manage API keys, view logs, and manage billing for your projects.

I logged into Google AI Studio and created a new API key . This part was pretty straightforward: I followed the on-screen instructions and had a fresh new key housed under a project in a few seconds. I then verified that my key was working with Gemini CLI.

It worked! Now all that was left to do was to purchase some API credits. Back in Google AI Studio, I saw a link titled “Set up billing” next to my key. It looked promising, so I clicked it.

That’s where the fun really began.

Google doesn’t want my money

The “Set up billing” link kicked me out of Google AI Studio and into Google Cloud Console, and my heart sank. Every time I’ve logged into Google Cloud Console or AWS, I’ve wasted hours upon hours reading outdated documentation, gazing in despair at graphs that make no sense, going around in circles from dashboard to dashboard, and feeling a strong desire to attain freedom from this mortal coil.

Turns out I can’t just put $100 into my Gemini account. Instead, I must first create a Billing Account. After I’ve done that, I must associate it with a project. Then I’m allowed to add a payment method to the Billing Account. And then , if I’m lucky, my API key will turn into a paid API key with Gemini Pro privileges.

So I did the thing. The whole song and dance. Including the mandatory two-factor OTP verification that every Indian credit card requires. At the end of the process, I was greeted with a popup telling me I had to verify my payment method before I’d be allowed to use it.

Wait. Didn’t I just verify my payment method? When I entered the OTP from my bank?

Nope, turns out Google hungers for more data. Who’d have thunk it?

To verify my payment method for reals , I had to send Google a picture of my government-issued ID and the credit card I’d just associated with my Billing Account. I had to ensure all the numbers on my credit card were redacted by manually placing black bars on top of them in an image editor, leaving only my name and the last four digits of the credit card number visible.

This felt unnecessarily intrusive. But by this point, I was too deep in the process to quit. I was invested. I needed my Gemini 3 Pro, and I was willing to pay any price.

The upload form for the government ID rejected my upload twice before it finally accepted it. It was the same exact ID every single time, just in different file formats. It wanted a PNG file. Not a JPG file, nor a PDF file, but a PNG file. Did the upload form mention that in the instructions? Of course not.

After jumping through all these hoops, I received an email from Google telling me that my verification will be completed in a few days.

A few days ? Nothing to do but wait, I suppose.

403 Forbidden

At this point, I closed all my open Cloud Console tabs and went back to work. But when I was fifteen minutes into writing some code by hand like a Neanderthal, I received a second email from Google telling me that my verification was complete.

So for the tenth time that day, I navigated to AI Studio. For the tenth time I clicked “Set up billing” on the page listing my API keys. For the tenth time I was told that my project wasn’t associated with a billing account. For the tenth time I associated the project with my new billing account. And finally, after doing all of this, the “Quota tier” column on the page listing my API keys said “Tier 1” instead of “Set up billing”.

Wait, Tier 1? Did that mean there were other tiers? What were tiers, anyway? Was I already on the best tier? Or maybe I was on the worst one? Not important. The important part was that I had my API key and I’d managed to convince Google to charge me for it.

I went back to the Gemini CLI, ran the /settings command, and turned on the “Enable experimental features” option. I ran the /models command, which told me that Gemini 3 Pro was now available.

Success? Not yet.

When I tried sending a message to the LLM, it failed with this 403 error:

{
  "error": {
    "message": "{\n  \"error\": {\n    \"code\": 403,\n    \"message\": \"The caller does not have permission\",\n    \"status\":\"PERMISSION_DENIED\"\n  }\n}\n",
    "code": 403,
    "status": "Forbidden"
  }
}

Is that JSON inside a string inside JSON? Yes. Yes it is.

To figure out if my key was even working, I tried calling the Gemini API from JavaScript, reproducing the basic example from Google’s own documentation .

No dice. I ran into the exact same error.

I then tried talking to Gemini 3 Pro using the Playground inside Google AI Studio. It showed me a toast message saying Failed to generate content. Please try again. The chat transcript said An internal error has occurred.

At this point I gave up and walked away from my computer. It was already 8pm. I’d been trying to get things to work since 5pm. I needed to eat dinner, play Clair Obscur , and go to bed. I had no more time to waste and no more fucks to give.

Your account is in good standing at this time

Just as I was getting into bed, I received an email from Google with this subject line:

Your Google Cloud and APIs billing account XXXXXX-XXXXXX-XXXXXX is in good standing at this time.

With the message inside saying:

Based on the information you provided and further analysis by Google, we have reinstated your billing account XXXXXX-XXXXXX-XXXXXX. Your account is in good standing, and you should now have full access to your account and related Project(s) and Service(s).

I have no idea what any of this means, but Gemini 3 Pro started working correctly after I received this email. It worked in the Playground, directly by calling the API from JavaScript, and with Gemini CLI.

Problem solved, I guess. Until Google mysteriously decides that my account is no longer in good standing.

This was a waste of time

This was such a frustrating experience that I still haven’t tried using Gemini with my new codebase, nearly a week after I made all those sacrifices to the Gods of Billing Account.

I understand why the process for getting a Gemini API key is so convoluted. It’s designed for large organizations, not individual developers trying to get work done; it serves the bureaucracy, not the people doing the work; it’s designed for maximum compliance with government regulations, not for efficiency or productivity.

Google doesn’t want my money unless I’m an organization that employs ten thousand people.

In contrast to Google, Anthropic and OpenAI are much smaller and much more nimble. They’re able to make the process of setting up a developer account quick and easy for those of us who just want to get things done. Unlike Google, they haven’t yet become complacent. They need to compete for developer mindshare if they are to survive a decade into the future. Maybe they’ll add the same level of bureaucracy to their processes as they become larger, but for now they’re fairly easy to deal with.

I’m still going to try using Gemini 3 Pro with Gemini CLI as my coding assistant, but I’ll probably cap the experiment to a month. Unless Gemini 3 Pro is a massive improvement over its competitors, I’ll stick to using tools built by organizations that want me as a customer.

glic: Turn any (npm) library into a command-line utility

Lobsters
github.com
2025-12-10 19:42:08
Comments...
Original Article
glic
----
Turn any library into a command-line utility.

EXAMPLES
  $ glic lodash camelCase hello_world
  helloWorld
  $ glic left-pad . 'foo' 5 'x'
  xxfoo
  $ glic lodash startCase hello_world
  Hello World
  $ glic slugify . "Hello, World"
  Hello-World
  $ glic qs parse 'topic=glic&count=2'
  {
    "topic": "glic",
    "count": "2"
  }

Note: If a library exports a single function, use dot (.) to reference it.

INSTALLATION
  make install

SUPPORTED RUNTIMES
  - npm/nodejs

Maybe we don't need a server

Lobsters
lecaro.me
2025-12-10 19:36:04
Comments...
Original Article

Syncthing for notes and photos

Have you heard of Syncthing? It syncs a folder between devices, like Dropbox, without the need for someone else's computer. No cloud: it's just your PC talking to your phone, directly and automatically, through your home WiFi.

It's a near-perfect solution for syncing your phone's photos with your computer, making Google Photos nearly redundant. It works great for text note synchronization too, as long as you don't edit the same file on multiple devices while offline.

This got me thinking: how sensible could personal tech be if it were truly serverless?

More things could be just files

Now, this got me wondering: what if my calendar were just a bunch of text files representing events, stored on my drive? What about my mailbox? I could use POP to yank emails out of the server and add them to a list of email files that would then sync, without the need to store them on the server.

What about my browser? I use Firefox Sync to synchronize my favorites, passwords and settings. Maybe those could just be files; maybe the Firefox profile folder already works when shared between Android and Linux.

My contacts don't need to be on my Google account. They could just be text files. Google Maps location history is creepy, unless it were only for me. My marked places on the map should be plain text files too. Of course my SMS messages should be files.

What about self hosting ?

The main private alternative to this is to self-host a "cloud" service on your own hardware, be it a virtual machine on a VPS or a physical device in your home. In both cases, keeping that device reachable and secure creates a non-trivial workload and requires system administration skills, and it's difficult to migrate data from one app to another when the data lives in a database.

I've been sticking to Google products for so long because I don't want to manage my own server for those things, nor do I want to switch to a different service that could shut down or enshittify tomorrow.

I wish the "data sync" and "application" parts could be separated. This would lower the chance of my data getting stuck in an app that no longer matches my needs.

Getting there

I think that for this to work we would need plugins that sync app content with a local folder of text files, in a way that doesn't break when conflict files are generated, and that generates many small text files instead of, say, one SQLite file.

Sometimes we'd need a full rewrite and a dedicated app, for example a contacts or calendar app on Android.

Sometimes we'd use different apps on mobile and desktop to work with the same files, like I use kanbanmd on desktop and a text editor on mobile for my kanban boards.

CDKTF has been deprecated

Lobsters
github.com
2025-12-10 19:26:06
Comments...
Original Article

# The Future of Terraform CDK


## Sunset Notice


Terraform CDK (CDKTF) will sunset and be archived on December 10, 2025. HashiCorp, an IBM Company, will no longer maintain or develop the project after that date. Unfortunately, Terraform CDK did not find product-market fit at scale. HashiCorp, an IBM Company, has chosen to focus its investments on Terraform core and its broader ecosystem.


As of December 10, 2025, Terraform CDK will be archived on GitHub, and the documentation will reflect its deprecated status. The archived code will remain available on GitHub, but it will be read-only. No further updates, fixes, or improvements (including compatibility updates) will be made.


You will be able to continue to use Terraform CDK at your own risk. Terraform CDK is licensed under the Mozilla Public License (MPL). HashiCorp, an IBM Company, does not apply any additional restrictions. We encourage community forks if there’s interest in continuing development independently.


## Migration to HCL


You can use the following command to generate Terraform-compatible .tf files directly from your Terraform CDK project:


`cdktf synth --hcl`


This will produce readable HCL configuration files, making it easier to migrate away from Terraform CDK. After running the command, you can use standard Terraform CLI commands (`terraform init`, `terraform plan`, `terraform apply`) to continue managing your infrastructure. Please note that while this helps bootstrap your configuration, you may still need to review and adjust the generated files for clarity, organization, or best practices.


### Note on AWS CDK


If your infrastructure is defined in Terraform CDK but also tightly integrated with AWS CDK, you may find it more consistent to migrate directly to the AWS CDK ecosystem. If you are not using AWS CDK, we highly recommend migrating to standard Terraform and HCL for long-term support and ecosystem alignment.


## FAQ


Q: Is CDKTF still being developed?


A: No. CDKTF will sunset and be archived on December 10, 2025. HashiCorp, an IBM Company, will no longer maintain or develop the project after that date.


Q: Why is CDKTF being sunset?


A: CDKTF did not find product-market fit at scale. We’ve chosen to focus our investments on Terraform core and its broader ecosystem.


Q: Will CDKTF be removed from GitHub?


A: CDKTF will be archived on GitHub, and documentation will reflect its deprecated status.


Q: Can I still use CDKTF after it's sunset?


A: Yes, the archived code will remain available on GitHub, but it will be read-only. No further updates, fixes, or improvements will be made.


Q: Will CDKTF continue to support new versions of Terraform or providers?


A: No. Compatibility updates will not be made after the EOL date.


Q: Can I fork CDKTF and maintain it myself?


A: Yes. CDKTF is open source, and we encourage community forks if there’s interest in continuing development independently.


Q: Can I keep using CDKTF?


A: You may continue to use it at your own risk. HashiCorp, an IBM Company, will no longer be maintaining it.


Q: Is there a migration tool?


A: You can use the following command to generate Terraform-compatible .tf files directly from your CDKTF project:


`cdktf synth --hcl`


This will produce readable HCL configuration files, making it easier to migrate away from CDKTF. After running the command, you can use standard Terraform CLI commands (terraform init, terraform plan, terraform apply) to continue managing your infrastructure. Please note that while this helps bootstrap your configuration, you may still need to review and adjust the generated files for clarity, organization, or best practices.


Q: What migration guidance can we provide to customers?


A: For users looking to migrate away from CDKTF:


If your infrastructure is defined in CDKTF but also tightly integrated with AWS CDK, you may find it more consistent to migrate directly to the AWS CDK ecosystem.


If you are not using AWS CDK, we highly recommend migrating to standard Terraform and HCL for long-term support and ecosystem alignment.


A video on the details of how Trunk-Based Development worked at MFT Energy

Lobsters
youtu.be
2025-12-10 19:23:29
Comments...

I got an Nvidia GH200 server for €7.5k on Reddit and converted it to a desktop

Hacker News
dnhkng.github.io
2025-12-10 19:19:17
Comments...
Original Article

Grace Hopper Desktop

Introduction

Running large language models locally has always been a game of compromise. You either spend $10,000+ on consumer GPUs that can barely handle 70B parameter models, or you dream about enterprise hardware you’ll never afford. The Grace-Hopper platform—Nvidia’s unified CPU-GPU superchip architecture—represents the kind of dream-rig AI infrastructure LocalLlama drools over, with systems typically costing well over $100,000 and exclusively available to data centers and research institutions.

So when I stumbled across a Grace-Hopper system being sold for 10K euro on Reddit, my first thought was “obviously fake.” My second thought was “I wonder if he’ll take 7.5K euro?”.

This is the story of how I bought enterprise-grade AI hardware designed for liquid-cooled server racks, converted it to air cooling, survived multiple near-disasters (including GPUs reporting temperatures of 16 million degrees), and ended up with a desktop that can run 235B parameter models at home. It’s a tale of questionable decisions, creative problem-solving, and what happens when you try to turn datacenter equipment into a daily driver.

If you’ve ever wondered what it takes to run truly large models locally, or if you’re just here to watch someone disassemble $80,000 worth of hardware with nothing but hope and isopropanol, you’re in the right place.

The Deal

Early this year, while browsing r/LocalLLaMA/new, I came across a ridiculously good deal . How good? These were the specs for the server offered for 10K euro, and a serious upgrade to my 4x RTX 4090 rig:

Specs:

  • 2x Nvidia Grace-Hopper Superchip
  • 2x 72-core Nvidia Grace CPU
  • 2x Nvidia Hopper H100 Tensor Core GPU
  • 2x 480GB of LPDDR5X memory with error-correction code (ECC)
  • 2x 96GB of HBM3 memory
  • 1152GB of total fast-access memory
  • NVLink-C2C: 900 GB/s of bandwidth
  • Programmable from 1000W to 2000W TDP (CPU + GPU + memory)
  • 1x High-efficiency 3000W PSU 230V to 48V
  • 2x PCIe Gen4 M.2 22110/2280 slots on board
  • 4x FHFL PCIe Gen5 x16

UPDATE: Since I bought this, DDR5 RAM prices have become insane. 960GB of fast DDR5 now costs more than what I paid for the whole Grace-Hopper system 🤯

Obviously fake, I thought, because

  1. H100s cost about 30,000-40,000 euro each, and this system has two of them
  2. Grace-Hopper NVL2 systems are basically not for sale for consumers anyway!

The Reddit thread explained the reason the system was being sold cheap:

The main reason why is that it is a Frankensystem converted from liquid-cooled to aircooled. Also it is not very pretty and not rackable, because it has a 48V power supply attached. It is originally directly from Nvidia.

I immediately offered to buy it, because why not? If it was a scam, I could always back out, but I wanted to be first in line!

It turns out I live near the seller, and he runs an online shop that sells modified Nvidia server equipment as desktops. It still seemed pretty risky, so I did some research and found a video review of one of his desktops on YouTube. With the deal now seeming at least plausible, and the seller only a two-hour drive away and agreeing to take cash, it was time to take a Bavarian road trip.

I arrived at a farmhouse in a small forest and met Bernhard, the proprietor of GPTshop.ai. He showed me a nice workshop (plasma cutters, an electronics lab, etc.) from which he fabricates custom cases for the high-end H100 desktops he builds. These desktops seem pretty damn nice, so it’s unfortunate that his webshop gives off shady vibes; the business registration in the Cayman Islands definitely doesn’t help. What I can say, though, is that this item was heavily discounted and not what he usually sells.

Disclaimer: I have zero affiliation with GPTshop.ai beyond handing them a stack of cash and receiving a dust-covered server in return. If this were a sponsored post, they probably wouldn’t let me mention the 16 million degree GPU temperatures or the part where I had to free-solder components while praying to the electronics gods.

Disassembling the Grace Hopper server

Arrival

The server itself was not in great condition. These things run extremely loud, high-throughput fans, and these had sucked in a lot of dust, coating the mainboard so heavily I couldn’t tell the color of the PCB. However, it booted up and ran OK, so I handed over a wad of cash, strapped it into the backseat of my car with the seatbelt (it weighed ~20 kg), and drove it home.

Did I mention it’s loud? Firing up the system is physically painful. There are 8x Sunon dual-fan modules, and each is as loud as a powerful vacuum cleaner, but with a much higher and more annoying pitch. With all 8 running at full power, hearing protection is necessary - I could hear the system running in my basement with the windows closed from 50 meters away! My wife immediately (and quite fairly) banned its use at home. We both work from home, and it was simply too loud for online meetings. But I had other plans anyway…

First things first: I stripped down the server, after photo-documenting the connectors between the various PCBs, modules, and mainboard.

Cleaning the Server

Cleaning

The majority of the dust was vacuumed off during disassembly, but there was clearly a lot more under the Grace-Hopper modules. After removing those as well, I decided to go with a full washdown of the mainboard.

I purchased a few litres of Isopropanol, and with a soft brush I went over the whole board a few times to get the remaining fine dust from inside connectors and between SMD-component pins.

I suspected there might also be dust inside the Grace-Hopper modules, but actually, I really just wanted to pop them open to poke around.

The mainboard went on my heated floor to dry for a week, while I moved on to replacing the cooling system.

A new Water Cooling system

Adapter Plate

I had looked into building a custom water-cooling block, but I was worried about leaks. Then I found cheap all-in-one water-cooling systems on sale for ~40 euro each. Two per GH200 module would be sufficient, so I carefully measured the dimensions of the GPU die and CPU, as well as screw locations, and threw those into Fusion 360 to model up an adapter block.

I have a Bambu X1, which came in very handy for prototyping the adapter blocks. The tolerances have to be very tight, so I printed several cut-away versions to make sure there was solid contact to the bare GPU die, and a safe margin from contact to fragile parts.

The parts were then sent for CNC milling, and were delivered as the mainboard was finished drying. After using yet more isopropanol to clean off the machining oil, they were mounted without much fuss.

Assembling the Desktop

Assembly

My go-to material for this kind of project is ProfilAlu from eBay. It’s cheap, stiff, and delivered pre-cut for assembly. I put together a design in Fusion 360 and had the parts in a few days. The various mounts, however, were much more work: I needed to design a few dozen custom mounts for the various PCBs and air-filter fixings, which used up a few kilos of filament to get things just right.

Disaster(s)

Critical Fan Errors

The system wouldn’t boot anymore. Checking the logs, I saw 16 critical errors, one for each fan in the 8 pairs:

4 08/06/25 19:24:08 CEST Fan FAN_5_F Lower Critical going low Asserted Reading 0 < Threshold 2156 RPM
5 08/06/25 19:24:08 CEST Fan FAN_6_R Lower Critical going low Asserted Reading 0 < Threshold 2156 RPM
6 08/06/25 19:24:08 CEST Fan FAN_8_F Lower Critical going low Asserted Reading 0 < Threshold 2156 RPM
7 08/06/25 19:24:08 CEST Fan FAN_5_R Lower Critical going low Asserted Reading 0 < Threshold 2156 RPM
8 08/06/25 19:24:08 CEST Fan FAN_7_F Lower Critical going low Asserted Reading 0 < Threshold 2156 RPM
9 08/06/25 19:24:08 CEST Fan FAN_8_R Lower Critical going low Asserted Reading 0 < Threshold 2156 RPM…

With the fans removed, the BMC (Baseboard Management Controller) immediately panicked, and shut down the mainboard to prevent thermal damage, even with the water coolers in place. So, I disabled the fan-check subsystem.

# stops the service for the current session
systemctl stop phosphor-sensor-monitor.service

# prevents the service from starting on the next boot
systemctl disable phosphor-sensor-monitor.service

Who needs hardware monitoring? ¯\_(ツ)_/¯

Nuclear Fusion?

Great! I could start the boot process, and even reach login! But only about 1 time in 4… Not optimal. And even logged in, the server would crash within 2 minutes.

Looking into the BMC logs, I saw:

Sep 23 08:20:18 oberon-bmc shutdown_ok_mon[1478] event: FALLING EDGE offset: 26 timestamp: [571.615238550]
Sep 23 08:20:18 oberon-bmc power-status[1493] event: FALLING EDGE offset: 18 timestamp: [571.632491062]
Sep 23 08:20:18 oberon-bmc shutdown_ok_mon[545] SHDN_OK_L-I = 0
Sep 23 08:20:18 oberon-bmc shutdown_ok_mon[545] Asserting SYS_RST_IN_L-O to hold host in reset
Sep 23 08:20:18 oberon-bmc shutdown_ok_mon[545] gpioset SYS_RST_IN_L-O = 0
Sep 23 08:20:18 oberon-bmc power-status[697] gpioset SYS_RST_IN_L-O = 0
Sep 23 08:20:18 oberon-bmc power-status[697] Set SYS_RST_IN_L-O=0

So, a Critical Failure at 08:20:18:

  • SHDN_OK_L-I signal goes low (falling edge detected)
  • This immediately triggers a shutdown sequence
  • System powers off within ~30 seconds of successful boot

But why?!!? I had shut down the hardware monitoring.

Diving deeper into the logs:

Oct 05 10:15:00 oberon-bmc ipmid[520] thresholdChanged: Assert
Oct 05 10:15:00 oberon-bmc ipmid[520] thresholdChanged: Assert
Oct 05 10:15:00 oberon-bmc ipmid[520] thresholdChanged: Assert
Oct 05 10:15:00 oberon-bmc satellitesensor[2351] Sensor HGX_GPU_1_TEMP_1 high threshold 92 assert: value 1.67772e+07 raw data nan
Oct 05 10:15:00 oberon-bmc satellitesensor[2351] Sensor HGX_GPU_1_TEMP_1 high threshold 89 assert: value 1.67772e+07 raw data nan
Oct 05 10:15:00 oberon-bmc satellitesensor[2351] Sensor HGX_GPU_1_TEMP_1 high threshold 87 assert: value 1.67772e+07 raw data nan
Oct 05 10:15:00 oberon-bmc phosphor-fru-fault-monitor[524] /xyz/openbmc_project/logging/entry/496 created
Oct 05 10:15:00 oberon-bmc phosphor-fru-fault-monitor[524] /xyz/openbmc_project/logging/entry/497 created
Oct 05 10:15:00 oberon-bmc sensor-monitor[499] Starting 1000ms HardShutdownAlarmHigh shutdown timer due to sensor /xyz/openbmc_project/sensors/temperature/HGX_GPU_0_TEMP_1 value 16777214

Warning: Your GPU should not reach 16,777,214 Celsius during boot. Imagine what would happen under load!

This took some time to debug, as I was quite sure the sensors could not physically handle reading temperatures over 16 million Celsius… But then I noticed something interesting about that specific number:

Decimal       Binary                          Hex
16,777,214    1111 1111 1111 1111 1111 1110   0xFFFFFE

This is 2²⁴ - 2, which is suspiciously close to the maximum value of a 24-bit unsigned integer. In the hardware world, this is the equivalent of a sensor throwing up its hands and screaming “I have no idea what’s happening!” When hardware can’t read a value properly—whether due to a loose connection, damaged circuit, or initialization failure—it often returns the maximum (or near-maximum) representable value. It’s like the digital version of a shrug.

The logs confirmed this theory: seeing 1.67772e+07 (16,777,214) wasn’t evidence that my GPU had achieved nuclear fusion temperatures 🔥—it was evidence that the temperature sensor had simply stopped working. And if a sensor error is intermittent, the most likely culprit is a loose connection or physical damage.
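
To make that concrete, here’s a tiny Python sketch of this kind of sanity check (the 24-bit width is what the value itself implies; the exact cut-off for treating a reading as a failed read is an assumption for illustration):

SENSOR_BITS = 24
SENTINEL_FLOOR = (1 << SENSOR_BITS) - 16   # anything this close to 2^24 smells like a failed read, not a temperature

def classify(reading_celsius: int) -> str:
    if reading_celsius >= SENTINEL_FLOOR:
        return "sensor fault (near the 24-bit maximum, i.e. a failed read)"
    if reading_celsius > 90:               # the BMC's own thresholds in the logs were 87-92 C
        return "over-temperature"
    return "ok"

assert 16_777_214 == (1 << 24) - 2 == 0xFFFFFE   # the exact value from the BMC logs
for value in (41, 93, 16_777_214):
    print(value, "->", classify(value))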

After spending way too long pursuing software solutions (because who wants to disassemble everything again?), I finally accepted the inevitable and broke out the screwdrivers.

Fix

I happened to have bought a new microscope earlier this year, and it turned out to be the perfect tool for diagnosing and fixing the issue. Near one of the modules, I found some damaged surface-mount components. The damage must have happened after cleaning, probably during the reassembly of the modules with the copper adapters. They weigh over 2 kg, so a slight bump would have easily caused this damage. Amazingly, the tiny components were still attached to the traces, so I could measure them easily: a 100 nF capacitor and a 4.7k resistor (both of which I had on hand, as they are standard values for decoupling circuits). The bad news? My spare parts were huge “0805” packages (2mm long), while these were tiny “0402” parts (1mm long). And one of the traces was just gone.

With some very fiddly soldering, and scratching off the solder mask on the PCB to expose more trace, I was able to ‘free solder’ the parts into a wonderful 3D sculpture which was then liberally coated in UV-curing mask resin, set, and then held in place with sticky tape. Very professional. After reassembly, the system booted smoothly.

Final Touches

Fix

I 3D printed a few extra parts:

  • Mounts for the E1.S 8TB SSD I found cheap online
  • A full rear panel that mounts the 3KW 48V power supply
  • Cool-looking mesh to protect the water-cooling radiators and dust filters

Getting the actual GPU working was also painful, so I’ll leave the details here for future adventurers:

# Data Center/HGX-Series/HGX H100/Linux aarch64/12.8 seem to work!
wget https://us.download.nvidia.com/tesla/570.195.03/NVIDIA-Linux-aarch64-570.195.03.run

# Tell the driver to completely ignore the NVLINK and it should allow the GPUs to initialise independently over PCIe !!!!   This took a week of work to find, thanks Reddit!

# create a modprobe config file:
sudo nano /etc/modprobe.d/nvidia-disable-nvlink.conf

# add the driver option
options nvidia NVreg_NvLinkDisable=1

# update the boot files:
sudo update-initramfs -u

# reboot
sudo reboot

Benchmarks

That’s what you’re here for, maybe? I have only just started, but after compiling the latest llama.cpp version using 144 cores in 90 seconds, here are some benchmarks on larger LLMs:

Benchmarks

Model                                 Prompt Processing (t/s)   Token Generation (t/s)
gpt-oss-120b-Q4_K_M                   2974.79                   195.84
GLM-4.5-Air-Q4_K_M                    1936.65                   100.71
Qwen3-235B-A22B-Instruct-2507-Q4_K    1022.79                    65.90

This is pretty unoptimized, but it’s looking promising so far! During the LLM tests I hit around 300W per GPU, far from the 900W max.

Cost Breakdown

Here’s what the entire build actually cost me, from the initial purchase to the final touches:

Component Description Cost (EUR)
Grace-Hopper Server 2x GH200 superchips with H100 GPUs (the Frankenstein special) €7,500
Storage ‘like-new’ used 8TB E1.S NVMe SSD €250
Custom Water Cooling Adapters 2x CNC-milled copper mounting plates for AIO coolers €700
AIO Water Coolers 4x Arctic Liquid Freezer III 420 (B-Ware) €180
Structural Frame Extruded aluminum profiles, pre-cut and delivered €200
3D Printing Filament 1kg black PLA for custom mounts and brackets €20
Hardware Nuts, bolts, and mounting hardware €50
Cleaning Supplies 5 liters of 99.9% isopropanol (used liberally throughout) €20
Aesthetics LED lighting strip (because RGB makes it faster) €10
Total €8,930

Not included: hearing protection (absolutely necessary), the microscope I already owned (but proved essential), several failed 3D prints, and the emotional cost of seeing “16,777,214°C” in system logs.

Conclusion

So, was it worth it? I now have a desktop that can run 235B parameter models at home for less than the cost of a single H100. It required disassembling $80,000 worth of enterprise hardware, debugging sensors that reported temperatures approaching the surface of the sun, and free-soldering components under a microscope. Your mileage may vary. Literally: I had to drive two hours to pick this thing up.

The future of Terraform CDK

Hacker News
github.com
2025-12-10 19:14:03
Comments...
Original Article

The Future of Terraform CDK

Sunset Notice

Terraform CDK (CDKTF) will sunset and be archived on December 10, 2025. HashiCorp, an IBM Company, will no longer maintain or develop the project after that date. Unfortunately, Terraform CDK did not find product-market fit at scale. HashiCorp, an IBM Company, has chosen to focus its investments on Terraform core and its broader ecosystem.

As of December 10, 2025, Terraform CDK will be archived on GitHub, and the documentation will reflect its deprecated status. The archived code will remain available on GitHub, but it will be read-only. No further updates, fixes, or improvements (including compatibility updates) will be made.

You will be able to continue to use Terraform CDK at your own risk. Terraform CDK is licensed under the Mozilla Public License (MPL). HashiCorp, an IBM Company, does not apply any additional restrictions. We encourage community forks if there’s interest in continuing development independently.

Migration to HCL

You can use the following command to generate Terraform-compatible .tf files directly from your Terraform CDK project:

cdktf synth --hcl

This will produce readable HCL configuration files, making it easier to migrate away from Terraform CDK. After running the command, you can use standard Terraform CLI commands (terraform init, terraform plan, terraform apply) to continue managing your infrastructure. Please note that while this helps bootstrap your configuration, you may still need to review and adjust the generated files for clarity, organization, or best practices.

Note on AWS CDK

If your infrastructure is defined in Terraform CDK but also tightly integrated with AWS CDK, you may find it more consistent to migrate directly to the AWS CDK ecosystem. If you are not using AWS CDK, we highly recommend migrating to standard Terraform and HCL for long-term support and ecosystem alignment.

FAQ

Q: Is CDKTF still being developed?

A: No. CDKTF will sunset and be archived on December 10, 2025. HashiCorp, an IBM Company, will no longer maintain or develop the project after that date.

Q: Why is CDKTF being sunset?

A: CDKTF did not find product-market fit at scale. We’ve chosen to focus our investments on Terraform core and its broader ecosystem.

Q: Will CDKTF be removed from GitHub?

A: CDKTF will be archived on GitHub, and documentation will reflect its deprecated status.

Q: Can I still use CDKTF after it's sunset?

A: Yes, the archived code will remain available on GitHub, but it will be read-only. No further updates, fixes, or improvements will be made.

Q: Will CDKTF continue to support new versions of Terraform or providers?

A: No. Compatibility updates will not be made after the EOL date.

Q: Can I fork CDKTF and maintain it myself?

A: Yes. CDKTF is open source, and we encourage community forks if there’s interest in continuing development independently.

Q: Can I keep using CDKTF?

A: You may continue to use it at your own risk. HashiCorp, an IBM Company, will no longer be maintaining it.

Q: Is there a migration tool?

A: You can use the following command to generate Terraform-compatible .tf files directly from your CDKTF project:

cdktf synth --hcl

This will produce readable HCL configuration files, making it easier to migrate away from CDKTF. After running the command, you can use standard Terraform CLI commands (terraform init, terraform plan, terraform apply) to continue managing your infrastructure. Please note that while this helps bootstrap your configuration, you may still need to review and adjust the generated files for clarity, organization, or best practices.

Q: What migration guidance can we provide to customers?

A: For users looking to migrate away from CDKTF:

If your infrastructure is defined in CDKTF but also tightly integrated with AWS CDK, you may find it more consistent to migrate directly to the AWS CDK ecosystem.

If you are not using AWS CDK, we highly recommend migrating to standard Terraform and HCL for long-term support and ecosystem alignment.



CDK for Terraform

Cloud Development Kit for Terraform (CDKTF) allows you to use familiar programming languages to define cloud infrastructure and provision it through HashiCorp Terraform. This gives you access to the entire Terraform ecosystem without learning HashiCorp Configuration Language (HCL) and lets you leverage the power of your existing toolchain for testing, dependency management, etc.

We currently support TypeScript, Python, Java, C#, and Go.


CDKTF includes two packages:

  • cdktf-cli - A CLI that allows users to run commands to initialize, import, and synthesize CDK for Terraform applications.
  • cdktf - A library for defining Terraform resources using programming constructs.

Get Started

Choose a language:

Hands-on: Try the tutorials in the CDK for Terraform collection on HashiCorp Learn.

Documentation

Refer to the CDKTF documentation for more detail about how to build and manage CDKTF applications, including:

  • Application Architecture : Learn the tools and processes that CDKTF uses to leverage the Terraform ecosystem and convert code into Terraform configuration files. It also explains the major components of a CDKTF application and how those pieces fit together.

  • Project Setup : Learn how to create a new CDKTF project from a pre-built or custom template. Also learn how to convert an existing HCL project into a CDKTF application.

  • Unit Tests : Learn how to test your application in Typescript with jest.

  • Examples : Reference example projects in every supported language and review explanatory videos and other resources.

Community

The development team would love your feedback to help guide the project.

Build

For prerequisites, refer to the following.

Clone the project repository.

git clone https://github.com/hashicorp/terraform-cdk.git

Download dependencies.

cd terraform-cdk/
yarn install

Build the project and packages.

Super Mario 64 for the PS1

Hacker News
github.com
2025-12-10 18:58:55
Comments...
Original Article

This repo does not include all assets necessary for compiling the game. An original copy of the game is required to extract the assets.

sm64
├── actors: object behaviors, geo layout, and display lists
├── assets: animation and demo data
│   ├── anims: animation data
│   └── demos: demo data
├── bin: C files for ordering display lists and textures
├── build: output directory
├── data: behavior scripts, misc. data
├── doxygen: documentation infrastructure
├── enhancements: example source modifications
├── include: header files
├── levels: level scripts, geo layout, and display lists
├── lib: N64 SDK code
├── sound: sequences, sound samples, and sound banks
├── src: C source code for game
│   ├── audio: audio code
│   ├── buffers: stacks, heaps, and task buffers
│   ├── engine: script processing engines and utils
│   ├── game: behaviors and rest of game source
│   ├── goddard: rewritten Mario intro screen
│   ├── goddard_og: backup of original Mario intro screen
│   ├── menu: title screen and file, act, and debug level selection menus
│   └── port: port code, audio and video renderer
├── text: dialog, level names, act names
├── textures: skybox and generic texture data
└── tools: build tools

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise

Hacker News
arxiv.org
2025-12-10 18:37:27
Comments...
Original Article

View PDF HTML (experimental)

Abstract: For decades, procedural worlds have been built on procedural noise functions such as Perlin noise, which are fast and infinite, yet fundamentally limited in realism and large-scale coherence. We introduce Terrain Diffusion, an AI-era successor to Perlin noise that bridges the fidelity of diffusion models with the properties that made procedural noise indispensable: seamless infinite extent, seed-consistency, and constant-time random access. At its core is InfiniteDiffusion, a novel algorithm for infinite generation, enabling seamless, real-time synthesis of boundless landscapes. A hierarchical stack of diffusion models couples planetary context with local detail, while a compact Laplacian encoding stabilizes outputs across Earth-scale dynamic ranges. An open-source infinite-tensor framework supports constant-memory manipulation of unbounded tensors, and few-step consistency distillation enables efficient generation. Together, these components establish diffusion models as a practical foundation for procedural world generation, capable of synthesizing entire planets coherently, controllably, and without limits.

Submission history

From: Alexander Goslin [ view email ]
[v1] Tue, 9 Dec 2025 07:10:35 UTC (12,075 KB)

10 Years of Let's Encrypt Certificates

Linux Weekly News
lwn.net
2025-12-10 18:29:51
Let's Encrypt has published a retrospective that covers the decade since it published its first publicly trusted certificate in September 2015: In March 2016, we issued our one millionth certificate. Just two years later, in September 2018, we were issuing a million certificates every day. In 2020...
Original Article

Let's Encrypt has published a retrospective that covers the decade since it published its first publicly trusted certificate in September 2015:

In March 2016, we issued our one millionth certificate. Just two years later, in September 2018, we were issuing a million certificates every day. In 2020 we reached a billion total certificates issued and as of late 2025 we're frequently issuing ten million certificates per day. We're now on track to reach a billion active sites, probably sometime in the coming year.


Kroah-Hartman: Linux CVEs, more than you ever wanted to know

Linux Weekly News
lwn.net
2025-12-10 18:24:35
Greg Kroah-Hartman is writing a series of blog posts about Linux becoming a CVE Numbering Authority (CNA):
Original Article

Greg Kroah-Hartman is writing a series of blog posts about Linux becoming a Certificate Numbering Authority (CNA):

It's been almost 2 full years since Linux became a CNA (CVE Numbering Authority) which meant that we (i.e. the kernel.org community) are now responsible for issuing all CVEs for the Linux kernel. During this time, we've become one of the largest creators of CVEs by quantity, going from nothing to number 3 in 2024 to number 1 in 2025. Naturally, this has caused some questions about how we are both doing all of this work, and how people can keep track of it.

So far, Kroah-Hartman has published the introductory post, as well as a detailed post about kernel version numbers that is well worth reading.



Intermittent hypoxia increases blood flow and benefits executive function

Hacker News
onlinelibrary.wiley.com
2025-12-10 18:24:13
Comments...

Over 10,000 Docker Hub images found leaking credentials, auth keys

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 18:22:25
More than 10,000 Docker Hub container images expose data that should be protected, including live credentials to production systems, CI/CD databases, or LLM model keys. [...]...
Original Article

Over 10,000 Docker Hub images found leaking credentials, auth keys

More than 10,000 Docker Hub container images expose data that should be protected, including live credentials to production systems, CI/CD databases, or LLM model keys.

The secrets impact a little over 100 organizations, among them a Fortune 500 company and a major national bank.

Docker Hub is the largest container registry where developers upload, host, share, and distribute ready-to-use Docker images that contain everything necessary to run an application.

Developers typically use Docker images to streamline the entire software development and deployment lifecycle. However, as past studies have shown, carelessness in creating these images can result in exposing secrets that remain valid for extended periods.

After scanning container images uploaded to Docker Hub in November, security researchers at threat intelligence company Flare found that 10,456 of them exposed one or more keys.

The most frequent secrets were access tokens for various AI models (OpenAI, HuggingFace, Anthropic, Gemini, Groq). In total, the researchers found 4,000 such keys.

When examining the scanned images, the researchers discovered that 42% of them exposed at least five sensitive values.

"These multi-secret exposures represent critical risks, as they often provide full access to cloud environments, Git repositories, CI/CD systems, payment integrations, and other core infrastructure components," Flare notes in a report today.

Size of secret exposure (Source: Flare)

Analyzing 205 namespaces enabled the researchers to identify a total of 101 companies, mostly small and medium-sized businesses, with a few large enterprises being present in the dataset.

Based on the analysis, most of the organizations with exposed secrets are in the software development sector, followed by entities in the market and industrial sector, and in AI and intelligent systems.

More than 10 finance and banking companies had their sensitive data exposed.

Types of firms that exposed secrets on Docker Hub in November (Source: Flare)

According to the researchers, one of the most frequent errors observed was shipping .env files, which developers use to store database credentials, cloud access keys, tokens, and various authentication data for a project.

Additionally, they found API tokens for AI services hardcoded in Python application files, config.json files, and YAML configs, as well as GitHub tokens and credentials for multiple internal environments.

Some of the sensitive data was present in the manifest of Docker images, a file that provides details about the image.

Many of the leaks appear to originate from the so-called 'shadow IT' accounts, which are Docker Hub accounts that fall outside of the stricter corporate monitoring mechanisms, such as those for personal use or belonging to contractors.

Flare notes that roughly 25% of developers who accidentally exposed secrets on Docker Hub realized the mistake and removed the leaked secret from the container or manifest file within 48 hours.

However, in 75% of these cases, the leaked key was not revoked, meaning that anyone who stole it during the exposure period could still use it later to mount attacks.

Exposed secrets exploitation diagram (Source: Flare)

Flare suggests that developers avoid storing secrets in container images, stop using static, long-lived credentials, and centralize their secrets management using a dedicated vault or secrets manager.

Organizations should implement active scanning across the entire software development life cycle and revoke exposed secrets and invalidate old sessions immediately.
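
For a flavour of what such scanning looks like, here is a minimal Python sketch that walks the files of an unpacked image and flags strings that look like live credentials (the directory name and the handful of regex rules are illustrative assumptions; real scanners ship far more patterns):

import pathlib
import re

PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub token": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "OpenAI-style key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    ".env secret assignment": re.compile(r"(?im)^[A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD)[A-Z0-9_]*=.+"),
}

def scan(root: str) -> None:
    base = pathlib.Path(root)
    if not base.is_dir():
        print(f"{root} not found; unpack an image there first (e.g. via `docker save` + untar)")
        return
    for path in base.rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for label, pattern in PATTERNS.items():
            for match in pattern.findall(text):
                # only print a short prefix so the scan itself does not leak the secret
                print(f"{path}: possible {label}: {match[:12]}…")

if __name__ == "__main__":
    scan("extracted-image/")  # hypothetical directory holding unpacked image layers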


SWIM: Outsourced Heartbeats

Lobsters
benjamincongdon.me
2025-12-10 17:59:53
Comments...
Original Article

How does a distributed system reliably determine when one of its members has failed? This is a tricky problem: you need to deal with unreliable networks and the fact that nodes can crash at arbitrary times, and you need to do so in a way that scales to thousands of nodes. This is the role of a failure detection system, and it is one of the most foundational parts of many distributed systems.

There are many rather simple ways to solve this problem, but one of the most elegant solutions to distributed failure detection is an algorithm that I first encountered in undergraduate Computer Science: the SWIM Protocol. SWIM here stands for “Scalable Weakly Consistent Infection-style Process Group Membership”.

What is a Failure Detector

SWIM solves the problem of distributed failure detection. Let’s sketch out this problem in a bit more detail. Suppose you have a dynamic pool of servers (or “processes”, as they’re often called in distributed algorithms). Processes will enter the pool as they’re turned on, and leave the pool as they are turned off, crash, or become inaccessible from the network. The problem statement is that we want to be able to detect when any one of these servers goes offline.

The uses for such failure detection systems abound: determining node failures in workload schedulers like Kubernetes, maintaining membership lists for peer-to-peer protocols, and determining whether a replica has failed in a distributed database like Cassandra.

As a toy example, let’s pick something concrete: let’s say we’re building a distributed key-value store. We’re just getting started with our KV-store, so for today we just need to set up the membership system, wherein distributed replicas of the data can discover each other. Each replica needs to know of the existence of all the other replicas, so they can run more complicated distributed systems protocols – things like consensus, leader election, distributed locking, etc. All those other algorithms are fun, but we first need the handshake of “who am I talking to?” and “I need to know when one of my peers fails”. That’s failure detection.

What properties do we want for our failure detector?

  • If a server fails, we should know about it within some bounded time. This is the Detection Time.
  • Faster detection time is preferable.
  • If a process doesn’t die, we shouldn’t mark it as dead. If we relax this to allow for “false positives”, we ideally want a quite low False Positive Rate.
  • The amount of load on each node should scale sublinearly with the number of nodes in the pool. That is, ideally the amount of work per node is the same whether we have 100 nodes or 1,000,000 nodes.

What properties do we assume about nodes and the network?

  • The network can drop packets or become partitioned.
  • Nodes can crash at arbitrary times during program execution. We cannot rely on nodes doing anything (like sending an “I’m crashing” message) just prior to crashing, for example.
  • All nodes can send network packets to all other nodes.
  • Packets on the network have propagation time.

Reasoning up to Failure Detectors

What is the most naive functional failure detector that we could build? Well, for some node $N_1$ in a node pool of ${ N_1, N_2, …, N_{10} }$, $N_1$ could ping each of nodes ${ N_2, …, N_{10} }$. Each of those pinged nodes could send an acknowledgment of the ping as soon as the ping was received. When $N_1$ receives an acknowledgment from, say, $N_4$, then $N_1$ knows that $N_4$ is still “alive”.

Ping and acknowledgment

Looking at this figure, if $N_1$ waits for some chunk of time, maybe retries a few times, but still never hears back from, say, $N_4$, then it can mark $N_4$ as “dead”.

Every time we hear a ping back from another node, we know it’s alive “now”. But we need to know the state of all members over time, so we run this procedure in sequence many times. We can call the time between each ping we send out $T_{ping}$, and the time we wait to hear back $T_{fail}$. Both of these become tunable parameters in our system. If we increase $T_{fail}$, we decrease the chance that we accidentally mark a neighbor as “dead” due to e.g. a transient network blip. But if we increase it too much, we also allow an actually dead node to sit in our membership list for a long time – which we don’t want either.

Time between pings

Generalizing this, each of ${ N_1, N_2, …, N_{10} }$ all independently run this ping approach. Finally, to bootstrap the process, we can hardcode all the member node IPs in at startup. If a node crashes but later comes back online, it can broadcast a ping to each of its peers, who then mark it as “alive” and start pinging it regularly again.

All-to-all heartbeating

Woohoo, we’ve just invented a basic form of heartbeating !

What are the characteristics of our basic system?

  • Number of messages: $k$ per $T_{ping}$ seconds, for each node, where $k$ is the number of nodes in our pool
  • Time to first detection: $T_{fail}$.

What about accuracy? Well… since we allow for an unreliable network, it’s sadly provably impossible to have both completeness (all failures detected) and accuracy (no false positives). See this paper if you’re curious. In a perfect network with no packet loss, we would have strong accuracy, however.

Time to SWIM

Surely we can do better than “the most naive thing we could think of”. This is where SWIM comes in.

SWIM combines two insights: first, that all-to-all heartbeating like our first example results in a LOT of overlapping communication. If nodes were able to “share” the information they gathered more effectively, we could cut down on the number of messages sent. Second, network partitions often only affect parts of a network. Just because $N_1$ can talk to $N_2$ but can’t talk to $N_3$, this doesn’t necessarily mean that $N_2$ can’t talk to $N_3$.

The tagline for SWIM is “outsourced heartbeats”, and it works like this:

  • Similar to all-to-all heartbeating, each node maintains its own membership list of all the other nodes it’s aware of. In addition to this list, each node also keeps track of a “last heard from” timestamp for each of the known members.
  • Every $T_{ping}$ seconds, each node ($N_1$) sends a ping to one other randomly selected node in its membership list ($N_{other}$). If $N_1$ receives a response from $N_{other}$, then $N_1$ updates its “last heard from” timestamp of $N_{other}$ to be the current time.
  • Here’s the outsourced heartbeating piece: If $N_1$ does not hear from $N_{other}$, then $N_1$ contacts $j$ other randomly selected nodes on its membership list and requests that they ping $N_{other}$. If any of those other nodes are able to successfully contact $N_{other}$, then they inform $N_1$ and both parties update their “last heard from” time for $N_{other}$.
Outsourced heartbeating

  • If after $T_{fail}$ seconds, $N_1$ still hasn’t heard from $N_{other}$, then $N_1$ marks $N_{other}$ as failed.

At this point, $N_1$ has determined that $N_{other}$ has failed. To make the whole process happen more rapidly, $N_1$ spreads this information to the rest of the network, usually by including information about $N_{other}$’s failure in its ping messages to neighbors, from which it gradually propagates through the entire network gossip style.

SWIM protocol
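
To make the control flow concrete, here is a minimal single-process Python sketch of one protocol period (the network is replaced by an in-memory reachability check, so this only illustrates the ping / ping-req / $T_{fail}$ logic, not a real wire protocol, and all names are made up):

import random

T_PING = 1.0   # protocol period, in simulated seconds
T_FAIL = 5.0   # how long a peer may stay silent before we declare it dead
J = 3          # number of helpers asked to ping on our behalf

NODES = [f"N{i}" for i in range(1, 11)]
alive = set(NODES) - {"N4"}          # N4 has crashed

def reachable(a, b):                 # toy model: a can reach b iff b is still alive
    return b in alive

last_heard = {n: {p: 0.0 for p in NODES if p != n} for n in NODES}

def protocol_period(me, now):
    peers = list(last_heard[me])
    target = random.choice(peers)

    if reachable(me, target):        # direct ping answered
        last_heard[me][target] = now
        return

    # Outsourced heartbeats: ask J random helpers to ping the target for us.
    others = [p for p in peers if p != target]
    helpers = random.sample(others, k=min(J, len(others)))
    if any(reachable(me, h) and reachable(h, target) for h in helpers):
        last_heard[me][target] = now
        return

    if now - last_heard[me][target] > T_FAIL:      # still silent past T_fail: declare failure
        print(f"t={now:4.1f}s  {me} marks {target} as FAILED")
        del last_heard[me][target]                 # a real node would now gossip this on its pings

for tick in range(1, 11):
    for node in alive:
        protocol_period(node, tick * T_PING)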

What are the characteristics of this more sophisticated failure detector?

  • Number of messages per node per interval: $1$ ping in the common case. At most $1 + j$ when outsourcing is triggered (where $j$ is the number of “outsourced” ping requests).
  • Time to first detection: It takes some fancy math to get to this point, but in expectation this is $\frac{e}{e-1} * T_{ping}$

What’s notable about these two properties is that: First, the number of messages we send no longer scales with the size of the pool. We could have 1M or 10M or 100M nodes and still send the same number of messages. Second, the expected time to first detection also is still independent of the number of nodes. It’s a constant, tunable by the $T_{ping}$ interval.
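
To put rough numbers on that, here is a quick back-of-the-envelope comparison using the figures from the two characteristics lists above (the $e/(e-1)$ constant is taken directly from the SWIM analysis):

import math

T_PING = 1.0   # seconds per protocol period
J = 3          # ping-req fanout when outsourcing kicks in

for n in (100, 10_000, 1_000_000):
    all_to_all = n - 1                           # one ping to every other node, per period
    swim_worst = 1 + J                           # one ping, plus J ping-reqs when outsourcing triggers
    detect = math.e / (math.e - 1) * T_PING      # ~1.58 periods, independent of n
    print(f"n={n:>9,}: all-to-all sends {all_to_all:>9,} msgs/period, "
          f"SWIM sends at most {swim_worst}, expected detection ~{detect:.2f}s")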

Why Did SWIM Stick With Me

So that’s the trick. It’s quite simple! Just outsource some of your detection to your neighbors. What made SWIM stick with me is that it is clever. It seems to legitimately require some inspiration to get to this solution. We could try to build up other approaches on top of all-to-all heartbeating, but most of the obvious improvements aren’t competitive with SWIM. For example:

  • Subset heartbeating, where you pick a subset of your membership list and only ping those. This reduces the number of messages you need to send, but increases the time to detection significantly.
  • Centralized heartbeating, where you elect one node as a leader and it’s the only one that sends the pings and has authority over the membership list. This also reduces the number of total network messages, but puts undue load on a single node.
  • Basic Gossip Propagation, which looks like “SWIM without the outsourced heartbeats”. Health information is piggy-backed on ping packets, but you only ever rely on your own direct pings. This also has reduced network messages and bounded per-node load, but takes $O(\log(N))$ ping intervals to propagate through the whole network – not constant like SWIM gives you.

All of these involve tradeoffs that SWIM avoids. SWIM is simple, elegant, solves a challenging problem, and felt to me like “algorithm design has something interesting to say about distributed systems”. That’s ultimately why it’s stuck with me since I learned of it.

Further Reading

Show HN: A 2-row, 16-key keyboard designed for smartphones

Hacker News
k-keyboard.com
2025-12-10 17:49:28
Comments...
Original Article

What makes QWERTY mini different from the ones above?


1. Symmetric 16-key 2-row layout makes the up-and-down movement of both thumbs extremely comfortable.

Each key becomes up to 66% larger. (cf. going from the iPhone Pro to the Pro Max gives about a 20% increase.)

2. The most important point is that vowels form the central axis of a word.

The five vowels (A, E, U, I, O) remain as standalone keys in their original QWERTY positions. This eliminates any conflicts with consonants and preserves a natural typing flow. (This fixes the problems that other reduced-key layouts failed to solve.)


3. Frequency-based consonant integration. 10 letters (Q, Z, X, V, B, J, K, F, G, P) appear in about 10% of English text.

4. You can type everything with just tap and double-tap.

Simultaneous taps using the four triggers (W, A, O, L) increase speed and reduce delay.

5. The structure is optimized for multilingual extended characters and split layouts in landscape mode, even if these features come later.

​QWERTY mini is not a replacement for QWERTY.

It's the companion for smartphones.


Show HN: Automated license plate reader coverage in the USA

Hacker News
alpranalysis.com
2025-12-10 17:42:30
Comments...

Is it a bubble?

Hacker News
www.oaktreecapital.com
2025-12-10 17:30:43
Comments...
Original Article

Ours is a remarkable moment in world history. A transformative technology is ascending, and its supporters claim it will forever change the world. To build it requires companies to invest a sum of money unlike anything in living memory. News reports are filled with widespread fears that America’s biggest corporations are propping up a bubble that will soon pop.

During my visits to clients in Asia and the Middle East last month, I was often asked about the possibility of a bubble surrounding artificial intelligence, and my discussions gave rise to this memo. I want to start off with my usual caveats: I’m not active in the stock market; I merely watch it as the best barometer of investor psychology. I’m also no techie, and I don’t know any more about AI than most generalist investors. But I’ll do my best.

One of the most interesting aspects of bubbles is their regularity, not in terms of timing, but rather the progression they follow. Something new and seemingly revolutionary appears and worms its way into people’s minds. It captures their imagination, and the excitement is overwhelming. The early participants enjoy huge gains. Those who merely look on feel incredible envy and regret and – motivated by the fear of continuing to miss out – pile in. They do this without knowledge of what the future will bring or concern about whether the price they’re paying can possibly be expected to produce a reasonable return with a tolerable amount of risk. The end result for investors is inevitably painful in the short to medium term, although it’s possible to end up ahead after enough years have passed.

I’ve lived through several bubbles and read about others, and they’ve all hewed to this description. One might think the losses experienced when past bubbles popped would discourage the next one from forming. But that hasn’t happened yet, and I’m sure it never will. Memories are short, and prudence and natural risk aversion are no match for the dream of getting rich on the back of a revolutionary technology that “everyone knows” will change the world.

I took the quote that opens this memo from Derek Thompson’s November 4 newsletter entitled “AI Could Be the Railroad of the 21st Century. Brace Yourself,” about parallels between what’s going on today in AI and the railroad boom of the 1860s. Its word-for-word applicability to both shows clearly what’s meant by the phrase widely attributed to Mark Twain: “history rhymes.”

Understanding Bubbles

Before diving into the subject at hand – and having read a great deal about it in preparation – I want to start with a point of clarification. Everyone asks, “Is there a bubble in AI?” I think there’s ambiguity even in the question. I’ve concluded there are two different but interrelated bubble possibilities to think about: one in the behavior of companies within the industry, and the other in how investors are behaving with regard to the industry. I have absolutely no ability to judge whether the AI companies’ aggressive behavior is justified, so I’ll try to stick primarily to the question of whether there’s a bubble around AI in the financial world.

The main job of an investment analyst – especially in the so-called “value” school to which I subscribe – is to (a) study companies and other assets and assess the level of and outlook for their intrinsic value and (b) make investment decisions on the basis of that value. Most of the change the analyst encounters in the short to medium term surrounds the asset’s price and its relationship to underlying value. That relationship, in turn, is essentially the result of investor psychology.

Market bubbles aren’t caused directly by technological or financial developments. Rather, they result from the application of excessive optimism to those developments. As I wrote in my January memo On Bubble Watch, bubbles are temporary manias in which developments in those areas become the subject of what former U.S. Federal Reserve Chairman Alan Greenspan called “irrational exuberance.”

Bubbles usually coalesce around new financial developments (e.g., the South Sea Company of the early 1700s or sub-prime residential mortgage-backed securities in 2005-06) or technological progress (optical fiber in the late 1990s and the internet in 1998-2000). Newness plays a huge part in this. Because there’s no history to restrain the imagination, the future can appear limitless for the new thing. And futures that are perceived to be limitless can justify valuations that go well beyond past norms – leading to asset prices that aren’t justified on the basis of predictable earning power.

The role of newness is well described in my favorite passage from a book that greatly influenced me, A Short History of Financial Euphoria by John Kenneth Galbraith. Galbraith wrote about what he called “the extreme brevity of the financial memory” and pointed out that in the financial markets, “past experience, to the extent that it is part of memory at all, is dismissed as the primitive refuge of those who do not have the insight to appreciate the incredible wonders of the present.” In other words, history can impose limits on awe regarding the present and imagination regarding the future. In the absence of history, on the other hand, all things seem possible.

The key thing to note here is that the new thing understandably inspires great enthusiasm, but bubbles are what happen when the enthusiasm reaches irrational proportions. Who can identify the boundary of rationality? Who can say when an optimistic market has become a bubble? It’s just a matter of judgment.

Something that occurred to me this past month is that two of my best “calls” came in 2000, when I cautioned about what was going on in the market for tech and internet stocks, and in 2005-07, when I cited the dearth of risk aversion and the resulting ease of doing crazy deals in the pre-Global Financial Crisis world.

  • First, in neither case did I possess any expertise regarding the things that turned out to be the subjects of the bubbles: the internet and sub-prime mortgage-backed securities. All I did was render observations regarding the behavior taking place around me.

  • And second, the value in my calls consisted mostly of describing the folly in that behavior, not in insisting that it had brought on a bubble.

Struggling with whether to apply the “bubble” label can bog you down and interfere with proper judgment; we can accomplish a great deal by merely assessing what’s going on around us and drawing inferences with regard to proper behavior.

What’s Good About Bubbles?

Before going on to discuss AI and whether it’s presently in a bubble, I want to spend a little time on a subject that may seem somewhat academic from the standpoint of investors: the upside of bubbles. You may find the attention I devote to this topic excessive, but I do so because I find it fascinating.

The November 5 Stratechery newsletter was entitled “The Benefits of Bubbles.” In it, Ben Thompson (no relation to Derek) cites a book titled Boom: Bubbles and the End of Stagnation. It was written by Byrne Hobart and Tobias Huber, who propose that there are two kinds of bubbles:

. . . “Inflection Bubbles” – the good kind of bubbles, as opposed to the much more damaging “Mean-reversion Bubbles” like the 2000’s subprime mortgage bubble.

I find this a useful dichotomy.

  • The financial fads I’ve read about or witnessed – the South Sea Company, portfolio insurance, and sub-prime mortgage-backed securities – stirred the imagination based on the promise of returns without risk, but there was no expectation that they would represent overall progress for mankind. There was, for example, no thought that housing would be revolutionized by the sub-prime mortgage movement, merely a feeling that there was money to be made from backing new buyers. Hobart and Huber call these “mean-reverting bubbles,” presumably because there’s no expectation that the underlying developments would move the world forward. Fads merely rise and fall.

  • On the other hand, Hobart and Huber call bubbles based on technological progress – as in the case of the railroads and the internet – “inflection bubbles.” After an inflection-driven bubble, the world will not revert to its prior state. In such a bubble, “investors decide that the future will be meaningfully different from the past and trade accordingly.” As Thompson tells us:

The definitive book on bubbles has long been Carlota Perez’s Technological Revolutions and Financial Capital. Bubbles were – are – thought to be something negative and to be avoided, particularly at the time Perez published her book. The year was 2002 and much of the world was in a recession coming off the puncturing of the dot-com bubble.

Perez didn’t deny the pain: in fact, she noted that similar crashes marked previous revolutions, including the Industrial Revolution, railways, electricity, and the automobile. In each case the bubbles were not regrettable, but necessary: the speculative mania enabled what Perez called the “Installation Phase,” where necessary but not necessarily financially wise investments laid the groundwork for the “Deployment Period.” What marked the shift to the deployment period was the popping of the bubble; what enabled the deployment period were the money-losing investments. (All emphasis added)

This distinction is very meaningful for Hobart and Huber, and I agree. They say, “not all bubbles destroy wealth and value. Some can be understood as important catalysts for techno-scientific progress.”

But I would restate as follows: “Mean-reversion bubbles” – in which markets soar on the basis of some new financial miracle and then collapse – destroy wealth. On the other hand, “inflection bubbles” based on revolutionary developments accelerate technological progress and create the foundation for a more prosperous future, and they destroy wealth. The key is to not be one of the investors whose wealth is destroyed in the process of bringing on progress.

Hobart and Huber go on to describe in greater depth the process through which bubbles finance the building of the infrastructure required by the new technology and thus accelerate its adoption:

Most novel technology doesn’t just appear ex nihilo [i.e., from nothing], entering the world fully formed and all at once. Rather, it builds on previous false starts, failures, iterations, and historical path dependencies. Bubbles create opportunities to deploy the capital necessary to fund and speed up such large-scale experimentation – which includes lots of trial and error done in parallel – thereby accelerating the rate of potentially disruptive technologies and breakthroughs.

By generating positive feedback cycles of enthusiasm and investment, bubbles can be net beneficial. Optimism can be a self-fulfilling prophecy. Speculation provides the massive financing needed to fund highly risky and exploratory projects; what appears in the short term to be excessive enthusiasm or just bad investing turns out to be essential for bootstrapping social and technological innovations . . . A bubble can be a collective delusion, but it can also be an expression of collective vision. That vision becomes a site of coordination for people and capital and for the parallelization of innovation. Instead of happening over time, bursts of progress happen simultaneously across different domains. And with mounting enthusiasm . . . comes increased risk tolerance and strong network effects. The fear of missing out, or FOMO, attracts even more participants, entrepreneurs, and speculators, further reinforcing this positive feedback loop. Like bubbles, FOMO tends to have a bad reputation, but it’s sometimes a healthy instinct. After all, none of us wants to miss out on a once-in-a-lifetime chance to build the future.

In other words, bubbles based on technological progress are good because they excite investors into pouring in money – a good bit of which is thrown away – to carpet-bomb a new area of opportunity and thus jump-start its exploitation.

The key realization seems to be that if people remained patient, prudent, analytical, and value-insistent, novel technologies would take many years and perhaps decades to be built out. Instead, the hysteria of the bubble causes the process to be compressed into a very short period – with some of the money going into life-changing investment in the winners but a lot of it being incinerated.

A bubble has aspects that are both technological and financial, but the above citations are from the standpoint of people who crave technological progress and are perfectly happy to see investors lose money in its interest. “We,” on the other hand, would like to see technological progress but have no desire to throw away money to help bring it about.

Ben Thompson ends this discussion by saying, “This is why I’m excited to talk about new technologies, the prospect for which I don’t know.” I love the fact that he’s excited by future possibilities and at the same time admits that the shape of the future is unknown (in our world, we might say “very risky”).

Assessing the Current Landscape

Now let’s get down to what we used to call “brass tacks.” What do we know? First, I haven’t met anyone who doesn’t believe artificial intelligence has the potential to be one of the biggest technological developments of all time, reshaping both daily life and the global economy.

We also know that in recent years, economies and markets have become increasingly dependent on AI:

  • AI is responsible for a very large portion of companies’ total capital expenditures.

  • Capital expenditures on AI capacity account for a large share of the growth in U.S. GDP.

  • AI stocks have been the source of the vast majority of the gains of the S&P 500.

As a Fortune headline put it on October 7:

75% of gains, 80% of profits, 90% of capex – AI’s grip on the S&P is total and Morgan Stanley’s top analyst is ‘very concerned’

Further, I think it’s important to note that whereas the gains in AI-related stocks account for a disproportionate percentage of the total gains in all stocks, the excitement AI injects into the market must have added a lot to the appreciation of non-AI stocks as well.

AI-related stocks have shown astronomical performance, led by Nvidia, the leading developer of computer chips for AI. From its formation in 1993 and its initial public offering in 1999, when its estimated market value was $626 million, Nvidia briefly became the world’s first company worth $5 trillion. That’s appreciation of around 8,000x, or roughly 40% a year for 26+ years. No wonder imaginations have been fired.

What Are the Areas of Uncertainty?

I think it’s fair to say that while we know AI will be a source of incredible change, most of us have no idea exactly what it will be able to do, how it will be applied commercially, or what the timing will be.

Who will be the winners, and what will they be worth? If a new technology is assumed to be a world changer, it’s invariably assumed that the leading companies possessing that technology will be of great value. But how accurate will that assumption prove to be? As Warren Buffett pointed out in 1999, “[The automobile was] the most important invention, probably, of the first half of the 20th century. . . . If you had seen at the time of the first cars how this country would develop in connection with autos, you would have said, ‘This is the place I must be.’ But of the 2,000 companies, as of a few years ago, only three car companies survived. So autos had an enormous impact on America but the opposite direction on investors.” (Time, January 23, 2012)

In AI, there are some very strong leaders at present, including some of the world’s strongest and richest companies. But new technology is notoriously disruptive. Will today’s leaders prevail or give way to upstarts? How much will the arms race cost, and who will win?

Similarly, what’s a share in an upstart worth? Unlike front runners worth trillions, it’s possible to invest in some would-be challengers at enterprise values in mere billions or even – might I say? – millions. On June 25, 2024, CNBC reported as follows:

A team founded by college dropouts has raised $120 million from investors led by Primary Venture Partners to build a new AI chip to take on Nvidia. Etched CEO Gavin Uberti said the startup is betting that as AI develops, most of the technology’s power-hungry computing requirements will be filled by customized, hard-wired chips called ASICs. “If transformers go away, we’ll die,” Uberti told CNBC. “But if they stick around, we’re the biggest company of all time.”

Even granting the possibility that Etched won’t become the biggest company of all time, if success could give them a valuation just one-fifth of Nvidia’s peak – a mere $1 trillion – what probability of success would be required to justify an investment of $120 million? Assuming for simplicity’s sake that the investment was for a 100% ownership stake, all you need is a belief that achieving the trillion-dollar value has a probability of one-tenth of a percent for an expected return of over eight times your money. Who’s to say Etched doesn’t have that chance? And in that case, why would anyone not play? The foregoing is what I call “lottery-ticket thinking,” in which the dream of an enormous payoff justifies – no, compels – participation in an endeavor with an overwhelming probability of failing.
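
To make the arithmetic explicit (under the memo’s simplifying assumption of a 100% ownership stake):

\[
E[\text{payoff}] = 0.001 \times \$1\ \text{trillion} = \$1\ \text{billion},
\qquad
\frac{\$1\ \text{billion}}{\$120\ \text{million}} \approx 8.3\times.
\]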

There’s nothing wrong with calculating expected values this way. Leading venture capitalists engage in it every day to great effect. But assumptions regarding the possible payoffs and their probabilities must be reasonable. Thinking about a trillion-dollar payout will override reasonableness in any calculation.

Will AI produce profits, and for whom? Two things we know little or nothing about are the profits AI will produce for vendors and its impact on non-AI companies, primarily meaning those who employ it.

Will AI be a monopoly or duopoly, in which one or two leading companies are able to charge dearly for the capabilities? Or will it be a highly competitive free-for-all in which a number of firms compete on price for users’ spending on AI services, making it a commodity? Or, perhaps most likely, will it be a mix of leading companies and specialized players, some of whom compete on price and others through proprietary advantages? It’s said that the services currently responding to AI queries, such as ChatGPT and Gemini, lose money on every query they answer (of course, it’s not unusual for participants in a new industry to offer “loss leaders” for a while). Will the leading tech firms – used to success in winner-take-all markets – be content to experience losses in their AI businesses for years in order to gain share? Hundreds of billions of dollars are being committed to the race for AI leadership. Who will win, and what will be the result?

Likewise, what will be AI’s impact on the companies that use it? Clearly, AI will be a great tool for enhancing users’ productivity by, among other things, replacing workers with computer-sourced labor and intelligence. But will this ability to cut costs add to the profit margins of the companies that employ it? Or will it simply enable price wars among those companies in the pursuit of customers? In that case, the savings might be passed on to the customers rather than garnered by the companies. In other words, is it possible AI will increase the efficiency of businesses without increasing their profitability?

Should we worry about so-called “circular deals”? In the telecom boom of the late 1990s, in which optical fiber became overbuilt, fiber-owning companies engaged in transactions with each other that permitted them to report profits. If two companies own fiber, they just have an asset on their books. But if each buys capacity from the other, they can both report profits . . . so they did. In other cases, manufacturers loaned network operators money to buy equipment from them, before the operators had customers to justify the buildout. All this resulted in profits that were illusory.

Nowadays, deals are being announced in which money appears to be round-tripped between AI players. People who believe there’s an AI bubble find it easy to view these transactions with suspicion. Is the purpose to achieve legitimate business goals or to exaggerate progress?

Adding to worries, critics say, some of the deals that OpenAI has made with chipmakers, cloud computing companies and others are oddly circular. OpenAI is set to receive billions from tech companies but also sends billions back to the same companies to pay for computing power and other services. . . .

Nvidia has also made some deals that have raised questions about whether the company is paying itself. It announced that it would invest $100 billion in OpenAI. The start-up receives that money as it buys or leases Nvidia’s chips. . . .

Goldman Sachs has estimated that Nvidia will make 15 percent of its sales next year from what critics also call circular deals. ( The New York Times , November 20)

Noteworthily, OpenAI has made investment commitments to industry counterparties totaling $1.4 trillion, even though it has yet to turn a profit. The company makes clear that the investments are to be paid out of revenues received from the same parties and that it has ways to back out of these commitments. But all this raises the question of whether the AI industry has developed a perpetual motion machine.

(On this subject, I’ve been enjoying articles questioning the ability of people to relate to the word “trillion,” and I think this idea is spot on. A million dollars is a dollar a second for 11.6 days. A billion dollars is a dollar a second for 31.7 years. We get that. But a trillion dollars is a dollar a second for 31,700 years. Who can get their head around the significance of 31,700 years?)

What will be the useful life of AI assets? We have to wonder whether the topic of obsolescence is being handled correctly in AI-land. What will be the lifespan of AI chips? How many years of earnings growth should be counted on in assigning p/e ratios for AI-related stocks? Will chips and other aspects of AI infrastructure last long enough to repay the debt undertaken to buy them? Will artificial general intelligence (a machine capable of doing anything the human brain can do) be achieved? Will that be the end of progress, or might there be further revolutions, and what firms will win them? Will firms reach a position where technology is stable and they can extract economic value from it? Or will new technologies continually threaten to supplant older ones as the route to success?

In this connection, a single issue of an FT newsletter briefly mentioned two developments that suggest the fluid nature of the competitive landscape:

  • A study by the Massachusetts Institute of Technology and open-source AI start-up Hugging Face found that the total share of downloads of new Chinese-made open models rose to 17 per cent in the past year. The figure surpasses the 15.8 per cent share of downloads from American developers such as Google, Meta and OpenAI – the first time Chinese groups have beaten their American counterparts. . . .

  • Nvidia shares fell sharply yesterday on fears that Google is gaining ground in artificial intelligence, erasing $115bn in market value from the AI chipmaker. ( FirstFT Americas , November 26)

Dynamic change creates the opportunity for incredible new technologies, but that same dynamism can threaten the leading companies’ reign. Amid all these uncertainties, investors must ask whether the assumption of continued success incorporated in the prices they’re paying is fully warranted.

Is exuberance leading to speculative behavior? For an extreme example, I’ll cite the trend toward venture capital investments in startups via $1 billion “seed rounds.” Here’s one vignette:

Thinking Machines, an AI startup helmed by former OpenAI executive Mira Murati, just raised the largest seed round in history: $2 billion in funding at a $10 billion valuation. The company has not released a product and has refused to tell investors what they’re even trying to build. “It was the most absurd pitch meeting,” one investor who met with Murati said. “She was like, ‘So we're doing an AI company with the best AI people, but we can’t answer any questions.’ ” (“This Is How the AI Bubble Will Pop,” Derek Thompson Substack, October 2)

But that’s ancient history. . . already two months old. Here’s an update:

Thinking Machines Lab, the artificial intelligence startup founded by former OpenAI executive Mira Murati, is in early talks to raise a new funding round at a roughly $50 billion valuation, Bloomberg News reported on Thursday. The startup was last valued at $12 billion in July, after it raised about $2 billion. ( Reuters , November 13)

And Thinking Machines Lab isn’t alone:

In one of the boldest bets yet in the AI arms race, Safe Superintelligence (SSI), the stealth startup founded by former OpenAI chief scientist Ilya Sutskever, has raised $2 billion in a round that values the company at $32 billion – despite having no publicly released product or service. ( CTech by Calcalist , April 13)

What’s the end state? Part of the issue with AI is the unusual nature of this newest thing. This isn’t like a business that designs and sells a product, making money if the selling price exceeds the cost of the inputs. Rather, it’s as if companies are building an airplane while it’s in flight; only once it’s built will they know what it can do and whether anyone will pay for its services.

Many companies justify their spending because they’re not just building a product, they’re creating something that will change the world : artificial general intelligence, or A.G.I. . . . The rub is that none of them quite know how to do it.

But Anton Korinek, an economist at the University of Virginia, said the spending would all be justified if Silicon Valley reached its goal. He is optimistic it can be done.

“It’s a bet on A.G.I. or bust,” Dr. Korinek said. ( The New York Times , November 20 – emphasis added)

The yet-to-be-determined nature of the industry under construction is best captured in remarks from Sam Altman, the CEO of OpenAI, that have been paraphrased as follows: “we’ll build this sort of generally intelligent system and then ask it to figure out a way to generate an investment return from it.”

This should give pause to people who heretofore fully comprehended the nature of the businesses they invested in. Clearly, the value of a technology that equals or surpasses the human brain should be pretty big, but isn’t it well beyond calculation?

A Word About the Use of Debt

To date, much of the investment in AI and the supporting infrastructure has consisted of equity capital derived from operating cash flow. But now, companies are committing amounts that require debt financing, and for some of those companies, the investments and leverage have to be described as aggressive.

The AI data centre boom was never going to be financed with cash alone. The project is too big to be paid for out of pocket. JPMorgan analysts have done some sums on the back of a napkin, or possibly a tablecloth, and estimated the bill for the infrastructure build-out would come to $5tn (not including a tip). Who knows if that’s right, but we have good reason to expect close to half a trillion in spending next year. Meanwhile, the biggest spenders (Microsoft, Alphabet, Amazon, Meta and Oracle) had only about $350bn in the bank, collectively, as of the end of the third quarter. (“Unhedged,” Financial Times , November 13)

The firms mentioned above derive healthy cash flows from their very strong non-AI businesses. But the massive, winner-take-all arms race in AI is requiring some to take on debt. In fact, it’s reasonable to think one of the reasons they’re spending vast sums is to make it hard for lesser firms to keep up.

Oracle, Meta, and Alphabet have issued 30-year bonds to finance AI investments. In the case of the latter two, the yields on the bonds exceed those on Treasurys of like maturity by 100 basis points or less. Is it prudent to accept 30 years of technological uncertainty to make a fixed-income investment that yields little more than riskless debt? And will the investments funded with debt – in chips and data centers – maintain their level of productivity long enough for these 30-year obligations to be repaid?

On November 14, Alex Kantrowitz’s Big Technology Podcast carried a conversation with Gil Luria, Head of Technology Research at financial services firm D.A. Davidson, primarily regarding the use of debt in the AI sector. Here’s some of what Luria had to say:

  • Healthy behavior is being practiced by “. . . reasonable, thoughtful business leaders, like the ones at Microsoft, Amazon, and Google that are making sound investments in growing the capacity to deliver AI. And the reason they can make sound investments is that they have all the customers. . . And so, when they make investments, they’re using cash on their balance sheets; they have tremendous cash flow to back it up; they understand that it’s a risky investment; and they balance it out.”

  • Unhealthy behavior – Here he describes “. . . a startup that is borrowing money to build data centers for another startup. They’re both losing tremendous amounts of cash, and yet they’re somehow being able to raise this debt capital in order to fund this buildout, again without having the customers or the visibility into those investments paying off.”

  • “So there’s a whole range of behaviors between healthy and unhealthy, and we just need to sort that out so we don’t make the mistakes of the past.”

  • “There are certain things we finance through equity, through ownership, and there are certain things we finance through debt, through an obligation to pay down interest over time. And as a society, for the longest time, we’ve had those two pieces in their right place. Debt is when I have a predictable cash flow and/or an asset that can back that loan, and then it makes sense for me to exchange capital now for future cash flows to the lender. . . . We use equity for investing in more speculative things, for when we want to grow and we want to own that growth, but we’re not sure about what the cash flow is going to be. That’s how a normal economy functions. When you start confusing the two you get yourself in trouble.”

Among potentially worrisome factors, Luria cites these:

  • “A speculative asset . . . we don’t know how much of it we’re really going to need in two to five years.”

  • Lender personnel with incentives to make loans but no exposure to long-term consequences

  • The possibility that the supply of AI capacity catches up with or surpasses the demand

  • The chance that future generations of AI chips will be more powerful, obsoleting existing ones or reducing their value as backing for debt

  • Powerful competitors who vie for market share by cutting rental rates and running losses

Here are some important paragraphs from Azeem Azhar’s Exponential View of October 18:

When does an AI boom tip into a bubble? [Investor and engineer] Paul Kedrosky points to the Minsky moment – the inflection point when credit expansion exhausts its good projects and starts chasing bad ones, funding marginal deals with vendor financing and questionable coverage ratios. For AI infrastructure, that shift may already be underway; the telltale signs include hyperscalers’ capex outpacing revenue momentum and lenders sweetening terms to keep the party alive.

Paul makes a compelling case. We’ve entered speculative finance territory – arguably past the tentative stage – and recent deals will set dangerous precedents. As Paul warns, this financing will “create templates for future such transactions,” spurring rapid expansion in junk issuance and SPV proliferation among hyperscalers chasing dominance at any cost. . . .

For AI infrastructure, the warning signs are flashing: vendor financing proliferates, coverage ratios thin, and hyperscalers leverage balance sheets to maintain capex velocity even as revenue momentum lags. We see both sides – genuine infrastructure expansion alongside financing gymnastics that recall the 2000 telecom bust. The boom may yet prove productive, but only if revenue catches up before credit tightens. When does healthy strain become systemic risk? That’s the question we must answer before the market does. (Emphasis added)

Azhar references the use of off-balance sheet financing via special-purpose vehicles, or SPVs, which were among the biggest contributors to Enron’s precariousness and eventual collapse. A company and its partners set up an SPV for some specific purpose(s) and supply the equity capital. The parent company may have operating control, but because it doesn’t have majority ownership, it doesn’t consolidate the SPV on its financial statements. The SPV takes on debt, but that debt doesn’t appear on the parent’s books. The parent may be an investment grade borrower, but likewise, the debt isn’t an obligation of the parent or guaranteed by it. Today’s debt may be backed by promised rent from a data center tenant – sometimes an equity partner – but the debt isn’t a direct obligation of the equity partner either. Essentially, an SPV is a way to make it look like a company isn’t doing the things the SPV is doing and doesn’t have the debt the SPV does. (Private equity funds and private credit funds are highly likely to be found among the partners and lenders in these entities.)

As I quoted earlier, according to Perez (who wrote on the heels of the dot-com bubble), “what enabled the deployment period were the money-losing investments.” Early investment is lost in the “Minsky moment,” in which unwise commitments made in an extended up-cycle encounter value destruction in a correction. And there are three things we know for sure about the use of debt:

  • it magnifies losses if there are losses (just as it magnifies the hoped-for gains if they materialize),

  • it increases the probability of a venture failing if it encounters a difficult moment, and

  • despite the layer of equity beneath it, it puts lenders’ capital at risk if the difficult moment is bad enough.

One key risk to consider is the possibility that the boom in data center construction will result in a glut. Some data centers may be rendered uneconomic, and some owners may go bankrupt. In that case, a new generation of owners might buy up centers at pennies on the dollar from lenders who foreclosed on them, reaping profits when the industry stabilizes. This is a process through which “creative destruction” brings markets into equilibrium and reduces costs to levels that make future business profitable.

Debt is neither a good thing nor a bad thing per se . Likewise, the use of leverage in the AI industry shouldn’t be applauded or feared. It all comes down to the proportion of debt in the capital structure; the quality of the assets or cash flows you’re lending against; the borrowers’ alternative sources of liquidity for repayment; and the adequacy of the safety margin obtained by lenders. We’ll see which lenders maintain discipline in today’s heady environment.

It’s worth noting in this connection that Oaktree has made a few investments in data centers, and our parent, Brookfield, is raising a $10 billion fund for investment in AI infrastructure. Brookfield is putting up its own money and has equity commitments from sovereign wealth funds and Nvidia, to which it intends to apply “prudent” debt. Brookfield’s investments seem likely to go largely into geographies that are less saturated with data centers and for infrastructure to supply the vast amounts of electric power that data centers will require. Of course, we’re both doing these things on the basis of what we think are prudent decisions.

I know I don’t know enough to opine on AI. But I do know something about debt, and it’s this:

  • It’s okay to supply debt financing for a venture where the outcome is uncertain.

  • It’s not okay where the outcome is purely a matter of conjecture.

  • Those who understand the difference still have to make the distinction correctly.

The FT’s Unhedged quotes Chong Sin, lead analyst for CMBS research at JPMorgan, as saying, “. . . in our conversations with investment grade ABS and CMBS investors, one often-cited concern is whether they want to take on the residual value risk of data centers when the bonds mature.” I’m glad potential lenders are asking the kind of questions they should.

Here’s how to think about the intersection of debt and AI according to Bob O’Leary, Oaktree’s co-CEO and co-portfolio manager of our Opportunities Funds:

Most technological advances develop into winner-takes-all or winner-takes-most competitions. The “right” way to play this dynamic is through equity, not debt. Assuming you can diversify your equity exposures so as to include the eventual winner, the massive gain from the winner will more than compensate for the capital impairment on the losers. That’s the venture capitalist’s time-honored formula for success.

The precise opposite is true of a diversified pool of debt exposures. You’ll only make your coupon on the winner, and that will be grossly insufficient to compensate for the impairments you’ll experience on the debt of the losers.

Of course, if you can’t identify the pool of companies from which the winner will emerge, the difference between debt and equity is irrelevant – you’re a zero either way. I mention this because that’s precisely what happened in search and social media: early leaders (Lycos in search and MySpace in social media) lost out spectacularly to companies that emerged later (Google in search and Facebook in social media).

Trying to Get to a Conclusion

There can be no doubt that today’s behavior is “speculative,” defined as based on speculation regarding the future. There’s also no doubt that no one knows what the future holds, but investors are betting huge sums on that future.

In that connection, I want to say a little about the unique nature of AI. The AI revolution is different from the technological revolutions that preceded it in ways that are both wonderful and worrisome . It feels to me like a genie has been released from a bottle, and it isn’t going back in:

AI may not be a tool for mankind, but rather something of a replacement. It may be capable of taking over cognition, on which humans have thus far had a monopoly. Because of this, it’s likely to be different in kind from prior developments, not just in degree. (More on this in my postscript.)

AI technology is progressing at an incredibly rapid clip , possibly leaving scant time for mankind to adjust. I’ll provide two examples:


  • Coding, which we called “computer programming” 60 years ago, is the canary in the coal mine in terms of the impact of AI. In many advanced software teams, developers no longer write the code; they type in what they want, and AI systems generate the code for them. Coding performed by AI is at a world-class level, something that wasn’t so just a year ago. According to my guide here, “There is no speculation about whether or not human replacement will take place in that vertical.”

  • In the field of digital advertising, when users log into an app, AI engages in “ad matching,” showing them ads tailored to the preferences displayed by their prior surfing. No humans need apply to do this job.

Perhaps most importantly, the growth of demand for AI seems totally unpredictable. As one of my younger advisers explained, “the speed and scale of improvement mean it’s incredibly hard to forecast demand for AI. Adoption today may have nothing to do with adoption tomorrow, because a year or two from now, AI may be able to do 10x or 100x what it can do today. Thus, how can anyone say how many data centers will be needed? And how can even successful companies know how much computing capacity to contract for?”

With differences like these, how can anyone correctly judge what AI implies for the future?

*            *            *

One of the things occupying many observers at this juncture – including me – is the search for parallels to past bubbles. Here’s some historical perspective from a recent article in Wired :

AI’s closest historical analogue here may be not electric lighting but radio. When RCA started broadcasting in 1919, it was immediately clear that it had a powerful information technology on its hands. But less clear was how that would translate into business. “Would radio be a loss-leading marketing for department stores? A public service for broadcasting Sunday sermons? An ad-supported medium for entertainment?” [Brent Goldfarb and David A. Kirsch of the University of Maryland] write. “All were possible. All were subjects of technological narratives.” As a result, radio turned into one of the biggest bubbles in history – peaking in 1929, before losing 97 percent of its value in the crash. This wasn’t an incidental sector; RCA was, along with Ford Motor Company, the most highly traded stock on the market. It was, as The New Yorker recently wrote, “the Nvidia of its day.” . . .

In 1927, Charles Lindbergh flew the first solo nonstop transatlantic flight from New York to Paris. . . . It was the biggest tech demo of the day, and it became an enormous, ChatGPT-launch-level coordinating event – a signal to investors to pour money into the industry.

“Expert investors appreciated correctly the importance of airplanes and air travel,” Goldfarb and Kirsch write, but “the narrative of inevitability largely drowned out their caution. Technological uncertainty was framed as opportunity, not risk. The market overestimated how quickly the industry would achieve technological viability and profitability.”

As a result, the bubble burst in 1929 – from its peak in May, aviation stocks dropped 96 percent by May 1932. . . .

It’s worth reiterating that two of the closest analogs AI seems to have in tech bubble history are aviation and broadcast radio. Both were wrapped in high degrees of uncertainty and both were hyped with incredibly powerful coordinating narratives. Both were seized on by pure play companies seeking to capitalize on the new game-changing tech, and both were accessible to the retail investors of the day. Both helped inflate a bubble so big that when it burst, in 1929, it left us with the Great Depression. (“AI Is the Bubble to Burst Them All,” Brian Merchant, Wired , October 27 – emphasis added. N.b., the Depression had many causes beyond the bursting of the radio/aviation bubble.)

Derek Thompson, who supplied the quote with which I opened this memo, ended his newsletter with some terrific historical perspective:

The railroads were a bubble and they transformed America. Electricity was a bubble, and it transformed America. The broadband build-out of the late-1990s was a bubble that transformed America. I am not rooting for a bubble, and quite the contrary, I hope that the US economy doesn’t experience another recession for many years. But given the amount of debt now flowing into AI data center construction, I think it’s unlikely that AI will be the first transformative technology that isn’t overbuilt and doesn’t incur a brief painful correction. (“AI Could Be the Railroad of the 21st Century. Brace Yourself.” November 4 – emphasis added)

The skeptics readily cite ways in which today’s events are comparable to the internet bubble:

  • A change-the-world technology

  • Exuberant, speculative behavior

  • The role of FOMO

  • Suspect, circular deals

  • The use of SPVs

  • $1 billion seed rounds

The supporters have reasons why the comparison isn’t appropriate:

  • An existing product for which there is strong demand

  • One billion users already (many times the number of internet users at the height of the bubble)

  • Well-established main players with revenues, profits, and cash flow

  • The absence of an IPO craze with prices doubling in a day

  • Reasonable p/e ratios for the established participants

I’ll elaborate regarding the first of the proposed non-comparable factors. Unlike in the internet bubble, AI products already exist at scale, the demand for them is exploding, and they’re producing revenues in rapidly increasing amounts. For example, Anthropic, one of the two leaders in producing models for AI coding as described on page 12, is said to have “10x-ed” its revenues in each of the last two years (for those who didn’t study higher math, that’s 100x in two years). Revenues from Claude Code, a program for coding that Anthropic introduced earlier this year, already are said to be running at an annual rate of $1 billion. Revenues for the other leader, Cursor, were $1 million in 2023 and $100 million in 2024, and they, too, are expected to reach $1 billion this year.

As to the final bullet point, see the table below, which comes from Goldman Sachs via Derek Thompson. You’ll notice that during the internet bubble of 1998-2000, the p/e ratios were much higher for Microsoft, Cisco, and Oracle than they are today for the biggest AI players – Nvidia, Microsoft, Alphabet, Amazon, and Meta (OpenAI doesn’t have earnings). In fact, Microsoft’s on a half-off sale relative to its p/e 26 years ago! In the first bubble I witnessed – surrounding the Nifty-Fifty in 1969-72 – the p/e ratios for the leading companies were even higher than those of 1998-2000.


Exhibit 7

In Conclusion

For my final citation, I’ll look to Sam Altman of OpenAI. His comments seem to me to capture the essence of what’s going on:

“When bubbles happen, smart people get overexcited about a kernel of truth,” Mr. Altman told reporters this year. “Are we in a phase where investors as a whole are overexcited about A.I.? My opinion is yes. Is A.I. the most important thing to happen in a very long time? My opinion is also yes.” ( The New York Times , November 20)

But do I have a bottom line? Yes, I do. Alan Greenspan’s phrase, mentioned earlier, serves as an excellent way to sum up a stock market bubble: “irrational exuberance.” There is no doubt that investors are exuberant with regard to AI. The question is whether it’s irrational. Given the vast potential of AI but also the large number of enormous unknowns, I think virtually no one can say for sure. We can theorize about whether the current enthusiasm is excessive, but we won’t know until years from now whether it was. Bubbles are best identified in retrospect.

While the parallels to past bubbles are inescapable, believers in the technology will argue that “this time it’s different.” Those four words are heard in virtually every bubble, explaining why the present situation isn’t a bubble, unlike the analogous prior ones. On the other hand, Sir John Templeton, who in 1987 drew my attention to those four words, was quick to point out that 20% of the time things really are different. But on the third hand, it must be borne in mind that behavior based on the belief that it’s different is what causes it to not be different!

Today’s situation calls to mind a comment attributed to American economist Stuart Chase about faith. I believe it’s also applicable to AI (as well as to gold and cryptocurrencies):

For those who believe, no proof is necessary. For those who don't believe, no proof is possible.

Here’s my actual bottom line:

  • There’s a consistent history of transformational technologies generating excessive enthusiasm and investment, resulting in more infrastructure than is needed and asset prices that prove to have been too high. The excesses accelerate the adoption of the technology in a way that wouldn’t occur in their absence. The common word for these excesses is “bubbles.”

  • AI has the potential to be one of the greatest transformational technologies of all time.

  • As I wrote just above, AI is currently the subject of great enthusiasm. If that enthusiasm doesn’t produce a bubble conforming to the historical pattern, that will be a first.

  • Bubbles created in this process usually end in losses for those who fuel them.

  • The losses stem largely from the fact that the technology’s newness renders the extent and timing of its impact unpredictable. This in turn makes it easy to judge companies too positively amid all the enthusiasm and difficult to know which will emerge as winners when the dust settles.

  • There can be no way to participate fully in the potential benefits from the new technology without being exposed to the losses that will arise if the enthusiasm and thus investors’ behavior prove to have been excessive.

  • The use of debt in this process – which the high level of uncertainty usually precluded in past technological revolutions – has the potential to magnify all of the above this time.

Since no one can say definitively whether this is a bubble, I’d advise that no one should go all-in without acknowledging that they face the risk of ruin if things go badly. But by the same token, no one should stay all-out and risk missing out on one of the great technological steps forward. A moderate position, applied with selectivity and prudence, seems like the best approach.

Finally, it’s essential to bear in mind that there are no magic words in investing. These days, people promoting real estate funds say, “Office buildings are so yesterday, but we’re investing in the future through data centers,” whereupon everyone nods in agreement. But data centers can be in shortage or in oversupply, and rental rates can surprise to the upside or the downside. As a result, they can be profitable . . . or not. Intelligent investment in data centers, and thus in AI – like everything else – requires sober, insightful judgment and skillful implementation.


December 9, 2025

P.S.: The following has nothing to do with the financial markets or the question of whether AI is the subject of a bubble. My topic is the impact of AI on society through joblessness and purposelessness. You needn’t read it – that’s why it’s a postscript – but it’s important to me, and I've been looking for a place to say a few words about it.

On November 18, a research note from Barclays described Fed Governor Christopher Waller as having “highlighted how recent stock market enthusiasm around AI has not yet translated into job creation.” This strikes me as paradoxical given my sense that one of AI’s main impacts will be to increase productivity and thus eliminate jobs. That is the source of my concern.

I view AI primarily as an incredible labor-saving device. Joe Davis, Global Chief Economist and Global Head of the Investment Strategy Group at Vanguard, says, “for most jobs – likely four out of five – AI’s impact will result in a mixture of innovation and automation, and could save about 43% of the time people currently spend on their work tasks.” ( Exponential View , September 3)

I find the resulting outlook for employment terrifying. I am enormously concerned about what will happen to the people whose jobs AI renders unnecessary, or who can’t find jobs because of it. The optimists argue that “new jobs have always materialized after past technological advances.” I hope that’ll hold true in the case of AI, but hope isn’t much to hang one’s hat on, and I have trouble figuring out where those jobs will come from. Of course, I’m not much of a futurist or a financial optimist, and that’s why it’s a good thing I shifted from equities to bonds in 1978.

The other thing the optimists say is that “the beneficial impact of AI on productivity will cause a huge acceleration in GDP growth.” Here I have specific quibbles:

  • The change in GDP can be thought of as the change in hours worked times the change in output per hour (aka “productivity”); see the identity sketched after this list. The role of AI in increasing productivity means it will take fewer hours worked – meaning fewer workers – to produce the goods we need.

  • Or, viewed from the other direction, maybe the boom in productivity will mean a lot more goods can be produced with the same amount of labor. But if a lot of jobs are lost to AI, how will people be able to afford the additional goods AI enables to be produced?
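
Stated as a simple identity (a sketch of the decomposition just referenced, not one of the memo’s exhibits):

\[
\text{GDP} = \text{hours worked} \times \text{output per hour}
\quad\Rightarrow\quad
\%\Delta\,\text{GDP} \approx \%\Delta\,\text{hours} + \%\Delta\,\text{productivity}.
\]

If productivity jumps while GDP growth doesn’t accelerate commensurately, hours worked – and therefore jobs – must fall.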

I find it hard to imagine a world in which AI works shoulder-to-shoulder with all the people who are employed today. How can employment not decline? AI is likely to replace large numbers of entry-level workers, people who process paper without applying judgment, and junior lawyers who scour the lawbooks for precedents. Maybe even junior investment analysts who create spreadsheets and compile presentation materials. It’s said that AI can read an MRI better than the average doctor. Driving is one of the most populous professions in America, and driverless vehicles are already arriving; where will all the people who currently drive taxis, limos, buses, and trucks find jobs?

I imagine government’s response will be something called “universal basic income.” The government will simply mail checks to the millions for whom there are no jobs. But the worrier in me finds problems in this, too:

  • Where will the money come from for those checks? The job losses I foresee imply reduced income tax receipts and increased spending on entitlements. This puts a further burden on the declining segment of the population that is working and implies even greater deficits ahead. In this new world, will governments be able to fund ever-increasing deficits?

  • And more importantly, people get a lot more from jobs than just a paycheck. A job gives them a reason to get up in the morning, imparts structure to their day, gives them a productive role in society and self-respect, and presents them with challenges, the overcoming of which provides satisfaction. How will these things be replaced? I worry about large numbers of people receiving subsistence checks and sitting around idle all day. I worry about the correlation between the loss of jobs in mining and manufacturing in recent decades and the incidence of opioid addiction and shortening of lifespans.

And by the way, if we eliminate large numbers of junior lawyers, analysts, and doctors, where will we get the experienced veterans capable of solving serious problems requiring judgment and pattern recognition honed over decades?

What jobs won’t be eliminated? What careers should our children and grandchildren prepare for? Think about the jobs that machines can’t perform. My list starts with plumbers, electricians, and masseurs – physical tasks. Maybe nurses will earn more than doctors because they deliver hands-on care. And what distinguishes the best artists, athletes, doctors, lawyers, and hopefully investors? I think it’s something called talent or insight, which AI might or might not be able to replicate. But how many people at the top of those professions are needed? A past presidential candidate said he would give laptops to everyone who lost their job to offshoring. How many laptop operators do we need?

Finally, I’m concerned that a small number of highly educated multi-billionaires living on the coasts will be viewed as having created technology that puts millions out of work. This promises even more social and political division than we have now, making the world ripe for populist demagoguery.

I’ve seen incredible progress over the course of my lifetime, but in many ways I miss the simpler world I grew up in. I worry that this will be another big one. I get no pleasure from this recitation. Will the optimists please explain why I’m wrong?

Interestingly in this connection, Vanguard’s Joe Davis points out that more Americans are turning 65 in 2025 than in any preceding year, and that approximately 16 million baby boomers will retire between now and 2035. Could AI merely make up for that? There’s an optimistic take for you.

HM

Legal Information and Disclosures

This memorandum expresses the views of the author as of the date indicated and such views are subject to change without notice. Oaktree has no duty or obligation to update the information contained herein. Further, Oaktree makes no representation, and it should not be assumed, that past investment performance is an indication of future results. Moreover, wherever there is the potential for profit there is also the possibility of loss.

This memorandum is being made available for educational purposes only and should not be used for any other purpose. The information contained herein does not constitute and should not be construed as an offering of advisory services or an offer to sell or solicitation to buy any securities or related financial instruments in any jurisdiction. Certain information contained herein concerning economic trends and performance is based on or derived from information provided by independent third-party sources. Oaktree Capital Management, L.P. (“Oaktree”) believes that the sources from which such information has been obtained are reliable; however, it cannot guarantee the accuracy of such information and has not independently verified the accuracy or completeness of such information or the assumptions on which such information is based.

This memorandum, including the information contained herein, may not be copied, reproduced, republished, or posted in whole or in part, in any form without the prior written consent of Oaktree.

© 2025 Oaktree Capital Management, L.P.

Testing and Benchmarking of AI Compilers

Lobsters
www.broune.com
2025-12-10 17:29:29
Comments...
Original Article

This is an in-depth post on bugs and how to prevent them in AI software and AI compilers specifically. I was the software lead for TPUv3 at Google and I’ve worked on a variety of AI compilers and projects across Google, Nvidia, Amazon and Facebook.

Zero is a hard number

In my estimation, XLA has the most comprehensive AI test suite of any ML compiler, so I heartily recommend XLA for mission-critical AI. XLA is used for most Google AI and has been for a decade. XLA is highly reliable. Yet, even XLA has bugs that escape into the wild for customers to encounter. The number of bugs is not zero, not even for XLA.

Anthropic published this report, diagnosing a bug in an XLA op as one of the causes of the Anthropic service giving bad responses to its users for a period of time. We should all commend Anthropic for being this open about the incident. The op in question, approximate top k, was a new op in XLA that evidently didn’t receive as much testing as it needed. This is just one bug, yet look at what resulted. I hope no one received bad medical advice or bad suicide prevention guidance from Anthropic as a result of this XLA bug, but those are among the possibilities. It’s just one bug and people might have died. Which is how you might start to understand why Anthropic took the issue so seriously as to publicly publish a report like that. If you read between the lines, you can tell how upset the person who wrote the report was, even though they are being very professional about it. AI software correctness is serious business.

Consider that your project, whatever it is, is quite unlikely to be as error free and as well tested as XLA is. If you are responsible for an AI development effort, how many situations like this would you like your customers to encounter? Zero. The correct answer is zero. But zero is a hard number. If your project will be widely deployed, you are not going to be able to keep the number of bugs that your users will encounter at zero. In fact, many software engineers might be laughing right now reading this, at the idea of zero bugs as a concept. The only projects with zero bugs reported are projects that don’t have any customers.

It’s similar to asking how many patients a surgeon should kill because they didn’t do their job correctly. It’s a number that one would certainly hope is zero, yet humans make mistakes and a surgeon’s number isn’t going to be zero over a long career. Except if they don’t do any surgeries. You can’t stay at zero. Zero is a place for people with no customers.

I’ve seen software developers discount testing because they know it will not remove all bugs. Zero is impossible, therefore any number will do. This isn’t an exaggeration or something funny. This is what some otherwise highly capable real software professionals really believe. They prefer to think this way because they believe that it will make their jobs easier if they don’t have to write tests - also incorrect, at least in the context of most AI software. It’s a bit like thinking that heavy smoking improves your life because it improves your day. I wouldn’t want surgery from a surgeon that did his work in this way and, for important AI applications, such as what Anthropic offers, I would avoid using AI software that was developed with such a mindset. XLA is excellent on testing and even XLA has issues such as this. “Sounds like XLA is doing a lot on this, if even they can’t do it, why should we try?” I’d suggest you stop thinking like that.

Zero is a hard number. If that by itself leads you to discount the attempt to reduce your project’s number, since we won’t reach zero anyway, I suggest (in fact insist) that there is something wrong with your philosophy of software development.

Planes sometimes crash. Airbags sometimes don’t deploy. Surgeons sometimes kill people by mistake. Rockets with astronauts on them sometimes explode. Trains sometimes derail. Bungee jump cords sometimes snap. Parachutes sometimes fail. Buildings sometimes collapse. Even though zero is a hard number, it matters how often “sometimes” is.

Testing and benchmarking should be high status work

Your AI project needs to view testing as lower-status work in the same way it needs to view fire escapes as optional - which is to say, not at all. Yet this is commonly how testing is viewed. People often do it out of a sense of duty, not because they are required to. Some managers are getting a free pass because they have employees who do things correctly even when nothing obliges them to.

One of the problems with testing is that it is difficult to estimate how well tested a feature or product actually is. It requires good engineering judgement. To have a firm sense of this, you need to be a good engineer, you need to know everything about the feature and you need to inspect the test suite carefully. There are metrics, such as the number of tests or various kinds of code coverage measures, which are OK to use, but they are not replacements for good engineering judgement. So if employee A does poor testing and employee B does good testing, it’s not necessarily going to be obvious that this is the case without looking closely. Both engineers delivered their projects but employee B took longer. Maybe there is something wrong with employee B?

It’s OK, we’ll just count the number of bugs reported later and then we’ll know who did a good job. Employee B took longer. Now we also see more bugs reported in what he did. This employee B is real trouble. Well maybe employee B’s project was also more complex and more important to customers, so they used it more and found more of the few bugs that it did have. Maybe employee A’s project had many more bugs, but nobody used it, so it stayed at zero bugs reported. So counting bugs, as a metric by itself, is not great. It’s not a replacement for good engineering judgement.

If your CEO complains about your project having bugs, he probably just doesn’t know that zero is a hard number. Right? You can explain this to him. If it’s hard for your manager to tell whether proper testing has been done, what chance does your CEO have of figuring this out?

Well, OK, maybe figuring out if testing is good is just hard. But, surely, once a bug has been reported, we have to value our customers’ concerns and fix them quickly (true enough!). Turns out it’s quite easy to tell if your dev team is doing a good job fixing customer bugs - just ask the customers. We should probably reward engineers that are responsible for fixing bugs, because we need engineers to do this and they sometimes don’t want to. In fact, this employee A looks like a real star - he delivers all his projects quickly and he fixes more bugs than anyone else. There is no metric that is a replacement for good engineering judgement.

The CEO may notice that customers are sad about the bugs but loyal customers do in fact appreciate the close relationship that they have with the company’s dev team resulting from these quick bug fixes. Zero is a hard number, but our policy to focus on and reward bug fixing is working.

So, if you can’t tell, this is not a great situation this company is finding itself in, but everything looks reasonable. That’s the problem. Focusing on fixing bugs quickly is a fine idea, but the question is why there are so many bugs to fix in the first place. But how many is too many? I can’t tell you a specific number (well, OK, 37, that’s too many). Nobody can. There is no metric that is a replacement for good engineering judgement.

What happens if an employee notices that we aren’t doing a lot of testing and proposes to do more about testing? You are going to reduce the team’s apparent development velocity for a time if you do this. That doesn’t sound appealing. Worse, suppose this testing turns out to be effective. Then you’ve now revealed that your project in fact had many more bugs than it seemed. Your dev velocity will also now be even slower as you fix the sudden influx of bugs that your own testing revealed. So what does this look like externally to your team? Well, it might look like you first suggested doing less (to do more testing), then your project suddenly has way more bugs, then you did even less than you said you would (to fix more bugs) and, through all this, you’ve delivered nothing that any customer is happy about (for now). You are saying that you are now doing better on bugs, but actually the number of bugs reported against your project (by your own new tests) is above that of other projects in the company. So your testing effort is a success on its own, but what do things look like externally? This situation certainly calls for some careful management of perceptions.

I’m not an engineer specializing in testing, but testing is one of the underdone aspects of many projects I’ve been on, in my opinion, so I’ve had occasion to work quite a bit on testing because I thought it needed improvement. So, in fact, I have personally set off such a cascade of events in my career as described above. This is a direct quote from the manager at the time: “you found more bugs in the past two weeks than our entire team did in the past year”. This was a team with a subteam doing just testing (it wasn’t their fault - they were doing their jobs in the way that they were told to do it).

I ended the story at its low point. What happened immediately after this darker chapter is that customers were still reporting bugs, but now, more often than not, the answer was: “This is already fixed in the latest version, please update.” A while later, the number of bugs reported fell dramatically. The team had been spending half their time fixing customer bugs before any of this started, now it was much less than that. So development velocity and morale were significantly improved. Bugs are far faster and easier to diagnose and fix if you have small tests to find them up front, instead of having to collaborate with a customer to figure out what is wrong later. If there is a bug somewhere in a large customer AI model, this can be very challenging and time-consuming to diagnose (it can take weeks of work to diagnose one bug). That’s the primary source of the quite significant speed-up in team velocity that occurred. Testing improves team velocity. That can just take some time to materialize - both on the way up and on the way down.

There was also another source of improved team velocity. I didn’t just write a bunch of tests myself - though I also did that. The more important thing that I did was to improve the testing infrastructure of the project. If testing is lower status work, your top engineers may not look that much at what they can do to improve testing and its infrastructure. Especially not if you then have a subteam that does the testing instead of the people writing the features doing the testing. That subteam may be expected to take instruction on what and how to test, not to improve testing infrastructure. So then no one is expected to improve testing infrastructure.

Testing AI software isn’t easy or simple and neither is testing infrastructure. What I did was, first, to reduce the amount of boilerplate involved in writing a test. So, and this is not an exaggeration, you could write a test in 3 simple lines that would have taken 30+ more complex lines before, and this improvement applied across tests. This wasn’t easy to do, it required careful API work, and the effect was more significant than it may sound like since it makes people more keen to write many tests. So it doesn’t just save you some time when writing the test, it improves testing in other ways, too. Previously, often a file would contain only a single test. Now, files could contain many tests because they were not so big. Even this is a significant improvement - it’s just easier to keep track of less code.
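
To make that concrete, here is a minimal sketch (in Python, with hypothetical names - this is not the project’s actual API) of the kind of helper that lets an op test shrink to a few lines: it runs an op through the backend under test, assumed here to be reachable via a compile_and_run hook, and compares the result against a NumPy reference.

import numpy as np

# Trusted reference implementations for a few ops; NumPy serves as the oracle.
_REFERENCE_OPS = {
    "add": np.add,
    "multiply": np.multiply,
    "tanh": np.tanh,
}

def check_op(compile_and_run, op_name, *inputs, rtol=1e-5, atol=1e-6):
    """Run one op through the backend under test and compare it against the reference."""
    got = compile_and_run(op_name, *inputs)   # hypothetical hook into the compiler under test
    want = _REFERENCE_OPS[op_name](*inputs)   # reference result computed with NumPy
    np.testing.assert_allclose(got, want, rtol=rtol, atol=atol)

# With a helper like this, a whole test is a few lines:
def test_add_broadcasts(compile_and_run):
    x = np.random.rand(4, 1).astype(np.float32)
    y = np.random.rand(1, 8).astype(np.float32)
    check_op(compile_and_run, "add", x, y)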

I also wrote a fuzzer, which found a bunch of bugs. It was based on taking existing tests and automatically making them more complicated in various ways that didn’t change the result of the test. This was very successful, acting as a force multiplier on the existing number of tests, and I would recommend that approach for any AI compiler. So you write one test, but behind the scenes it turns into 20 tests. That’s a lot more productive.
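
Here is a sketch of that idea (hypothetical code again, assuming a compile_fn hook that compiles and runs a Python-level computation): each “complication” rewrites the computation in a way that should not change its mathematical result, so one hand-written test fans out into several.

import numpy as np

# Result-preserving complications: each wraps a computation f so that the wrapped
# version should produce the same answer as f itself.
COMPLICATIONS = {
    "baseline": lambda f: f,
    "add_zero": lambda f: (lambda x: f(x + np.zeros_like(x))),
    "double_transpose": lambda f: (lambda x: f(x.T.T)),
    "reshape_roundtrip": lambda f: (lambda x: f(x.reshape(-1).reshape(x.shape))),
}

def fuzz_test(compile_fn, computation, example_input, rtol=1e-5):
    """Run every complication of one test through the compiler (via the assumed
    compile_fn hook) and check that all answers agree with the plain baseline."""
    want = compile_fn(computation, example_input)
    for name, wrap in COMPLICATIONS.items():
        got = compile_fn(wrap(computation), example_input)
        np.testing.assert_allclose(got, want, rtol=rtol, err_msg=name)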

This work was at first somewhat hard to sell as a positive. During the dark chapter period, which lasted a few weeks, I had caused everyone to now have to spend almost all their time fixing bugs, which is usually a software engineer’s least favorite activity. The view of this effort was much improved once we got past the dark chapter period. The well of bugs ran dry and things were looking up.

What did the testing subteam think? They were actually quite happy. If you write tests for a living, your job is going to be more fun if you can write many tests quickly. It’s also more fun if you can write one test and then a fuzzer automatically turns it into 20 tests and then you find many more bugs. You can probably see how the status of people involved with testing rises if they find more bugs. Which they will if given proper tools. I think it also helped morale that I was talking about their work as something very important, which of course it was.

This all led to the testing subteam having some extra time due to the now increased productivity of writing tests. They had previously been expected to use the project’s APIs to write tests, but not to inspect how the code inside the project worked - it was quite complex. I proposed that the testing subteam spend some of their now freed up time to do a series of improvements to the project on the inside of the code, primarily long overdue refactorings, so that they became familiar with the internals of the project, too. I also suggested that they write a reference backend for the AI compiler, which, apart from such a backend being yet another boost to testing productivity, required them to understand how to implement every op in the whole compiler (as opposed to testing each op from the outside). It’s easier to test a project if you know how it works on the inside. It turned out that they were perfectly capable of doing such work, they just hadn’t been expected to do such work previously. I would have just removed the entire notion of a test subteam and mixed this team in with the rest of the team, though we didn’t do that.
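
For illustration only, here is a toy version of what such a reference backend can look like (a sketch, not the team’s actual code): a plain interpreter that walks a topologically ordered op list and evaluates each op with NumPy, so any compiled result can be checked against it.

import numpy as np

# One straightforward reference implementation per op.
OP_IMPLS = {
    "add": np.add,
    "matmul": np.matmul,
    "tanh": np.tanh,
    "reduce_sum": lambda x, axis: np.sum(x, axis=axis),
}

def reference_evaluate(graph, feeds):
    """graph: list of (result, op, operand_names, attrs) in topological order;
    feeds: dict mapping input names to arrays. Returns all computed values."""
    env = dict(feeds)
    for result, op, operands, attrs in graph:
        env[result] = OP_IMPLS[op](*(env[name] for name in operands), **attrs)
    return env

# Example: y = tanh(a @ b + c)
graph = [
    ("t0", "matmul", ("a", "b"), {}),
    ("t1", "add", ("t0", "c"), {}),
    ("y", "tanh", ("t1",), {}),
]
feeds = {"a": np.ones((2, 3)), "b": np.ones((3, 4)), "c": np.zeros((2, 4))}
print(reference_evaluate(graph, feeds)["y"])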

Was I expected or hired to do this kind of work? Absolutely not, though I didn’t have trouble justifying the time I spent on this after it got going. The whole thing took around 2 months of my time. It was successful enough that the testing approach that I used was disseminated more widely within the company through a company-specific avenue for such things. Don’t misunderstand this story - it was a great company and a strong team.

What about safety certifications? What if this team had been subjected to a safety certification process, maybe that would have led to the same changes that I made? No. I’ve been involved in such a process and nothing of what I did here would have been the result of a safety certification process. So you can perhaps see why I’m skeptical of safety certifications, even though they may indeed have some legitimate positive effects. I think that they are more a legal tool than an engineering tool. I suppose that legal tools can be important, too.

Maybe you think this is a story where I say that I’m a great engineer. Well, I do like to think so, yes, but you might have missed the bigger picture here. This was a story about the importance of testing and testing infrastructure and some of the challenges that get in the way. You underestimate these areas at your peril. I’ve never joined an AI software team that didn’t have some need for improvement in this area in my opinion (which partly explains how I was so useful on this - wasn’t my first rodeo). I think the whole AI industry is underestimating the importance of testing and benchmarking, not in one company or in one place but everywhere.

In the previous section, there is a perspective of how even a single bug can cause deaths and public embarrassment for you and your customers. In this section, we are talking about a high volume of bugs. So it seems that there is some kind of mismatch here? Yes, there is. That’s what I’m saying. You’ll find this mismatch everywhere in the AI industry.

If you still think that testing is or should be lower status work, then maybe read this story again. I have to say that I disagree with you. Testing AI software is not easy and it matters how you do it.

Kinds of AI software bugs and their impact

What can the impact of bugs in AI software be? There are different levels of AI software bugs:

No service bug A no service bug is when the user consistently gets an internal error or the system is obviously broken in some other way. These bugs are obvious and bothersome, but they are the least serious kind of error. A self driving system with a no service bug like this will not be released to the world until the bug is fixed, so it’s not that serious. It’s just bothersome.

Intermittent no service bug Like a no service bug, but it only happens some of the time. Maybe rarely. This is much more of a problem, since such bugs take time to be noticed, so they impact customers to a greater extent. For example, a self driving system with an intermittent no service bug might be released to the public, if the error does not occur during system testing, and then cause deaths in the wild if the bug does occur there.

Correctness bug With this kind of bug, the system isn’t obviously broken, and there are no errors, but what is happening is not correct. This is a very serious kind of bug in the context of AI software. These bugs can be extremely hard to diagnose and they can go unnoticed for extended periods of time. A self driving system with such a bug will probably not be released to the public, since the bug will likely be detected during testing, but that isn’t guaranteed.

Intermittent correctness bug This is the worst kind of bug and it is the kind of bug that Anthropic was dealing with in their public report. You can see how such a bug can escape testing efforts and go unnoticed for a long time even while it keeps causing serious problems. A self driving system with such a bug may well be released to the public.

As a customer, you might notice that no service bugs are not that serious for your business. Once something works, it’ll keep working. So you can deal with that. However, I would suggest considering that these different kinds of bugs are correlated. An AI system with many bugs of one kind is likely to have many bugs of the other kinds, too. So I would not take no service bugs lightly, even though their direct impact is limited. They are a red flag that should have you worried about encountering other bugs that may well impact your business more seriously.

Note that here we are not talking about AI that makes mistakes. That’s different. An AI mistake is when the AI functions the way it’s supposed to, but it just can’t figure out the right thing to do. This is a problem for AI researchers to deal with - they need to come up with a better transformer or use a better dataset to train the AI. That’s not what we are talking about here. We are instead talking about situations where the AI would do the right thing if the software that realizes its internal computations were functioning correctly, but that software is not functioning correctly. That’s an AI software bug (or potentially hardware bug), not an AI mistake. No matter how well an AI is trained or structured, it might still do the wrong thing if there is a bug in the underlying software that runs it. So buggy AI and wrong AI are different.

What can the impact of AI bugs be?

AI assistants AI assistants, such as Claude, ChatGPT or Gemini, are used for advice in all areas of human activity, including suicide prevention and medical advice. There is really no limit to what might result if an AI assistant uses buggy software, since there is no limit to the potential actions that the wrong person might take if given the wrong advice at the wrong time from a source that they trust.

Is it really feasible that an AI assistant could start saying evil things due to a bug? Consider that one of the possibilities for an intermittent correctness bug is an intermittent sign error. Not a likely bug, but it is a perfectly possible one. Be aware that AIs internally contain many vectors, and an AI model may well have directions of such vectors that correspond to various kinds of evil behavior. An AI assistant will then of course have been trained to avoid such vectors. However, if there is a sign error, you might flip a vector or one of its components from a direction of “goodness” to a direction of “evilness” with a corresponding flip in behavior of the AI. So an intermittent sign error bug could in fact lead to an intermittently evil AI assistant that’s randomly good most of the time and optimizing towards evil when you aren’t looking. So buggy AI can potentially be quite a bit more serious than simply somewhat wrong AI.

Medical diagnosis AI is today used for medical diagnosis, such as finding cancer on a mammogram. In such cases, currently, AI is usually used to support human judgement, so a faulty AI verdict may be corrected by a human, but humans make mistakes, too, so that isn’t guaranteed. If a hospital that I use relies on AI in its diagnosis procedures, even if only in an advisory capacity, I would very much appreciate it if they avoided using buggy AI software.

Self Driving There are already self driving cars on the road and people fall asleep while “driving” them even though they aren’t supposed to. In the future, there will be fully self driving cars on the roads where you are allowed to fall asleep, or where there is no human occupant at all - such cars are already on the road in a few areas. AI software bugs here can of course lead to traffic accidents and deaths.


These are three particularly serious applications, but there are many other applications of AI where bugs are still serious, even if not quite that serious. Ask your favorite systolic array, I mean LLM, and it’ll give you a long list of such applications.

Ah, but it’s OK, none of our customers are planning on using our product in a place where it might kill someone. Well, not right now and not as far as you know. If your AI software becomes popular, you will never know all the places it will be used. And, in any case, if a big order comes in from a medical diagnosis company, are you going to tell them that they shouldn’t use your product because it’s buggy? Probably not. You’ll take the order and hope for the best.

Maybe that medical company will require a safety certification process, but as I’ve said, these certification processes don’t assure what it sounds like they assure. You think “certified safe” software doesn’t have any serious bugs? Zero is a hard number. So the question is how effective the certification process is at finding bugs. Somewhat effective. Much of a safety certification involves making a list of all possible bugs you might have and then doing paperwork to document that you have tested for each of them. If you are honestly bad at coming up with possible bugs in your software, then your certification will be easier to complete. If I am to receive surgery from an AI robot, what I want to know is that the people who created it were conscientious and competent. That is more powerful than any certification. Of course, I won’t object if they also have a certification.

I suggest you take software correctness and testing seriously. I also suggest that you prefer preventing bugs from escaping your software development process over focusing on fixing bugs quickly after customers report them - though of course, if a customer does report a bug, you do need to fix it, and preferably quickly.

If you are buying a lot of AI hardware, you might want to ask your vendor about how many bugs they’ve ever had escape to their customers (they won’t know, or if they do don’t expect the number to be small, but watch how they respond) and what their testing story for their hardware and software is. You may have a hard time evaluating the answer, but if you get the sense that they aren’t taking that side of the business seriously, that’s something I’d be concerned about in your place. If they only emphasize their bug fixing turn-around time, and don’t have answers on their efforts on preventing bugs (e.g. testing), that’s maybe not great. Though maybe no one else ever asked before, so it’s OK if the sales team needs to go back to their dev team to ask. If they don’t take testing seriously, what that really means is that they aren’t taking your interests as a customer seriously. At least that’s how I would view it in your place.

AI hardware and software infrastructure

The first thing to know is that you need a large server farm to run your tests if you will be developing large-scale AI software. During TPUv2 development, our XLA testing fleet of TPUs was so powerful that it would have been in the top 5 of world supercomputers at the time, if we ignored that such lists require higher precision computations than what TPUs do. To be fair, this happened because TPUs are incredibly fast, so we had many TPUs but not as many as that makes it sound. Even though we had a lot of TPUs available for testing XLA, we still would have liked more. This is because you need many tests and ideally these should all run before every change to the software repository, so that bugs never even make it into the repository. This can require a lot of testing hardware.

It is quite important how long it takes to make a change to the code, compile (if in a compiled language) and run the tests. You will want a parallelized compilation flow where compilation happens in a distributed way rather than locally, since otherwise it will be slow. The same for testing - you will want a distributed system where tests can be run in parallel across many machines, not just locally. Critically, you will want this to be easy to do from the command line. At Google (and externally if you use Bazel), you can compile and run all relevant tests by simply typing “bazel test” in your code directory. It will quickly compile and test in a distributed fashion automatically from that one invocation. If your workflow is not as good as that, I don’t know why you wouldn’t improve it until it is. And consider using Bazel for building and testing; it works well.

A particularly bad situation here is if a developer needs to book a machine to run tests on, has to log into it and then maybe has to install some things before running tests. Repeatedly (one-off is maybe OK). Don’t do it that way. Just use Bazel - it allows you to declare that a test requires a specific kind of hardware and it will make it happen (well, as long as you provide that kind of hardware to it, of course). At Google, you can type “bazel test my_test” and if my_test is set up that way, this will run in parallel across all current kinds of TPUs and a few types of GPUs, involving reaching out to many different machines, each with their own kind of hardware. It happens in seconds. You can also tell it to run only on a specific kind of hardware. You can have that where you work, too, if you use Bazel.

[[[ Irrelevant aside Why call it “Bazel”? Seems like an odd name. Well, internally at Google, this system has always been called “Blaze” for “blazing fast”. When it was open sourced, I guess they wanted to distinguish the open version from the internal version, so they called the open version “Bazel”. It’s a bit of an odd name, but you can see the connection to “Blaze”. ]]]

The modify, compile, test cycle time is important because it has a strong effect on developer productivity. If your developers’ time isn’t expensive and valuable, you probably didn’t hire the right team to work on AI software. Proper infrastructure is a large force multiplier on your development team’s efforts.

If it takes too long to run your tests (more than a minute is already a long time, I think), buy more hardware. Keep doing that until you can’t afford it anymore. If that never happens, you have either a very large budget (unlikely), or your team isn’t adding enough tests (very likely). If you are somewhat into the development process and you as a manager aren’t getting a request for more funds to buy a larger fleet of test machines, something is wrong. You should figure out what’s going on. Perhaps your team doesn’t feel empowered to ask for what they need. Perhaps they didn’t write any tests. Something is wrong. They should eventually be complaining that they don’t have enough test hardware no matter how much test hardware they already have.

You should buy a lot of hardware, but, eventually, you won’t have more money for more hardware. Even at Google that’s how it was eventually (though we did get a world top 5 supercomputer out of it, so hey, that was pretty nice). So what then? The most obvious solution is to ask people to stop adding so many tests. I’ve seen this proposed and used as the primary solution. That’s bad. ABAT - Always Be Adding Tests. But... then how do we solve the problem that it will take longer and longer to run all these tests? That could tank your dev team productivity, so that’s no good. What to do?

The first thing to do is to optimize your tests. The easiest way to optimize code is to run a profiler and generate a flame graph (if profiling and generating a flame graph takes more than a single command line invocation in your team’s setup, why not make a script for it?). Tests are code. So profile your tests. This will be very effective if you’ve been underway for a while - you are surely wasting time doing many silly things in your testing code. You might get a 10x this way, it can happen. A common first discovery upon doing this on an AI compiler is that generating large random arrays, commonly used to test AI software, takes way longer than you’d think. So cache those and reuse them. That alone can be a large speed-up.
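
To make the caching idea concrete, here is a minimal sketch in C++ (the function name and the flat float-vector representation are just illustrative choices, not any particular framework’s API):

#include <cstdint>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Returns a cached random array for a given (size, seed) pair, generating it
// only on first use. Not thread-safe as written - add a mutex if your tests
// run in parallel within one process.
const std::vector<float>& CachedRandomArray(int64_t num_elements, uint32_t seed) {
  static std::map<std::pair<int64_t, uint32_t>, std::vector<float>> cache;
  const auto key = std::make_pair(num_elements, seed);
  auto it = cache.find(key);
  if (it == cache.end()) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    std::vector<float> data(num_elements);
    for (float& x : data) x = dist(rng);
    it = cache.emplace(key, std::move(data)).first;
  }
  return it->second;
}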

If you profile your tests, you might also discover that your actual product is slow in some cases where you didn’t expect it to be. Congratulations, you just found a performance bug in your software and you will also speed up your tests by fixing it. For example, if you make AI compiler tests with very large graphs, as you should, then you might well find that your AI compiler is very slow for such cases - I once discovered an O(N^6) algorithm in the compiler I was working on this way. That’s something to fix. It’ll speed up your tests and please your customers if they use large graphs. If you do this work, of course document numbers on the impact of your work for your performance evaluation / promotion case in the future.

While you are profiling your tests, pay attention to the utilization of the AI hardware that you are running your tests on. The utilization of your AI HW during the testing process will usually be very low, e.g. less than 1%. This happens because many tests use a lot of CPU cycles to compile a kernel, prepare inputs and inspect outputs for correctness. The actual kernel that runs on the AI HW is usually completed very quickly. AI HW is very fast - that’s the whole point of AI HW. So your tests are likely mostly CPU bound, running on a $200 CPU, while your $10k accelerator (if you can find one that cheap) is 99% idle. So in this case you can buy twice as many $10k accelerators to double your testing capacity. That’s industry standard, I’m not joking, it’s not something funny, I’m giving you serious information without exaggeration here. This happens largely because teams don’t realize that the AI HW is poorly utilized during testing (Profile my tests? Why would I do that???). But, even when teams do realize the issue, there might not be the will to fix it. I’ve seen that as well.


The trouble is that solving low utilization of AI HW during testing is somewhat tricky. Each test needs to use the device, so a test will commonly acquire exclusive use of the device, run the test and then release the device. So if you have one device, running the tests in parallel on the CPU doesn’t help since they are serialized on acquiring the device, even though the device is mostly idle while it is acquired.

What you need is an improved way to write tests, and improved testing infrastructure, such that, naturally, and without special efforts in any of the tests, the testing infrastructure automatically sets it up so that a test will do as much work as it can before acquiring the device (prepare inputs and compile the test kernel), then it quickly acquires the device, transfers inputs to the device, runs the (already compiled) test kernel on the device, transfers the output off the device and immediately releases the device. Only after the device is released, the test can then inspect the output for correctness. That’s how to do it - you can even pipeline these steps for further optimization, but that’s less critical to do. Your tests will probably still be bound on the CPU, with a significantly idle device, but if you have 30 cores on the CPU, this might give you a ~30x improvement on your testing throughput. So that’s nice. Now you can write 30x as many tests before you have a problem with the speed of testing again. ABAT - Always Be Adding Tests.
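
Here is a minimal sketch of that test lifecycle in C++. All the types are empty stand-ins (Graph, Compiler, DeviceLease and so on are illustrative names, not a real API); the point is purely the ordering of the steps relative to acquiring and releasing the device:

struct Graph {};
struct HostBuffers {};
struct DeviceBuffers {};
struct CompiledKernel {};
struct Compiler { CompiledKernel Compile(const Graph&) { return {}; } };
struct DeviceLease {
  DeviceBuffers Upload(const HostBuffers&) { return {}; }
  DeviceBuffers AllocateOutputs(const CompiledKernel&) { return {}; }
  void Run(const CompiledKernel&, const DeviceBuffers&, const DeviceBuffers&) {}
  void WaitUntilDone() {}
  HostBuffers Download(const DeviceBuffers&) { return {}; }
};
DeviceLease AcquireSharedTestDevice() { return {}; }  // would block on a shared lock
void ExpectAllClose(const HostBuffers&, const HostBuffers&) {}

void RunCompilerTest(const Graph& graph, const HostBuffers& inputs,
                     const HostBuffers& expected) {
  // CPU-heavy work first, with no device held: build and compile the kernel.
  CompiledKernel kernel = Compiler().Compile(graph);

  HostBuffers actual;
  {
    DeviceLease lease = AcquireSharedTestDevice();  // exclusive, but brief
    DeviceBuffers in = lease.Upload(inputs);
    DeviceBuffers out = lease.AllocateOutputs(kernel);
    lease.Run(kernel, in, out);
    lease.WaitUntilDone();
    actual = lease.Download(out);
  }  // device released here; the next test can grab it

  // CPU-heavy again (potentially a big comparison), device already free.
  ExpectAllClose(expected, actual);
}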

Won’t multiplexing tests on the AI HW like this lead to race conditions and flaky tests? If you do it wrong, yes. If you do it right, no. But worrying about flaky tests (tests that fail but only sometimes) is good. Flaky tests are bad. Make sure you don’t have flaky tests.

Won’t this make every test more complicated to write? If you do it wrong, it will. But if you hide all these steps behind testing infrastructure and good testing APIs, it will be no more complicated to write a test with this setup than without it. I suggest doing it that way.

Anyway, this is starting to sound like somewhat complicated software development, isn’t it? Profiling, optimizing, distributing across machines and parallelizing within machines even perhaps software pipelining. Yes, that’s right. Testing AI software isn’t a lightweight easy activity.

What do good testing APIs look like for AI software? This is what a test for 1 + 1 = 2 should look like for an AI compiler:

ExpectEq(AddInput(1) + AddInput(1), 2)

This one line does a lot of stuff:

  1. Create an AI model graph.

  2. Add two scalar inputs to the AI model graph.

  3. Add an addition node of the two inputs to the AI model graph.

  4. Add an output node to the AI model graph connected to the addition node.

  5. Compile the model graph to a kernel that can run on the device.

  6. Create two arrays on the host, each containing a scalar 1.

  7. Acquire the device, so that other tests cannot use the device.

  8. Transfer the compiled binary to the device.

  9. Transfer both inputs to the device.

  10. Reserve memory to hold the output on the device.

  11. Run the binary on the device, passing in the addresses of both inputs and the output.

  12. Wait for the binary to complete.

  13. Transfer the output scalar from the device back to the host.

  14. Release the device, so that other tests can use the device.

  15. Compare the transferred output array to the expected result, in this case a scalar 2.

  16. Report to the testing infrastructure whether the test succeeded or failed.

  17. Deallocate memory for the AI model graph, binaries, inputs, outputs etc.


For some AI compilers, this same test will require more than 17 lines of code, and from the list above, you can maybe tell how such a thing is possible. But, in fact, you can create AI testing APIs so that the above one line does all of that. There is enough information on that one line to do all of these steps. If your AI testing infrastructure isn’t as nice as this, I suggest that you work on it until it is. You’ll also notice that the idea to acquire the device only when necessary is already baked into this API (this way you can also later add support for pipelining the transfers between tests without changing any of the tests).
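
To show that there really is enough information on that one line, here is a toy sketch of what could sit behind such an API. The names (TestValue, CurrentGraph and so on) are my own illustrative inventions, not XLA’s actual test classes, and the “device run” is faked with a trivial interpreter so the sketch is complete; real infrastructure would compile and run on the device as in the numbered list above:

#include <cstdio>
#include <vector>

enum class OpKind { kInput, kAdd };
struct Node { OpKind kind; int lhs; int rhs; float value; };

// A toy graph: just enough to show how a one-line test can record ops and
// inputs behind the scenes.
struct TestGraph {
  std::vector<Node> nodes;
  int AddInputNode(float v) { nodes.push_back({OpKind::kInput, -1, -1, v}); return (int)nodes.size() - 1; }
  int AddAddNode(int a, int b) { nodes.push_back({OpKind::kAdd, a, b, 0.0f}); return (int)nodes.size() - 1; }
  float Evaluate(int id) const {
    const Node& n = nodes[id];
    return n.kind == OpKind::kInput ? n.value : Evaluate(n.lhs) + Evaluate(n.rhs);
  }
};

TestGraph& CurrentGraph() { static TestGraph graph; return graph; }

struct TestValue { int node; };

TestValue AddInput(float v) { return {CurrentGraph().AddInputNode(v)}; }
TestValue operator+(TestValue a, TestValue b) {
  return {CurrentGraph().AddAddNode(a.node, b.node)};
}

void ExpectEq(TestValue result, float expected) {
  // Real infrastructure would: compile the graph, acquire the device, upload
  // inputs, run, download the output, release the device, then compare.
  // Toy version: just interpret the recorded graph.
  const float actual = CurrentGraph().Evaluate(result.node);
  if (actual != expected) std::printf("FAIL: got %f, want %f\n", actual, expected);
}

// Usage, as in the one-liner above:
//   ExpectEq(AddInput(1) + AddInput(1), 2);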

I’d also suggest adding an automated fuzzer to expand each test into multiple other more complicated tests. So from one line you can generate many tests. You’ll also want a reference backend, so that a test like the following still checks that 1 + 1 = 2 without you having to spell out the expected value. The correct output is inferred on the CPU using the reference backend, which is simple and therefore (more often) correct.

ExpectEq(AddInput(1) + AddInput(1))

This is not useful for a trivial case like this, since it’s not a problem to say that the answer should be 2, but it’s very useful for cases where the output is very large (e.g. a million numbers). Your fuzzer might also make use of such a reference backend.

Your reference backend will be used for large outputs. And the CPU reference backend should be implemented simply, so that you can have more confidence that it is correct, which means you cannot (should not) use complex optimizations inside it. So if you look at your test profile (you are profiling your tests, right?), you are going to see after a while that the reference backend will be by far the biggest factor slowing down your testing. What to do? This one is a bit tricky. You need a simple reference backend, so maybe buy faster/more CPUs. What else? Well, here’s an idea that I think is good, even if it is a bit of trouble:

If the output from your device was previously correct upon comparison with the reference backend, record a stable 256 bit hash code of the previous correct device output. If the hash code of the current output from the device is equal to that which was recorded as correct previously, then you can mark the test as passing without running the reference backend again. This way the reference backend will rarely be run (most code changes do not change the output of most tests), yet you didn’t lose any test coverage.

You might also leverage this into a nightly determinism test: if you run the tests twice with no code change, the outputs should be bitwise identical. If they aren’t, that’s a determinism bug.

There are some complications with this idea, like how to store and retrieve the hash codes, but I think it’s the best practical solution. Better than just having unnecessarily slow tests - though definitely don’t do this until the test profile shows that the reference backend is starting to be a problem. You will want to rerun everything with the reference backend once nightly to catch, in a timely manner, some very rare situations where you can get false negatives from this approach (e.g. the reference backend changed in a way that would now flag the old output, or, since hashes are looked up by test name, the test changed but the device output didn’t change even though it should have). Couldn’t you store the whole output instead of a stable hash code? Yes, but these outputs can be large and would consume bandwidth to transfer, so you might not want to do that, though you could. In that case you might as well cache the reference backend outputs instead. You can’t cache reference backend output hash codes since the device output comparison to the reference backend is usually approximately equal, not exactly equal, for reasons such as floating point reassociation.
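
Here is a minimal sketch of the hash shortcut, with placeholder pieces (the hash function is a simple stand-in; in practice you would want a stable 256 bit hash and you would persist the recorded hashes across test runs):

#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Approximate comparison; device and reference outputs usually differ slightly
// (floating point reassociation and similar), so no bitwise equality here.
bool AllClose(const std::vector<float>& a, const std::vector<float>& b,
              float tolerance = 1e-4f) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (std::fabs(a[i] - b[i]) > tolerance) return false;
  }
  return true;
}

// Placeholder hash over the exact output bits (FNV-1a style).
std::uint64_t HashOf(const std::vector<float>& values) {
  std::uint64_t hash = 1469598103934665603ull;
  for (float value : values) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    hash = (hash ^ bits) * 1099511628211ull;
  }
  return hash;
}

// Runs the (slow) reference backend only when the device output's hash differs
// from the hash recorded on the last verified-good run of this test.
bool OutputMatchesReference(
    const std::string& test_name, const std::vector<float>& device_output,
    const std::function<std::vector<float>()>& run_reference_backend,
    std::map<std::string, std::uint64_t>& known_good_hashes) {
  const std::uint64_t current = HashOf(device_output);
  const auto it = known_good_hashes.find(test_name);
  if (it != known_good_hashes.end() && it->second == current) return true;  // fast path
  if (AllClose(device_output, run_reference_backend())) {  // slow path
    known_good_hashes[test_name] = current;
    return true;
  }
  return false;
}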

Most of your tests are going to be smallish unit tests, that you can run before every code change, but you are also going to need larger-scale tests that take too long to run before every code change. E.g. if your hardware is for training, one of your tests should be to train e.g. a transformer from scratch and the test criterion is that the converged accuracy of the trained model at a given training step is within expectation. If your hardware is for inference, you should be looking at accuracy end-to-end of e.g. a pretrained transformer. You might want to run this for multiple kinds of models. If you aren’t running such tests daily (nightly) or at least weekly in an automated way, that’s a problem. You’ll also want automated tracking and graphing of how model accuracy is changing over time so you can spot regressions (or improvements).


If tests take more than a few minutes to run, people will sometimes skip them when they think it’s probably OK, occasionally leading to tests that fail in the shared code repository. This is a big problem for everybody every time it happens, so people get understandably angry about this. If running the tests is slow or somehow troublesome in any other way (“just follow this 10 step process for booking a machine that you run the tests on”), the real root cause for this situation is your testing infrastructure. If I hear of a situation where this keeps happening, I already know that their tests are slow or bothersome to run in some other way. That’s why it happens. I’ve seen managers and infra teams be very angry at engineers for skipping a 2 hour (!!!) test run, causing test regressions to be submitted in rare cases, when the real underlying trouble is that no one bothered to ever profile and optimize the tests. And no one even asked for more test hardware, either. You need good testing infrastructure and a bunch of test hardware.

In the past two paragraphs, I wrote that you need tests that take too long to run before every code change, and I also wrote that if you get regressions into your code repository, that’s because your tests are too slow to run. Seems like a contradiction - it has to be fast, but also you need slow tests. How to reconcile these two ideas? The consistent rule here is that if a test is very slow, it might get regressed because it won’t and shouldn’t be run all the time, only sometimes (e.g. nightly). You’ll just have to deal with that as a permanent situation, is what I’m saying. However, ideally, you will have many fast small tests that test your entire product, so that you only very rarely have cases where a slow long-running nightly test fails but none of your fast per-code-change tests fail. That’s how you deal with this, by reducing the rate of such occurrences to be low enough that it is tolerable. If slow-test regressions are happening often enough to be bothersome, then you don’t have good enough coverage from your fast tests.

Here is something that I would suggest to avoid: It is possible to save significant test load by having everyone submit their code changes without running tests, or running just a few tests, and then only running the tests every N’th (e.g. 10th) code change on the shared code repository. In case of a failure, a code change is then automatically identified (bisected) and reverted. This is bad for developer productivity and also morale (“who submitted a regression again?”). If you don’t have any budget left (did you ask?) and you already profiled and optimized the tests (did you, though?), maybe you have to do it that way, but it’s not good. I wouldn’t do that.

You might also want to run your test suite together with Valgrind and the various LLVM sanitizer modes and with your favorite static analysis and coverage tools and, these days, AI analysis. This doesn’t have to happen often - some of the modes are very slow - but it’s imprudent never to do it. Once monthly or more often might make sense. Certainly always before releases. If you end up with e.g. 30 such tools and modes, you can run one of them each night, to avoid having to deal with the combined findings of 30 such tools at the same time.

XLA has an excellent open source test suite. If you are developing software for your own AI HW, you might consider adding an XLA backend (not that hard) so that you can run XLA’s test suite against your software and hardware. You might consider doing so to access XLA’s test suite even if you aren’t interested in using XLA for anything else. Though XLA is a fine choice. It’s how Google does AI.

Benchmarking infrastructure

Everyone on your team needs easy access to running benchmarks. Any change, even one not intended to affect performance, can change it for better or for worse, and it’s important for your team to be able to tell what’s going on here. It is also much easier and more motivating to do performance work if you can right away see that your change was a positive +X% for some X.

You need benchmarks both for how fast your compiler is at compiling and for how fast the generated binaries are at running an AI model. A code change can make one model faster and another model slower, so it’s important to measure across a variety of models - you can’t just look at a single model. Unless your project only supports a single model - which is starting to perhaps be a possible situation with everything being a transformer. You of course want your customers’ models to be included in your set of benchmarks, so that your team knows right away if you are making your customers’ models slower. Regressing customer models is easy to do by mistake, so it’s important to be able to tell if you did it. What if your customers want to keep their models secret from you? Well, that’s OK, but then they might get regressed (run slower). That’s the deal.

One thing that any team eventually learns over time is that some powerful optimizations that improve many models overall might still regress some models and it isn’t always practical to fix the regressions (sometimes it is, of course). So you are going to have to accept regressions sometimes. It’s a case by case judgement call. Requiring no regressions ever is not a practical policy, not even of customer models. What you’ll want to make an effort to avoid is large regressions in customer models that make their way to that customer and then there is nothing the customer can do about it. Keep in mind the possibility that if your team did many separate optimizations for a release, then perhaps all your customers’ models are improved, even if some of those individual optimizations caused regressions on their own. Don’t let fear of waves prevent a rising tide from lifting all boats (this analogy breaks down because tides are also cyclical, but don’t think about that).

It is common, maybe even industry standard, that running benchmarks is an involved process that requires a specific team member, who knows how to do it, to spend some time on it each time. So people will ask that team member to benchmark their code changes and, if the team member can be bothered that day, you might get the result the next day. The team member will probably only run one or two benchmarks, because otherwise it’s too much trouble. This is normal. It’s also not a good situation. Instead, you want a benchmark report like this to be reported automatically if you simply write “run_benchmarks” on the command line:

Performance report for change XYZ

Benchmark           Before    After    Speedup (1.0 = neutral)
model_benchmark_A   120 ms    60 ms    2.00
model_benchmark_B   ...
...
Geomean                                1.42


Geomean means geometric mean, which is the correct mean for ratios (like speedup). This report is comparing the time with and without a code change. That’s what you need in order to know the impact of the code change in terms of performance. You want a report for both compilation time and model benchmark time (always reported together). Ideally the time to generate such a benchmark report should be very low, but in reality it’ll probably take a while. If it’s more than an hour, that’s probably not good (you are profiling your benchmarks, right?). You want the report from the command line to be in the form of a permanent HTTP link pointing to the report. That link can then be put into a code review, so the reviewers can see what the impact of your change is without having to rerun the benchmarks for themselves, or it can be put into a report for your company, such as a written case for promotion.
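
Computing the geomean is simple enough that it is worth showing, if only so nobody is tempted to use an arithmetic mean of ratios (a quick sketch, computed via logs to avoid overflow):

#include <cmath>
#include <vector>

// Geometric mean of per-benchmark speedups. An arithmetic mean would
// overweight large ratios and is not symmetric between speedups and
// slowdowns; the geometric mean is.
double Geomean(const std::vector<double>& speedups) {
  if (speedups.empty()) return 1.0;
  double log_sum = 0.0;
  for (double s : speedups) log_sum += std::log(s);
  return std::exp(log_sum / speedups.size());
}

// e.g. Geomean({2.0, 1.0, 1.0}) is about 1.26: one benchmark got 2x faster,
// the other two were unchanged.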

If you don’t have this, I’d suggest to work at it until you do. I don’t know why you wouldn’t. It is a force multiplier for your entire team.

An important property of a benchmarking system is how noisy the numbers are. To measure this, run the benchmark a few times with a change that doesn’t do anything, so that before and after should be identical. If you don’t get a perfect 1.00 impact result every time, then your benchmark is noisy. An optimization that gives you 1% across your benchmark set is a good optimization, so you will want benchmarks that are stable enough so that you can trust a 1% difference to be real, not just noise. If you don’t make special efforts here, your benchmarks are probably going to be very noisy. You may want to have a daily (nightly) run that measures noise in your benchmarking process and shows it as a graph and with notifications in case of an increase in noise. This is important since if your benchmarks have suddenly become noisy, your team may not know about it yet, so important decisions can then end up being based on differences that in fact are just noise. Also, if you have one of these Employee A types on your team, they can run the benchmark 10 times until it gets the result that they want and that will work if the measurement is noisy. Promotions and pay are in part based on such numbers (“I improved X by Y%”), so there is a very real motivation for that kind of shenanigans. Noisy benchmarks are bad. Unless you want to confuse your organization - then maybe they are very useful. Don’t accept noisy benchmarks.

Where does all this noise in benchmark results come from? For speed, your benchmarks need to run in parallel across your fleet of machines. This will likely be the test fleet and may even be coming from a cloud provider. This introduces many noise problems:

  1. Varying load If machines in your test fleet are also running tests at the same time, that will introduce noise since the load on the machine is varying. So you will want to acquire exclusive use of each machine for each benchmark, so that each machine is otherwise quiet. If your test fleet is coming from a cloud provider, this may not be possible. You may need to buy physical machines only for the purpose of benchmarking.

  2. Temperature If the temperatures of the chips in your machines are not identical across benchmark runs, then throttling features of high performance chips will cause variation in the results (all high performance chips have throttling features). This is true for both the CPU and your AI HW. It is hard to manage getting an identical temperature, but one thing that may help is to leave the device idle for a second before running the benchmark or running the benchmark twice while discarding the first result. If nothing else works, you may also choose to pre-throttle the machine so that it is always running at a slower clock during benchmarks. This messes up the measurement a bit, since your customers will not be using that configuration, but at least it can be less noisy. Make sure that your test fleet machines are always returned to their default, faster state after a benchmark, even if the benchmark is interrupted by e.g. a segmentation fault.

  3. Varying machines Are all your test machines identical? Are you sure? What if one machine is standing in a hot corner of your datacenter and another machine is in a cold corner? That’s already non-identical. One way around this is to run all benchmarks in parallel across machines, but to run the before and after for a given benchmark on the same machine. That reduces before/after noise. It will also reduce your parallelism by a factor of two, but that may well be worth it. You can also run all the benchmarks always on the exact same machine, but then your benchmarks will likely take too long to run overall after a while (benchmarks accumulate) because you will have no parallelism.

  4. Measure the right thing However strange it may sound, I can report that it is quite easy to create a benchmarking infrastructure that ends up measuring things that shouldn’t be included in inference or training step time and these additional things tend to be more noisy than the real signal that you are after. One bad example here is if the measured time includes the time to acquire exclusive use of the device (a process that can be hidden away inside the infrastructure), which could take anywhere from a nanosecond to a minute. Inspect carefully when you start and stop the clock and what occurs between the two. You also probably want to exclude time taken to transfer inputs/outputs and binaries to/from the device. Also take care to know whether you are measuring wall-clock time or CPU cycles time or device cycles time. Measuring all 3 times is fine, but if you only report one, you want wall clock time. This is the real end-to-end measure. There is a small sketch of where to start and stop the clock after this list.

  5. Natural variation Whatever you do, there is still going to be variation. For this reason, you may prefer to run the benchmark a few times and report the median or minimum time instead of just running it once. You will likely want to run very fast benchmarks many times and long running benchmarks perhaps just once. For important runs (like the ones you promote people based on), maybe run the whole report separately a few times to ensure that you get the same result. If you use Bazel, which can also be used for benchmarks, be aware that you have to tell it to do the same thing again, otherwise it will just give you the same cached result back many times without remeasuring.

  6. Wrong baseline You can’t cache a baseline number to compare against across your team. It may sound like a nice 2x on benchmark speed to reuse the same baseline, but it causes meaningless results. You have to compare a change to a baseline that is the revision just before that change. If your baseline is a fixed revision that changes e.g. nightly, then people are not getting just the effect of their own change. This is an optimization that I have in fact seen used and it’s not a good idea.
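
Here is the promised sketch of where to start and stop the clock. The helper takes the run-and-wait step as a callback so it stays generic; acquiring the device, uploading inputs and downloading outputs should all happen outside of it (the lease/kernel names in the usage comment are illustrative, not a real API):

#include <chrono>
#include <functional>

// Times only the run-and-wait step, as wall clock time. Device acquisition,
// input/output transfers and binary transfers should all happen outside this
// function, and therefore outside of the measurement.
double TimedRegionMs(const std::function<void()>& run_and_wait) {
  const auto start = std::chrono::steady_clock::now();
  run_and_wait();  // launch the already-compiled kernel and wait for completion
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Usage (illustrative names): acquire the device and upload inputs first, then
//   double ms = TimedRegionMs([&] { lease.Run(kernel, in, out); lease.WaitUntilDone(); });
// and only afterwards download outputs and release the device.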

Getting a handle on all this is necessary if you will be building serious AI software. You need to be able to run benchmarks easily and (somewhat) quickly and you need to be able to trust the results. Benchmarking is hard.

You will also want to ensure that you store the logs of all the benchmarks. Benchmarks can crash or exhibit other problems and you need some easy way to investigate that. You probably also want a feature where you can specify a single benchmark to run, or a specific few, instead of the default larger set, on the command line.

Due to the virtues of ABAT, Always Be Adding Tests (benchmarks are performance tests), over time you will accumulate benchmarks (good) so that you will have to curate the set of benchmarks that run by default when people write “run_benchmarks” on the command line. Otherwise it will take too long. This is unfortunate but will eventually be necessary. Certainly profile and optimize your benchmarks and complain about not having enough hardware before you start doing this. What is the utilization of your AI HW during benchmarks? It can be and should be quite a bit higher than what you’ve achieved for your unit tests, since benchmarks run for longer on the device per benchmark than unit tests normally do. So make sure that that is the case.

You will also want a daily (nightly) run of your benchmarks that is permanently recorded and which feeds into graphs that your whole team can easily access, so that everyone can see how performance is trending over time for each benchmark model. This is also how you notice if a regression in performance makes it into the shared code repository (which can easily happen). For this nightly run, you will want to run many more benchmarks than are run by default when team members type “run_benchmarks”. This is feasible since it only runs once per day.

Avoid stigmatizing people for having submitted performance regressions. It happens and it’s not a big deal as long as all this infrastructure is in place to catch it. Don’t make your team afraid to change the code, including junior developers and interns. If someone keeps doing it all the time irresponsibly, sure, have a conversation about it, but it probably shouldn’t be a public conversation.

You may also prefer to be able to run benchmarks against competing AI software. I think that’s a great idea. How else will you be able to claim that your AI software or hardware is better than everyone else’s AI software, as all AI software and hardware vendors claim? You can even have a version of your benchmarks where the before/after is your product against their product. There are some troubles here. You probably know how to properly configure your own product (better check, though!), but you may not know how to properly configure your competitors’ products. It’s easy to end up comparing your parallel product against a competitor’s product that, unfortunately, only got one thread to run on, which is a meaningless comparison even if it enables you to claim a 30x speedup (I’ve seen this happen). There are many ways to get this wrong. Do not expect setting up benchmarking to be easy to get right. In fact, wrong benchmarks are probably part of how it is possible for all vendors to claim that their product is best (another factor is only reporting numbers for models where you are faster). If you are a customer, never believe the benchmarks coming from vendors. If you are a vendor yourself, you probably will want to know what’s really up, though, so better to have correct benchmarks. A sophisticated customer will set this up for themselves, so you can’t hide the truth. If a specific AI model is much faster on a competitor’s product, that’s also an easy way to tell that there is something to be improved in your own product on that AI model, to the benefit of your own customers.

Does it sound like benchmarking is not that easy? If that’s what you think now, then you are correct. Teams commonly don’t do this at all or do it quite poorly to save time having to deal with all this. This is a false economy of development effort. Easy (easy to run), stable and fast benchmarking is a force multiplier on your entire team, so you will want to get it right.

This is a summary of the three most important concepts (recall that benchmarks are included in “tests”):

  1. ABAT - Always Be Adding Tests (always improve your test suite)

  2. ABP - Always Be Profiling (always optimize your tests and inspect HW utilization)

  3. ATTAM - Always Try To Acquire Machines (keep asking for more test hardware)

If ABP and ATTAM don’t seem necessary, you have not been doing enough ABAT (of course postpone ATTAM until you need it - don’t worry, you will soon).

There really are AI compiler teams out there that never optimized their tests at all (probably missing out on a 100x speedup) and that never considered getting more hardware. So they just stopped writing tests because otherwise running the tests is too slow (and maybe some of the engineers don’t like writing tests, maybe in part because the testing APIs are bad - you need good testing APIs). However absurd that may seem after you’ve read all this, this is a real thing that happens. Don’t let it happen to you. ABAT, ABP, ATTAM.

If you are a manager, how much test hardware should you budget for? More than that. Whatever it was. See, that’s how you do ATTAM. More seriously, you will have to weigh this carefully. Ensure your engineers are doing ABAT and ABP and you’ll have to see how far the budget for hardware can reasonably go after that. Don’t fall into the trap of thinking that “there has to be a limit” implies “any limit is as good as any other limit”. This is similar to “zero is a hard number, so any number will do.” That’s wrong and that’s why you need testing in the first place. There is a test hardware budget that is right for your situation and you’ll have to figure out what it is.

If people follow this advice literally, have I created an economically inefficient situation where engineers are trying to pull too many funds into test hardware? You could imagine such a thing. First of all, don’t follow this too literally. Second, it is more likely that you are neglecting this area than overinvesting in it. Though if your engineers want more test hardware and they didn’t even profile their tests (a common situation), then I’ve given you some ammunition to push back here.

The power of assertions and/or internal errors

It is far better to get a consistent internal error than to get a wrong result from an AI model. For this reason, you will want to put assertions / internal error checks everywhere in your AI compiler. All over the place, in every function, absolutely everywhere. If these trigger often for customers, so that your assertion practice may start to have a poor reputation, that’s still much better than not having the assertions and getting correctness errors instead. If customers are seeing too many assertions triggering, then your problem is the overall quality of your product and of your testing practices. The problem is not the assertions themselves. Even if an assertion would be in error (which almost never happens in my experience), your testing efforts should have uncovered that. Assertions are good. Use them everywhere. Ship your product to your customers with assertions enabled - but see the paragraph below on legal liability.

Assertions can slow your product down. This happens primarily if an assertion is checking a property that takes too long to check. You might also, very rarely, have an issue with assertions, even fairly cheap ones, occurring inside inner loops and therefore taking too long. Both cases should be noticeable when you profile your product (you are profiling your product, right?). In this case you might have to remove an offending assertion that’s just too slow. That’s fine. Still keep adding other assertions, though. Most assertions are quite cheap. Assertions are good.

If using C++, never use the C assert() macro - it is compiled out entirely when NDEBUG is defined (i.e. in typical release builds) and it carries no message. Instead, use a macro like the Abseil CHECK() macro, which stays enabled and lets you provide a useful dynamic message along with the error, e.g.

ABSL_CHECK(a == b) << "a (" << a << ") should be equal to b (" << b << ").";

Now we know the values of a and b when the condition fails, not just that they weren’t equal. This can be very useful sometimes. This particular case is so common that there is a short-hand for it:

ABSL_CHECK_EQ(a, b);

Be aware that the code to generate the message is only ever executed if the CHECK fires, so that isn’t a performance overhead on the happy path. There might be indirect performance effects through code size and adding branches to your code - I’d consider those small overheads to be well worth it, excepting the hottest inner loops in your code.

If you don’t want to take a dependency on Abseil just for a CHECK macro, that’s fine. Just make your own CHECK macro. You can do it in a small header. Don’t put unnecessary messages on every CHECK assertion, but sometimes it can be quite useful. Also during debugging.
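
For what it’s worth, here is roughly what such a small header could look like (a sketch, not production code; it aborts on failure and supports streaming extra context in the same spirit as ABSL_CHECK):

#include <cstdlib>
#include <iostream>
#include <sstream>

namespace mini {

// Collects a message and aborts when destroyed (i.e. at the end of the failing
// CHECK statement). Only ever constructed when the condition is false, so the
// happy path does no message work at all.
class CheckFailure {
 public:
  CheckFailure(const char* condition, const char* file, int line) {
    stream_ << file << ":" << line << ": CHECK failed: " << condition << " ";
  }
  template <typename T>
  CheckFailure& operator<<(const T& value) {
    stream_ << value;
    return *this;
  }
  ~CheckFailure() {
    std::cerr << stream_.str() << std::endl;
    std::abort();
  }

 private:
  std::ostringstream stream_;
};

}  // namespace mini

#define MINI_CHECK(condition) \
  if (condition) {            \
  } else                      \
    mini::CheckFailure(#condition, __FILE__, __LINE__)

// Usage: MINI_CHECK(a == b) << "a (" << a << ") should be equal to b (" << b << ").";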

What should you do when an assertion triggers? The default is to shut the program down. This can be a good solution, since it forces reporting of the issue and operating in an incorrect state is a menace - also for cyber security. However, some cases like self driving software might call for a different response if the car is not currently in a safe situation to shut down or restart. My advice is: consider carefully what to do in case of a segmentation fault, then do that also for assertions. This advice doesn’t help you figure out what to do, but it halves the number of decisions to make, it might help you realize that you can in fact change what happens on segmentation fault (and sometimes, very rarely, should) and, most importantly, it makes it clear that the problem of what to do in case an internal error is encountered isn’t introduced by using assertions - you have to figure that out anyway. If your solution is to continue to operate the program, be sure to consider what happens if there is a high rate of segmentation errors or assertions - log disk space may overflow and, also, if the logging or other handling is slow, you might slow your program down so much that it can’t be used anyway (very bad in self driving). Before you get too depressed about this issue, consider that the real line of defence is to have a very low rate of errors in the first place and that’s something assertions help with.

This all leads to the awful issue, saved for last, of what to do about bugs and potential legal liability caused by assertions (or by not having them - could be construed as an irresponsible development practice). Assertions on the whole reduce bugs (and therefore deaths), that is why I advise to use them, but assertions are code. Any code that you write can result in bugs and therefore deaths. E.g. even a seemingly side-effect-free assertion can follow a wild pointer, leading to a segmentation fault. Just as any code can. You might remove assertions when you ship to the customer, but are you 100% sure that all the code inside all your assertions doesn’t include any necessary side effects that you now removed? This is a source of bugs, so I don’t suggest this. You might also have an incorrect assertion - it is reporting an error, but there isn’t an error (the assertion itself is the error). On the whole I think the most responsible development process is to assert liberally and to handle what happens when an assertion triggers carefully. I’d suggest to try to avoid policy decisions that will motivate your development team to avoid asserting liberally - e.g. if you require an error recovery document written for every assertion, there may well be no assertions at all, so you’ve in effect decided not to use them. There is no decision that you can make here, including not using assertions in the first place, that doesn’t have some serious problems with it. If you can, sure, use formal methods and prove that every assertion is correct and side effect free, but this is unlikely to be practically feasible for most software. You’ll want to ask your lawyers about error detection and recovery, including assertions (show them this paragraph).

Whole model debugging

Suppose your AI software has a bug that causes an AI model to fail. Worse, your suite of unit tests has failed to catch the bug. Uh oh. Hopefully, you were running some whole models as part of your nightly tests and this is how you caught the bug. In that case you can figure out (bisect) which of the changes from the prior day were responsible - not great to deal with, but also not horrible. But let’s say that’s not what happened. The bug escaped your testing entirely and has either been reported from a customer or this is a new model that you’ve just added to your test suite. Hopefully, one of your assertions caught the issue, which makes things much easier. But let’s say that’s not what happened either. There is no error, the model just doesn’t give the right result when run on your software and hardware. Let’s say the user model has 100k ops in it. Something is wrong somewhere. It can be extraordinarily challenging to figure out what is causing a situation such as this. Oh, and the customer is waiting, so if you can get this done real quick, that would be nice. Sometimes it can be resolved quickly, but sometimes it can take weeks of hard work to figure something like this out.


First of all, you are in this situation because your unit test suite isn’t good enough. You clearly need more unit tests if this is happening. But zero is a hard number, so, very infrequently, even if you have an excellent suite of unit tests, this might still happen.

What to do? I can tell you what happened the first time this occurred on the XLA team. We all stared at each other and said something like “oh fuck, what now?” The whole team got together and brainstormed how to address such situations (we had a good manager who realized that this was necessary). The team had many ideas and we went off and did all of them. Over time, it became clear that just two of these ideas turned out to resolve most cases like this. It turns out that most bugs are either something where some individual op (or fusion of ops) is producing an incorrect result locally, or, if that isn’t it, the problem is likely that there is some kind of memory error somewhere (accessing an address in memory that shouldn’t have been accessed - this can have non-local effects). The good news is that both of these kinds of errors can be diagnosed automatically if you have the right tools for it. I haven’t seen either of these two tools in other AI compilers, but I’ll strongly suggest for everyone to build them. The function of both tools is to tell you which exact op it is that is having a problem. From there you’ll have to debug that single op, but that is much easier than doing whole model debugging.

The Isolator If an op (or fusion of ops) is doing the wrong thing, the way to figure that out is to take the user model, separate it into individual ops (or fusions of ops), and test each op (or fusion of ops) individually against your reference backend. So, yeah, you must have a reference backend for this. The tool in XLA that does this is called The Isolator. In most situations of whole model debugging, the isolator tool will quickly tell you the exact op that is the root cause of the issue. So you are going to want to build a tool like that.

The primary complication in building an isolator tool like this is that you have to provide the isolated op (or fusion of ops) with inputs so that you can run the ops and compare the result to the result from the reference backend. So, yeah, you must have a reference backend for this. One way to do this is to capture the inputs used inside the actual user model. That will work, though this is going to be some trouble to set up and will generate very large files of inputs. An alternative is to just use random numbers as the inputs. This has some issues, chief among them that the floating point numeric properties of the op in question may be bad on a random input, even if the numerics are fine on the actual data used inside the model. This makes the output noisy and can result in a mismatch against the reference backend even if there is really nothing wrong with the op. Note that the reference backend will not likely produce an identical output due to floating point reassociation, among other reasons. In XLA, there is a very complicated set of heuristics inside The Isolator to figure out how to generate data that will likely be numerically stable for a given op (or fusion of ops). There are also some ops that e.g. take indices, like DynamicSlice, which won’t be tested very well with purely random data. Using heuristics to resolve this can work very well. This means that The Isolator can diagnose bugs on models without needing any associated data for that model. If you don’t want to deal with that, you’ll have to capture live internal data for the user model. Or you’ll have to deal with false positives. One of those three.
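
Here is the isolator idea reduced to a sketch. The model is abstracted down to a list of ops where each op already knows how to run just itself on the device and on the reference backend, with captured or heuristically generated inputs; in a real tool those callbacks would be derived automatically from the user model (the struct and names are illustrative assumptions, not XLA’s actual code):

#include <cmath>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One op (or fusion of ops) extracted from the model, plus a way to run it
// in isolation on the device and on the reference backend.
struct IsolatedOp {
  std::string name;
  std::function<std::vector<float>()> run_on_device;
  std::function<std::vector<float>()> run_on_reference;
};

// Runs every op in isolation and returns the ones whose device result doesn't
// (approximately) match the reference backend - usually a very short list, and
// usually the answer to the whole-model mystery.
std::vector<std::string> FindSuspectOps(const std::vector<IsolatedOp>& ops,
                                        float tolerance = 1e-4f) {
  std::vector<std::string> suspects;
  for (const IsolatedOp& op : ops) {
    const std::vector<float> device_out = op.run_on_device();
    const std::vector<float> reference_out = op.run_on_reference();
    bool close = device_out.size() == reference_out.size();
    for (std::size_t i = 0; close && i < device_out.size(); ++i) {
      close = std::fabs(device_out[i] - reference_out[i]) <= tolerance;
    }
    if (!close) suspects.push_back(op.name);
  }
  return suspects;
}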


The Isolator is very effective. For example, the XLA bug that Anthropic found would likely have been immediately automatically diagnosed using The Isolator.

The Bounds Checker Sometimes The Isolator will report that all the ops in a model are working correctly, yet the final result is still wrong. This is rare, but it happens. How can that be? The most likely reason for such a situation is that one of the ops is accessing memory that it should not be accessing. This can cause some other op to later receive incorrect data, even while the offending op that is the root cause of the issue doesn’t output any incorrect data itself. The Isolator will not detect such a situation. For this reason, XLA has a feature called bounds checking. It ensures that all memory accesses are checked to see if they are as expected. Running a model with this feature enabled will pinpoint which exact op it is that is accessing memory incorrectly. This is so desirable that XLA:TPU enables partial bounds checking at all times, even if you don’t ask for it, which was made possible due to performance work that reduced the overhead of bounds checking. Bounds that are particularly expensive to check are still left unchecked unless you ask XLA to enable them.
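
As a host-side illustration of the idea (a real compiler instead emits such checks into the generated device code, which is rather more work), a bounds checked buffer wrapper might look like this:

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Every access is checked against the buffer size, and a violation names the
// offending op and aborts, instead of silently corrupting some other op's data.
class BoundsCheckedBuffer {
 public:
  explicit BoundsCheckedBuffer(std::size_t size) : data_(size) {}

  float& at(std::size_t index, const char* op_name) {
    if (index >= data_.size()) {
      std::fprintf(stderr, "op %s accessed index %zu of a buffer of size %zu\n",
                   op_name, index, data_.size());
      std::abort();
    }
    return data_[index];
  }

 private:
  std::vector<float> data_;
};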

It is a rare situation where neither The Isolator nor the bounds checker can resolve a whole model debugging problem down to an exact op. It can happen, but it is rare. In those rare cases, it is still very useful to know that both of those tools failed to find the bug, because that is a big hint for where to find the bug - the bug can’t be anywhere where those tools would find it, so it has to be elsewhere. So these tools are very useful even in the rare cases where they fail to find an offending op (or fusion of ops).

Another possible bug is where high-level optimizations rewrote the computational graph incorrectly somehow. Bounds checking and The Isolator work at a lower level of abstraction than this and so will not detect such an error. This is easier to diagnose, though - disable the optimization passes one at a time (or all at once) and see what effect that has. We didn’t see many cases like this in whole model debugging, though, I expect because most such cases were already well tested using unit tests. These optimizations are usually not so hard to unit test. Though if you see more bugs of this kind than we did in XLA, perhaps you’ll want a tool for this kind of debugging as well.

Productive and non-error-prone APIs

Not making errors in the first place is upstream of finding them with tests. This isn’t something you can do 100%, but some software designs and APIs are much more error prone than others and it is a good thing to prefer less error prone designs, all other things being equal.

Productive test APIs and productive APIs in general are also important. I gave a concrete example of a productive test API in an earlier section.


Productive APIs and non-error-prone APIs are two things I know when I see them, but I don’t know how to write a long section about how to achieve them. They are nonetheless very important, so the fact that I didn’t say much about them shouldn’t be taken to mean they don’t matter. Arriving at such APIs may just require long-standing interest, some talent, and a lot of experience, but maybe someone cleverer than me can cover it better.

Updates

2025-01-12 Sean Silva mentioned an IR invariant checker on LinkedIn. Yep, that’s important! This is something that you should have - something to run between compiler passes that detects if IR invariants have been violated. This helps a lot to find bugs in compiler passes and can detect some intermittent issues that may otherwise escape your test suite. Running it between all compiler passes might be too slow, except during debugging, but you may well want to run it as the first and last passes even in production, and perhaps also a few places in-between.
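As a hedged sketch of how such a checker might be wired in (the names and structure below are illustrative, not from XLA or any particular compiler): verify before and after the pipeline in production, and optionally after every pass while debugging.

struct Module; // stand-in for your IR container

trait Pass {
    fn name(&self) -> &str;
    fn run(&self, module: &mut Module);
}

// Returns human-readable invariant violations; empty means the IR is valid.
// Real checks would cover def-before-use, type agreement, CFG well-formedness, ...
fn check_invariants(_module: &Module) -> Vec<String> {
    Vec::new()
}

fn run_pipeline(module: &mut Module, passes: &[Box<dyn Pass>], verify_each_pass: bool) {
    assert!(check_invariants(module).is_empty(), "invalid IR before pipeline");
    for pass in passes {
        pass.run(module);
        if verify_each_pass {
            let violations = check_invariants(module);
            // Pinpoints the pass that broke the IR instead of leaving a mystery for later.
            assert!(violations.is_empty(), "after pass {}: {:?}", pass.name(), violations);
        }
    }
    assert!(check_invariants(module).is_empty(), "invalid IR after pipeline");
}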

Auto-grading decade-old Hacker News discussions with hindsight

Hacker News
karpathy.bearblog.dev
2025-12-10 17:23:53
Comments...
Original Article

hnhero

Yesterday I stumbled on this HN thread Show HN: Gemini Pro 3 hallucinates the HN front page 10 years from now , where Gemini 3 was hallucinating the frontpage of 10 years from now. One of the comments struck me a bit more though - Bjartr linked to the HN frontpage from exactly 10 years ago , i.e. December 2015. I was reading through the discussions of 10 years ago and mentally grading them for prescience when I realized that an LLM might actually be a lot better at this task. I copy pasted one of the article+comment threads manually into ChatGPT 5.1 Thinking and it gave me a beautiful analysis of what people thought + what actually happened in retrospect, even better and significantly more detailed than what I was doing manually. I realized that this task is actually a really good fit for LLMs and I was looking for excuses to vibe code something with the newly released Opus 4.5, so I got to work. I'm going to get all the front pages of December (31 days, 30 articles per day), get ChatGPT 5.1 Thinking to do the analysis, and present everything in a nice way for historical reading.

There are two macro reasons for why I think the exercise is interesting more generally:

  1. I believe it is possible to train your forward future predictor. You can actually get better at predicting the future, but you need training data and effort. I always thought that "predicting the future" could be a serious, productive class that people could imagine taking in college or studying professionally. Superforecasters agree.
  2. I was reminded again of my tweets that said "Be good, future LLMs are watching" . You can take that in many directions, but here I want to focus on the idea that future LLMs are watching. Everything we do today might be scrutinized in great detail in the future because it will be "free". A lot of the ways people behave currently I think make an implicit "security by obscurity" assumption. But if intelligence really does become too cheap to meter, it will become possible to do a perfect reconstruction and synthesis of everything. LLMs are watching (or humans using them might be). Best to be good.

Vibe coding the actual project was relatively painless and took about 3 hours with Opus 4.5, with a few hiccups but overall very impressive. The repository is on GitHub here: karpathy/hn-time-capsule . Here is the progression of what the code does:

  • Given a date, download the frontpage of 30 articles
  • For each article, download/parse the article itself and the full comment thread using Algolia API.
  • Package up everything into a markdown prompt asking for the analysis. Here is the prompt prefix I used:
The following is an article that appeared on Hacker News 10 years ago, and the discussion thread.

Let's use our benefit of hindsight now in 6 sections:

1. Give a brief summary of the article and the discussion thread.
2. What ended up happening to this topic? (research the topic briefly and write a summary)
3. Give out awards for "Most prescient" and "Most wrong" comments, considering what happened.
4. Mention any other fun or notable aspects of the article or discussion.
5. Give out grades to specific people for their comments, considering what happened.
6. At the end, give a final score (from 0-10) for how interesting this article and its retrospect analysis was.

As for the format of Section 5, use the header "Final grades" and follow it with simply an unordered list of people and their grades in the format of "name: grade (optional comment)". Here is an example:

Final grades
- speckx: A+ (excellent predictions on ...)
- tosh: A (correctly predicted this or that ...)
- keepamovin: A
- bgwalter: D
- fsflover: F (completely wrong on ...)

Your list may contain more people of course than just this toy example. Please follow the format exactly because I will be parsing it programmatically. The idea is that I will accumulate the grades for each account to identify the accounts that were over long periods of time the most prescient or the most wrong.

As for the format of Section 6, use the prefix "Article hindsight analysis interestingness score:" and then the score (0-10) as a number. Give high scores to articles/discussions that are prominent, notable, or interesting in retrospect. Give low scores in cases where few predictions are made, or the topic is very niche or obscure, or the discussion is not very interesting in retrospect.

Here is an example:
Article hindsight analysis interestingness score: 8
---
  • Submit prompt to GPT 5.1 Thinking via the OpenAI API
  • Collect and parse the results
  • Render the results into static HTML web pages for easy viewing
  • Host the html result pages on my website: karpathy.ai/hncapsule
  • Host all the intermediate results of the data directory if someone else would like to play. It's the file data.zip under the exact same url prefix (intentionally avoiding a direct link).

I invite people to browse around the results. I spent about 2 hours and found them to be very interesting. A few examples just for fun:

And then when you navigate over to the Hall of Fame, you can find the top commenters of Hacker News in December 2015, sorted by an IMDb-style score of their grade point average. In particular, congratulations to pcwalton, tptacek, paulmd, cstross, greglindahl, moxie, hannob, 0xcde4c3db, Manishearth, johncolanduoni - GPT 5.1 Thinking found your comments very insightful and prescient. You can also scroll all the way down to find the noise of HN, which I think we’re all familiar with too :)

I hope people enjoy the results. My code (wait, Opus’ code?) on GitHub can be used to reproduce or tweak all of the results. Running 31 days of 30 articles through GPT 5.1 Thinking meant 31 * 30 = 930 LLM queries in total, cost about $58, and took somewhere around an hour. The LLM megaminds of the future might find this kind of a thing a lot easier, a lot faster and a lot cheaper.

Valve: HDMI Forum Continues to Block HDMI 2.1 for Linux

Hacker News
www.heise.de
2025-12-10 17:20:06
Comments...
Original Article

The HDMI Forum, responsible for the HDMI specification, continues to stonewall open source. Valve's Steam Machine theoretically supports HDMI 2.1, but the mini-PC is software-limited to HDMI 2.0. As a result, more than 60 frames per second at 4K resolution are only possible with limitations.

In a statement to Ars Technica, a Valve spokesperson confirmed that HDMI 2.1 support is "still a work-in-progress on the software side." "We’ve been working on trying to unblock things there."

The Steam Machine uses an AMD Ryzen APU with a Radeon graphics unit. Valve strictly adheres to open-source drivers, but the HDMI Forum is unwilling to disclose the 2.1 specification. According to Valve, they have validated the HDMI 2.1 hardware under Windows to ensure basic functionality.

The restriction imposed by the HDMI Forum was already criticized in early 2024 by an AMD employee responsible for Linux . Even then, according to AMD, they had submitted a functional, HDMI 2.1-compatible driver, which the HDMI Forum rejected.

"Unfortunately, the HDMI Forum rejected our proposal," it was stated at the time. "At this time an open source HDMI 2.1 implementation is not possible without running afoul of the HDMI Forum requirements."

Only HDMI 2.1 offers sufficient bandwidth for 120 or 144 Hertz at 3840 × 2160 pixels without compression. Furthermore, this version introduced manufacturer-independent variable refresh rates (HDMI VRR). Valve enables 4K and 120 Hertz using chroma subsampling, a compression technique that is particularly noticeable with text. VRR functions in the form of AMD's Freesync, which requires compatible displays.

Alternatively, interested parties can use an active adapter from DisplayPort 1.4 to HDMI 2.1 to increase the frame rate without compression. However, these adapters do not officially support VRR. Popular variants from Club3D are no longer available; offers from less well-known providers (starting from €35.67) are still available in price comparisons.

( mma )


This article was originally published in German . It was translated with technical assistance and editorially reviewed before publication.

EFF Launches Age Verification Hub as Resource Against Misguided Laws

Electronic Frontier Foundation
www.eff.org
2025-12-10 17:15:09
EFF Also Will Host a Reddit AMA and a Livestreamed Panel DiscussionSAN FRANCISCO—With ill-advised and dangerous age verification laws proliferating across the United States and around the world, creating surveillance and censorship regimes that will be used to harm both youth and adults, the Electro...
Original Article

SAN FRANCISCO—With ill-advised and dangerous age verification laws proliferating across the United States and around the world, creating surveillance and censorship regimes that will be used to harm both youth and adults, the Electronic Frontier Foundation has launched a new resource hub that will sort through the mess and help people fight back.

To mark the hub's launch, EFF will host a Reddit AMA (“Ask Me Anything”) next week and a free livestreamed panel discussion on January 15 highlighting the dangers of these misguided laws.

“These restrictive mandates strike at the foundation of the free and open internet,” said EFF Activist Molly Buckley. “While they are wrapped in the legitimate concern about children's safety, they operate as tools of censorship, used to block people young and old from viewing or sharing information that the government deems ‘harmful’ or ‘offensive.’ They also create surveillance systems that critically undermine online privacy, and chill access to vital online communities and resources. Our new resource hub is a one-stop shop for information that people can use to fight back and redirect lawmakers to things that will actually help young people, like a comprehensive privacy law.”

Half of U.S. states have enacted some sort of online age verification law. At the federal level, a House Energy and Commerce subcommittee last week held a hearing on “Legislative Solutions to Protect Children and Teens Online.” While many of the 19 bills on that hearing’s agenda involve age verification, none would truly protect children and teens. Instead, they threaten to make it harder to access content that can be crucial, even lifesaving, for some kids .

It’s not just in the U.S.  Effective this week, a new Australian law requires social media platforms to take reasonable steps to prevent Australians under the age of 16 from creating or keeping an account.

We all want young people to be safe online. However, age verification is not the panacea that regulators and corporations claim it to be; in fact, it could undermine the safety of many.

Age verification laws generally require online services to check, estimate, or verify all users’ ages—often through invasive tools like government ID checks, biometric scans, or other dubious “age estimation” methods—before granting them access to certain online content or services. These methods are often inaccurate and always privacy-invasive, demanding that users hand over sensitive and immutable personal information that links their offline identity to their online activity. Once that valuable data is collected, it can easily be leaked, hacked, or misused.

To truly protect everyone online, including children, EFF advocates for a comprehensive data privacy law.

EFF will host a Reddit AMA on r/privacy from Monday, Dec. 15 at 12 p.m. PT through Wednesday, Dec. 17 at 5 p.m. PT, with EFF attorneys, technologists, and activists answering questions about age verification on all three days.

EFF will host a free livestream panel discussion about age verification at 12 p.m. PDT on Thursday, Jan. 15. Panelists will include Cynthia Conti-Cook, Director of Research and Policy at the Collaborative Research Center for Resilience ; a representative of Gen Z for Change ; EFF Director of Engineering Alexis Hancock ; and EFF Associate Director of State Affairs Rindala Alajaji . RSVP at https://www.eff.org/livestream-age .

For the age verification resource hub: https://www.eff.org/age

For the Reddit AMA: https://www.reddit.com/r/privacy/

For the Jan. 15 livestream: https://www.eff.org/livestream-age

U.S. State Department Changes Back to Times New Roman From Calibri

Daring Fireball
www.nytimes.com
2025-12-10 17:04:43
Michael Crowley and Hamed Aleaziz, reporting for The New York Times: While mostly framed as a matter of clarity and formality in presentation, Mr. Rubio’s directive to all diplomatic posts around the world blamed “radical” diversity, equity, inclusion and accessibility programs for what he said ...
Original Article

Please enable JS and disable any ad blocker

iksemel rusted

Lobsters
thinkerf.blogspot.com
2025-12-10 17:00:47
Comments...
Original Article

I had written some Rust code before, but it was for command-line tools, Advent of Code solutions, and similar small-scale programs. I was looking for a suitable project to delve deeper into Rust and decided to rewrite an old C project of mine in Rust!

Iksemel was a fast and easy-to-use XML/XMPP library which I developed over two decades ago. It was written in highly portable C and used in all sorts of places, including: an instant messenger on the Amiga home computer; an embedded sensor with 64 kilobytes of memory that posted measurements to users on Google Talk; the Scientific and Technological Research Council of Turkey's Linux distribution's Pisi Package Manager via its Python bindings; and a bunch of Symbian devices.

Did you know that Google Talk used to speak the open XMPP protocol before it was enshittified?

I haven't worked much on it recently since neither the XML format nor the XMPP instant messaging protocol is widely used anymore. But it was perfect for the task. The document tree was managed with self-referential and intrusive data structures with custom memory management, a challenge to the borrow checker! There were lots of low-level tricks for performance. It was also a complete package with tests, documentation, tools, and Python bindings, hence many areas to explore.

You can find the shiny new Iksemel Rust code here and the preliminary Python bindings here .

The rewrite took longer than I expected. The official Rust documentation is decent for someone coming from high-level languages, and perhaps even for those with no programming background, but it is not enough for C folks. The necessary information is spread between blog posts, issue comments, source code, and the collective wisdom of the community. This motivated me to write this post.

Despite the learning phase, the overall development process was much nicer and faster than the C version! I remember constant crashes and hard-to-debug bugs. The Rust version was immediately crash-free despite the use of unsafe , most of the code worked like a charm on the first try, and debugging in rare cases was a breeze. This was a result of the combination of the strong type system, memory safety features, and the comprehensive test tooling of Rust.

I haven't had a chance to profile and optimize the actual parsing code, or take advantage of SIMD and other modern hardware yet, but the performance looks excellent already. All benchmarks are lies, so take this with a grain of salt, but it is faster than libxml2 and even the read-only roxmltree crate on several giant Wikipedia dumps I have tested with.

xmlThrDefParserInputBufferCreateFilenameDefault is an actual function name of libxml2.

Of the Borrow Checker

"Don't fight with the borrow checker!" is sound advice. Unfortunately there are some common and necessary programming patterns in C which initially seemed impossible to implement in Rust. I have found a four-step process to avoid fights:

0. Using values.

Solving the problem with clone() , to_string() , etc., is usually the compiler's suggestion and general beginner advice. If the performance loss caused by this were generally acceptable, programs wouldn't be written in C in the first place, so this is not a solution C programmers would settle for. However, even in Iksemel, there were a few places where it did the job perfectly.

1. Disentangling the data.

In most cases, it is possible to get rid of circular references by refactoring, such as splitting a God object into parts with orthogonal responsibilities. This generally makes the program more testable, composable, and understandable as well, so it is a very good thing to do even if the final solution involves the next steps.

2. Using dedicated Rust facilities.

If the borrow checker lifetime model isn't working, the reference counting model can be used instead with Rc or Arc . If shared/interior mutability is desired, RefCell or Mutex exists. Pin can help with self-references. There are other goodies in the standard library and crates. They are well-implemented and their performance costs are negligible everywhere except under the most demanding hot loops. The majority of programs have no reason to look further beyond this step.
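For illustration, here is a tiny example (unrelated to the iksemel code) of combining Rc with RefCell to get shared, runtime-checked mutability that a plain &mut reference would not allow:

use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Two owners of the same vector; RefCell checks the borrow rules at runtime.
    let shared = Rc::new(RefCell::new(Vec::<u32>::new()));
    let other = Rc::clone(&shared);
    shared.borrow_mut().push(1);
    other.borrow_mut().push(2);
    assert_eq!(*shared.borrow(), vec![1, 2]);
}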

3. Using unsafe Rust.

If the language facilities are not enough, and performance is really important, this is the ultimate solution. There is a stigma attached to it, mostly because it is often used to bypass the borrow checker quickly without good justification, but it is perfectly fine to use when you have honestly worked through all the above steps and have a business requirement for the performance. After all, the standard library has a ton of unsafe code. You really must apply the same level of engineering discipline as the standard library, though. I will give a thorough checklist in a moment.

A good way to look at this is to understand that Safe Rust rejects all unsound programs but also rejects some sound programs, because the borrow checker has both theoretical and practical limitations. Unsafe Rust, on the other hand, allows all sound programs but also allows some unsound programs.

Of Callbacks and Events

One of the entanglement points was the implementation of the parser hooks. Iksemel is built in layers, with a so-called SAX (serial access) parser which is more or less a tokenizer at the bottom, and the document and XMPP stream parsers on top of each other. When the SAX parser produces an element, it calls the registered hook with the element information. The API is exactly what you would expect from C:

typedef int (iksTagHook)(void *user_data, char *name, char **atts, int type);
typedef int (iksCDataHook)(void *user_data, char *data, size_t len);
iksparser *iks_sax_new (void *user_data, iksTagHook *tagHook, iksCDataHook *cdataHook);
int iks_parse (iksparser *prs, const char *data, size_t len, int finish);

The parser calls these registered hooks with the provided user_data to allow upper level parsers to maintain their state. Now, there are ways to implement these sorts of callbacks with Fn / FnMut / FnOnce and generics, but I realized how limiting this kind of API is going to be while playing with some examples, and implemented this instead:

pub fn parse_bytes<'a>(
    &'a mut self,
    bytes: &'a [u8],
) -> Result<Option<(SaxElement<'a>, usize)>, ParseError>

The return type looks a bit complicated but compiles into a very compact structure thanks to niche optimizations. In plain language, it returns either an error, a pair of the encountered element and the number of bytes of the buffer that were processed, or None meaning that all bytes were processed without producing a complete element yet.

Note that the returned element is a reference and has the same lifetime as both the parser and the input buffer. This allows the parser to return a reference from either the input data or from the internal parser buffers if data needs to be modified, thus avoiding memory copies while ensuring that the caller cannot free either source before dealing with the returned reference.

Since the callback mess is gone, infinite layers of higher-level modules can be built on top without angering the borrow checker. In fact, the document tree parser mimics exactly the same API:

pub fn parse_bytes(&mut self, bytes: &[u8]) -> Result<(), ParseError> {}

pub fn into_document(mut self) -> Result<Document, ParseError> {}

All the incoming bytes are passed to the SAX parser internally, returned elements are inserted into the tree, and once everything is parsed, the final document could be retrieved.

Rust's ability to efficiently return complex types makes it easy to implement this kind of API, which is highly suitable for Sans-IO.

Of Lending Iterators

Not everything is perfect yet. One downside to this API is that the caller must handle the bookkeeping for the processed bytes and repeatedly call the function. I wanted to implement the standard Iterator trait for this so it could be used with a for loop; unfortunately, the trait does not support returning an item with a lifetime tied to the iterator itself.

This requires a pattern known as a "lending iterator," which is possible in Rust but incompatible with the standard Iterator trait. Here is the implementation:

impl<'a> SaxElements<'a> {
    pub fn new(parser: &'a mut SaxParser, bytes: &'a [u8]) -> Self { ... }

    pub fn next(&mut self) -> Option<Result<SaxElement<'_>, ParseError>> { ... }
}

It cannot be used with a for loop, but it is still very easy to use with while let :

let mut elements = parser.elements(b"<doc>example</doc>");
while let Some(result) = elements.next() {
     println!("Element parsed: {:?}", result?);
}

Of Self-Referential Structs

Iksemel stores the XML data inside a custom memory arena containing two bump-allocator areas: one for the element structs and one for the character data. Container structs are allocated within the arena chunks, and the element structs have internal pointers forming the tree structure, which makes navigation fast and easy. Reducing the allocation count and packing similar objects together like this has a huge performance impact, but it cannot be done with Safe Rust at the moment.

There are some crates like Bumpalo which implement bump allocation for the general case, but this wasn't enough for Iksemel. There are also a couple of crates for implementing self-referential structs, but I found they obscured what was going on due to heavy macro use, and they weren't flexible enough either.

Of Unsafe

Unsafe code can introduce undefined behavior. We don't want to go back to the misery of C by doing that, so I have collected an absolute minimum list of rules to ensure success.

0. Before even doing anything, the following resources must be internalized:

Pointer provenance:

1. Unsafe code must be contained within a safe API.

This is how Safe Rust is possible in the first place: by encapsulating the dangerous parts in an impossible-to-misuse abstraction. This API has two jobs: keeping the internal invariants of the unsafe code true, and not leaking anything to the outside which breaks the invariants of Safe Rust.

As an example, the XML document tree is represented as the struct Document , and any editing and navigation operations return a struct Cursor to move around the document and perform operations. The Document contains all the data, and the Cursor is essentially a reference to a certain node of the document tree.

let doc = Document::from_str("<a><b>lala</b></a>");
let cursor: Cursor = doc.find_tag("b").first_child();
assert_eq!(cursor.cdata(), "lala");

These few lines contain a bunch of invariants to uphold. The biggest internal invariant is that the backing memory area is not released back to the system while the cursor is pointing to it. Thankfully, Rust already solves this problem with lifetimes:

pub struct Cursor<'a> {
    node: *mut Node,
    arena: &'a Arena,
}

impl Document {
    pub fn find_tag<'a>(&'a self, name: &str) -> Cursor<'a> {
        // ...
    }
}

Earlier versions of this code had PhantomData<Document> to link the lifetimes without an actual data member. Now it has a reference to the Arena member of the Document for reasons explained later in this post. Either way, this API ensures that the returned Cursor cannot stay alive longer than the Document . When the Document is dropped and the memory is freed, all the pointers into it are already gone.

On the other hand, a C compiler would happily compile this without an error:

iks *document, *cursor;
int e;
document = iks_tree("<a><b>lala</b></a>", &e);
if (IKS_OK == e) {
    cursor = iks_find(document, "b");
    iks_delete(document);
    printf("crash here = %s", iks_name(cursor));
}

Another internal invariant is that no modifications via other cursors can invalidate an existing Cursor . This is ensured by the design which leaves the modified or deleted nodes intact in place and just hides them by changing internal links. These links can only be followed by one Cursor at a time, which is further ensured by not declaring cursors as Sync .

External invariants are a little bit more complicated. Thankfully, there is enough guidance in the Rust Reference and the Rustonomicon book. You might have noticed that Cursor::cdata returns a string reference to the internal buffer. It uses the &str type to avoid copying. Therefore, it follows &str invariant rules by: a) attaching its lifetime to the Cursor to prevent early free, b) ensuring that the returned pointer is never null , and c) ensuring all the strings inside the document character data memory are valid UTF-8 strings at all times.

2. There is Unsafe and there is YOLO.

Iksemel does use raw pointers, but they are all typed. When there is a need to calculate the size and alignment of an object to make space, the Layout module from the standard library is used instead of making assumptions. No conversion between a pointer and an integer is done. Pointer arithmetic to find space in the arena uses *const::byte_add and never jumps from one allocation area to another. std::mem::transmute is never used. The same memory area is never mapped to different structs.

The main theme between these avoided operations is that they challenge the compiler's model of data representation in memory. While there are rare cases in the standard library where they are needed, they are considerably riskier than regular unsafe use.
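As a simplified sketch of that kind of Layout-driven arithmetic (not the actual iksemel arena; Node and the chunk size are placeholders), here is how the offset of the next object in a chunk can be computed without hard-coding its size or alignment:

use std::alloc::Layout;

const CHUNK_SIZE: usize = 4096; // illustrative chunk size

struct Node {
    _tag: u32, // fields elided; Layout::new reports the real size and alignment
}

// Returns the aligned offset for the next Node in the chunk, or None when the
// chunk is full and a new chunk would have to be allocated.
fn next_node_offset(used: usize) -> Option<usize> {
    let layout = Layout::new::<Node>();
    // Round the current offset up to the required alignment.
    let aligned = used.checked_add(layout.align() - 1)? & !(layout.align() - 1);
    let end = aligned.checked_add(layout.size())?;
    (end <= CHUNK_SIZE).then_some(aligned)
}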

3. Unsafe code must have negative tests.

It is always a good thing to check that things which shouldn't happen are indeed not happening. This is quite easy with Cargo tests and doctests for runtime invariants. Testing that the compiler rejects API misuse turned out to be a bit harder. The best way I found to do this is to use rustdoc tests with the compile_fail flag like this:

/// Cursor clone cannot outlive the Document:
/// ```compile_fail
/// let c2: Cursor;
/// {
///     let doc = Document::from_str("<a><b/></a>")?;
///     let c1 = doc.root();
///     c2 = c1.clone();
/// }
/// println!("{}", c2);
/// ```

Unfortunately, you cannot specify the expected error, so occasionally you have to run cargo test -- --nocapture to check that they are not failing for other reasons. I haven't tried it yet, but this could be a good candidate to automate via snapshot testing.

This check would have been completely unnecessary in Safe Rust because the borrow checker wouldn't let you return anything from c1 which references its own data without proper lifetime association. But since the compiler cannot check the intention behind the raw pointers, a simple typo in the method signature in unsafe Rust can detach two lifetimes from each other with disastrous results.

4. All unsafe code must be tested under Miri

Miri is a sort of interpreter which runs your Rust code and checks for any unsound operations at runtime. Note that the only way it can be as effective as the Safe Rust compiler is to exercise all codepaths. That means you must have a comprehensive test suite.

It is quite slow; Iksemel's test suite takes a second for the CI run, but takes 18 minutes with Miri. Still, the release action of the Iksemel crate is gated on a successful Miri run, and I run Miri whenever I make changes to the unsafe code as well.

5. Tests must be tested, preferably with cargo mutants

Miri is useless unless you execute all codepaths under inspection. There are different ways to ensure test coverage, but I found that mutation testing works better than others. Classic code coverage gives more importance to a hundred-line print banner function than to a single complicated conditional expression. More on that in the next chapter.

6. Unsafe code must have examples and documentation.

Using the library you wrote is super simple. Trying to explain how to use it to someone else, on the other hand, makes you realize even more corner cases, safety issues, and all sorts of awkwardnesses in your API.

7. Code must be idiomatic and heavily linted.

I wrote the Iksemel C code in GNU C style, which wasn't a great choice in hindsight, but it was still a lot better than not having a style at all. I'm happy that Rust syntax is a lot less flexible and tools like cargo fmt apply a highly opinionated style to everything. It helps when reading other people's code, using examples, and making unexpected things stand out amongst the surrounding lines. The last bit is especially important here since unsafe code depends on human review.

Rust also has a ton of lints to check problematic patterns and non-idiomatic code. Some of these are not always applicable, but it is generally easier to enable all and give permission with the expect attribute when you have a good reason to ignore them.

Iksemel enables all of them except the pedantic and restriction categories, and then cherry-picks some of those as well:

#![deny(clippy::suspicious)]
#![deny(clippy::complexity)]
#![deny(clippy::perf)]
#![deny(clippy::style)]
#![deny(clippy::cargo)]
#![deny(clippy::items_after_statements)]
#![deny(clippy::missing_panics_doc)]
#![deny(clippy::uninlined_format_args)]
#![deny(clippy::unnecessary_semicolon)]
#![deny(clippy::unreadable_literal)]
#![deny(clippy::allow_attributes_without_reason)]
#![deny(clippy::panic)]
#![deny(clippy::partial_pub_fields)]
#![deny(clippy::redundant_test_prefix)]

The allow_attributes_without_reason lint is particularly useful. It forces you to explain the rationale whenever you have to ignore lints like this:

#[expect(
    clippy::inherent_to_string_shadow_display,
    reason = "prereserving exact capacity makes this method significantly faster"
)]
fn to_string(&self) -> String { ... }

Of Mutation Testing

The following snippet is simplified from the unsafe code which handles the removal and insertion of an attribute using a classic linked list approach:

// in the attribute removal code
if (*tag).last_attribute == attr {
    (*tag).last_attribute = (*attr).previous;
}

// in the attribute insertion code
if !(*tag).last_attribute.is_null() {
    (*(*tag).last_attribute).next = attribute;
    (*attribute).previous = (*tag).last_attribute;
}
(*tag).last_attribute = attribute;

I was surprised when cargo-mutants showed a miss for the following mutation:

if (*tag).last_attribute != attr {

To get full coverage for the four branches in these two if statements, it is sufficient to test adding three attributes, removing the middle one, and then removing the last one. I had more tests than that, but the crucial scenario of removing the last element and then adding it back was forgotten!

Note that the bug exposed by this mutation depends on the order of operations, and mutant testing was able to find the testing gap quite easily.

Of Interior Mutability

I changed the Cursor API several times as I played with it while implementing higher-level parsers and command-line tools. The first iteration was like this:

#[repr(transparent)]
struct Cursor<'document> {
    node: *mut Node,
    _phantom: PhantomData<Document>,
}

This implementation was good enough for the navigation methods. But the editing methods required access to the backing Arena structure, which was inside the Document . The C version didn't have the Cursor and Document separation; every node contained an arena pointer and the same iks* type referred to both a complete document and individual nodes, but that approach wasted a pointer per node. So, to avoid going back, I tried the following:

let doc = Document::from_str("<test/>");
doc.root().insert_cdata(&doc, "hello")?.append_cdata(&doc, " world!")?;

A worthy effort, but futile! The problem was the cursor lifetime we connected to the document. The borrow checker sees that and does not allow the document to be reborrowed by the same owner. Running away from the fight but keeping the lifetime safety, I changed it to:

pub struct Cursor<'a> {
    node: *mut Node,
    arena: &'a Arena,
}

It felt like the structure size increased at first, but in practice, the compiler can easily analyze and optimize out the arena reference copying.

At this point, astute readers have already noticed that there is something missing with the arena reference, or with the Cursor itself, for that matter. Both are shared references! So how can they allocate memory or edit the document?

It is quite useful in practice to have two or more cursors pointing to different elements while traversing and editing XML documents. This doesn't play well with the borrow checker's "only one mutable reference" rule and requires "interior mutability", which is usually implemented with a RefCell or related standard library facilities.

Raw pointers don't have this limitation, so all I had to do was not expose the mutability difference to the outside via the methods of Arena and Cursor .
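Here is a self-contained sketch of that pattern (deliberately not the iksemel API): a handle whose method takes &self yet mutates the data it points at, because the write goes through a raw pointer rather than through a &mut reference. Its soundness rests on invariants like the ones described above.

struct Slot {
    value: i32,
}

struct Handle {
    slot: *mut Slot, // raw pointer: Handle is neither Send nor Sync, like Cursor
}

impl Handle {
    // &self, yet the pointee is mutated: the mutability stays internal.
    // Sound here because the Slot is leaked (never freed or moved) and no
    // &mut Slot is ever created while handles are in use.
    fn bump(&self) {
        unsafe { (*self.slot).value += 1 };
    }
}

fn main() {
    let slot: &'static mut Slot = Box::leak(Box::new(Slot { value: 0 }));
    let handle = Handle { slot: slot as *mut Slot };
    handle.bump();
    handle.bump();
    assert_eq!(unsafe { (*handle.slot).value }, 2);
}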

Of Concurrency

The Document is Send but not Sync , which means it can be sent across threads but can only be accessed inside one thread at a time. If there is a need to access it from multiple threads simultaneously, it is easy to put it into an Arc<Mutex<Document>> , but that does not extend to the cursors. It is generally necessary to remember the last position in scenarios where processing happens in multiple steps. An XMPP server thread might append the newly received elements into a document's last position for the currently handled connection, for example.

Another case is releasing a Cursor or a Document into Python. Lifetimes and marker traits for thread safety do not exist in Python. This wasn't a big issue in the C version because Python was relatively safer than C anyway, and the free-threaded interpreter did not exist.

Unfortunately, implementing a multi-threaded cursor requires access to the internals of the Document and unsafe use, so it has to be provided by the Iksemel crate to keep it safe and contained.

The solution is a separate SyncCursor which you can create by consuming a Document . This is necessary to guarantee that no references to the document are alive, and no further thread-unsafe references can be created afterward. It uses Arc and Mutex internally for thread-safe access. The overhead is negligible and well worth the safety in multi-threaded operations or for Python.

Of Strings

Rust strings are amazing! You could argue that they can be implemented in C, and in fact, Iksemel stored the length of every string to avoid recalculating it repeatedly. However, it also had to null-terminate them because the user would eventually need to pass a char* pointer to functions expecting a null terminator.

char *iks_stack_strcat (ikstack *s, char *old, size_t old_len, const char *src, size_t src_len);

This C API concatenates the two strings and inserts them into the arena (called a "stack" there); if the first string is already in the arena and there is space after it, it is simply extended. All good and fast, but there are so many ways to misuse this, such as passing the same arena string twice. These can be handled with some extra checks, but not at compile time.

pub fn concat_str<'a>(&'a self, old_s: &str, s: &str) -> Result<&'a str, NoMemory>

This Rust API, on the other hand, can work with any &str the caller provides since extending the string does not need to remove a null terminator, and all inserted strings remain immutable even while being extended.

I was a bit worried initially about the Display trait. If I used that to convert the document tree into XML text, would it be fast enough? My concern was about the formatter object being passed to the trait, and the extra function call indirections going through that instead of just using String::push . To my amazement, the generated assembly code in release build mode and speed tests matched the direct implementation exactly! The Rust compiler basically removed all the indirections.

One detail that required intervention was avoiding extra allocations for the final string object:

fn to_string(&self) -> String {
    let mut buf = String::with_capacity(self.str_size());

Walking over the tree once to calculate the final string size via the str_size method and reserving that capacity before serializing gives a measurable speed boost even for smallish documents, and a very large boost for big ones.

It looks like there is some internal mechanism in the formatting code to provide a size hint, but it is not exposed to implementors of the Display trait.

Of the Standard Library

Speaking of the standard library, let's talk about the elephant in the room. The Rust standard library is a bit too small, a sentiment shared by many people I've spoken with at community events.

As an example, both C and Python have a getpass function to read a password from the user, but in Rust, you need to depend on a third-party crate which isn't well-maintained. This was used in the included iksjab utility, which lets you send XMPP messages and perform other tasks from the command line.

The name resolver was a bit awkward too, requiring Iksemel to inject the default XMPP port into the address provided by the user before doing the lookup, instead of allowing it to be specified or omitted like literally every other resolver API. I was able to avoid another dependency with:

pub(super) fn need_port(host: &str) -> bool {
    let column_pos = host.rfind(':');
    let bracket_pos = host.find(']');
    match (column_pos, bracket_pos) {
        (None, None) | (None, Some(_)) => true,
        (Some(_), None) => false,
        (Some(column), Some(bracket)) => column < bracket,
    }
}

fn resolve_host_with_default_port(
    host: &str,
    default_port: u16,
) -> std::io::Result<std::vec::IntoIter<SocketAddr>> {
    if need_port(host) {
        (host, default_port).to_socket_addrs()
    } else {
        host.to_socket_addrs()
    }
}

But it was a lot of code just to call a simple lookup function, and I got it wrong on the first try (did you notice the rfind?).

Thankfully, Iksemel was mostly self-sufficient, so this is more of an issue when you are coming from a language like Python. When you need something, it is extremely easy to add it via Cargo, but while there are dozens of crates for every problem, their quality is all over the place, and evaluating them is difficult.

Of Python Bindings

This turned out to be the easiest part. Literally:

$ maturin new
$ maturin generate-ci github > .github/workflows/CI.yml

I then added #[pyclass] , #[pymethods] , and similar decorators around my functions, and boom! A working set of Python bindings.

Maturin (build system) and Pyo3 (bindings) have thought of everything, and they are well-documented.

Of Modules

The official documentation suggests using a modern file structure like this:

sax.rs
sax/parser.rs
sax/tests.rs
document.rs
document/synccursor.rs
document/tests.rs

This is probably the only one of Rust's thousands of decisions I don't agree with. Thankfully, the original structure is still supported:

sax/mod.rs
sax/parser.rs
sax/tests.rs
document/mod.rs
document/synccursor.rs
document/tests.rs

This mimics the __init__.py or index.ts usage from other modern languages and makes manual navigation much easier. Many crates also seem to follow this convention.

Of the Aftermath

* If Python is "batteries included", then Rust is "power tools included". The unit tests, documentation generator, documentation tests, formatting and linting, dependency management, build system, runtime validation, analyzers for IDEs, instrumentation, benchmarking, and seamless Python binding generation are either part of the language or just one command away in the ecosystem.

Can we count this as a productivity win for Rust if C programmers don't bother with any of those, though?

* Rust's type system is not new if you have used OCaml, Haskell, or a similar functional language before. The difference is that you get it with zero overhead. Having type safety while beating C in speed is glorious!

* Writing Rust code is like having a constructive conversation with a highly knowledgeable but strict intellectual. The conversation is always focused on trade-offs or edge cases. You can choose the easy solution if it is good enough for your requirements, such as using more memory with a clone, or just panicking when an unexpected scenario happens, but it is always an informed decision, and never hidden behind opaque abstractions.

* Unsafe Rust is not well-introduced to newcomers in the main sources and is a little bit unergonomic. The good news is that it is improving fast, which could be the reason why documentation is lagging behind a bit. The need for it is also decreasing as the language and the ecosystem evolve, but C and C++ programmers will still want it. The other good news is that if you follow the guidelines, you can still achieve a level of safety not seen in C or C++.

* It is probably unfair to compare my original development experience of Iksemel with this Rust port on technical merits alone, as there is a two-decade gap in time and experience between them. I had a ton of fun back then. Rust made me feel the same joy of learning and experimenting today.

9 Mothers (YC X26) Is Hiring

Hacker News
app.dover.com
2025-12-10 17:00:22
Comments...

Bring a Crowd to Park Slope's New Mumbai-Style Indian Restaurant

hellgate
hellgatenyc.com
2025-12-10 16:36:49
The chef is from Masalawala & Son next door, and the food here is fantastic....
Original Article
Bring a Crowd to Park Slope's New Mumbai-Style Indian Restaurant
Goat biryani, $28 (Scott Lynch / Hell Gate)

$20 Dinner
View on map

The chef is from Masalawala & Son next door, and the food here is fantastic.

Scott's Picks:

Remember Curry Mee, the very solid Malaysian restaurant that opened a couple of summers ago on Fifth Avenue in Park Slope? The food was good, the back patio pleasant, the Great Wave off Kanagawa mural over the bar, a holdover from the previous tenant, was... totally out of place?

Well, RIP, because Curry Mee owner John Liao apparently couldn't convince the neighborhood to eat more nasi lemak.


DeepSeek uses banned Nvidia chips for AI model, report says

Hacker News
finance.yahoo.com
2025-12-10 16:34:52
Comments...
Original Article

(Bloomberg) -- Chinese artificial intelligence startup DeepSeek has relied on Nvidia Corp. chips that are banned in the country to develop an upcoming AI model, according to a new report in The Information.

Nvidia’s Blackwell chips were smuggled into China through countries that permitted their sale, The Information reported, citing unnamed sources. More specifically, DeepSeek tapped chips that were installed in data centers in unspecified countries, then dismantled and shipped to China after clearing inspection by companies developing server equipment, The Information said.

Most Read from Bloomberg

The US bans the sale of these advanced semiconductors to China, which has led AI developers there to access the hardware through data centers located outside of the mainland or subterfuge. In November, US prosecutors charged two Chinese nationals and two US citizens with a scheme to ship chips to China by way of Malaysia using a fake real estate business.

A representative for DeepSeek didn’t immediately respond to a request for comment.

In a statement, Nvidia said it “hasn’t seen any substantiation or received tips” of the kind of operation The Information described. “While such smuggling seems farfetched, we pursue any tip we receive,” an Nvidia spokesperson said.

Explainer: A Guide to the Nvidia Chips at Center of US-China Rivalry

DeepSeek drew global attention in January when it debuted an AI model that was competitive with Silicon Valley’s best and said it had built it at a fraction of the cost. The startup was funded by the Chinese hedge fund High-Flyer, which had amassed 10,000 Nvidia GPUs in 2021, prior to US bans on exports of sophisticated Nvidia chips and other graphics processing units.

Earlier this week, President Donald Trump granted Nvidia permission to ship to China an older version of its AI accelerators, the H200. An export ban on its more powerful Blackwell version remains in place.

Beijing has meanwhile pushed Chinese technology companies to rely on domestic equipment to develop AI. DeepSeek released a new model in September and indicated that it was working with Chinese chipmakers on the model.

--With assistance from Ed Ludlow.

(Updates with comment from Nvidia and more context on smuggling starting in the second paragraph)

Most Read from Bloomberg Businessweek

©2025 Bloomberg L.P.

England Historic Aerial Photo Explorer

Hacker News
historicengland.org.uk
2025-12-10 16:13:57
Comments...

Qwen3-Omni-Flash-2025-12-01:a next-generation native multimodal large model

Hacker News
qwen.ai
2025-12-10 16:13:38
Comments...

Dark mode

Simon Willison
simonwillison.net
2025-12-10 16:05:34
I've never been particularly invested dark v.s. light mode but I get enough people complaining that this site is "blinding" that I decided to see if Claude Code for web could produce a useful dark mode from my existing CSS. It did a decent job, using CSS properties, @media (prefers-color-scheme: dar...
Original Article

I've never been particularly invested in dark vs. light mode, but I get enough people complaining that this site is "blinding" that I decided to see if Claude Code for web could produce a useful dark mode from my existing CSS. It did a decent job, using CSS properties, @media (prefers-color-scheme: dark) and a data-theme="dark" attribute based on this prompt:

Add a dark theme which is triggered by user media preferences but can also be switched on using localStorage - then put a little icon in the footer for toggling it between default auto, forced regular and forced dark mode

The site defaults to picking up the user's preferences, but there's also a toggle in the footer which switches between auto, forced-light and forced-dark. Here's an animated demo:

This site on mobile. Clicking the icon in the footer switches to a black background with readable text.

I had Claude Code make me that GIF from two static screenshots - it used this ImageMagick recipe:

magick -delay 300 -loop 0 one.png two.png \
    -colors 128 -layers Optimize dark-mode.gif

The CSS ended up with some duplication due to the need to handle both the media preference and the explicit user selection. We fixed that with Cog.

[$] Mix and match Linux distributions with Distrobox

Linux Weekly News
lwn.net
2025-12-10 16:05:05
Linux containers have made it reasonably easy to develop, distribute, and deploy server applications along with all the distribution dependencies that they need. For example, anyone can deploy and run a Debian-based PostgreSQL container on a Fedora Linux host. Distrobox is a project that is de...

Launch HN: InspectMind (YC W24) – AI agent for reviewing construction drawings

Hacker News
www.inspectmind.ai
2025-12-10 16:05:03
Comments...

Launch HN: InspectMind (YC W24) – AI agent for reviewing construction drawings

Hacker News
news.ycombinator.com
2025-12-10 16:05:03
Comments...
Original Article

Hi HN, we're Aakash and Shuangling of InspectMind ( https://www.inspectmind.ai/ ), an AI “plan checker” that finds issues in construction drawings, details, and specs.

Construction drawings quietly go out with lots of errors: dimension conflicts, co-ordination gaps, material mismatches, missing details and more. These errors turn into delays and hundreds of thousands of dollars of rework during construction. InspectMind reviews the full drawing set of a construction project in minutes. It cross-checks architecture, engineering, and specifications to catch issues that cause rework before building begins.

Here’s a video with some examples: https://www.youtube.com/watch?v=Mvn1FyHRlLQ .

Before this, I (Aakash) built an engineering firm that worked on ~10,000 buildings across the US. One thing that always frustrated us: a lot of design coordination issues don’t show up until construction starts. By then, the cost of a mistake can be 10–100x higher, and everyone is scrambling to fix problems that could have been caught earlier.

We tried everything including checklists, overlay reviews, peer checks but scrolling through 500–2000 PDF sheets and remembering how every detail connects to every other sheet is a brittle process. City reviewers and GC pre-con teams try to catch issues too, yet they still sneak through.

We thought: if models can parse code and generate working software, maybe they can also help reason about the built environment on paper. So we built something we wished we had!

You upload drawings and specs (PDFs). The system breaks them into disciplines and detail hierarchies, parses geometry and text, and looks for inconsistencies:

  • Dimensions that don’t reconcile across sheets
  • Clearances blocked by mechanical/architectural elements
  • Fire/safety details missing or mismatched
  • Spec requirements that never made it into drawings
  • Callouts referencing details that don’t exist

The output is a list of potential issues with sheet refs and locations for a human to review. We don’t expect automation to replace design judgment, just to help AEC professionals not miss the obvious stuff. Current AIs are good at obvious stuff, plus can process data at quantities way beyond what humans can accurately do, so this is a good application for them.

Construction drawings aren't standardized and every firm names things differently. Earlier “automated checking” tools relied heavily on manually-written rules per customer, and break when naming conventions change. Instead, we’re using multimodal models for OCR + vector geometry, callout graphs across the entire set, constraint-based spatial checks, and retrieval-augmented code interpretation. No more hard-coded rules!

We’re processing residential, commercial, and industrial projects today. Latency ranges from minutes to a few hours depending on sheet count. There’s no onboarding required; simply upload PDFs. There are still lots of edge cases (PDF extraction weirdness, inconsistent layering, industry jargon), so we’re learning a lot from failures, maybe more than successes. But the tech is already delivering results that couldn’t be done with previous tools.

Pricing is pay-as-you-go: we give an instant online quote per project after you upload the project drawings. It’s hard to do regular SaaS pricing since one project may be a home remodel and another may be a highrise. We’re open to feedback on that too, we’re still figuring it out.

If you work with drawings as an architect, engineer, MEP, GC preconstruction team, real estate developer, or plan reviewer, we’d love a chance to run a sample set and hear what breaks, what’s useful, and what’s missing!

We’ll be here all day to go into technical details about geometry parsing, clustering failures, code reasoning attempts or real-world construction stories about how things go wrong. Thanks for reading! We’re happy to answer anything and look forward to your comments!

Size of Life

Lobsters
neal.fun
2025-12-10 16:03:47
Comments...

The Fragile Lock: Novel Bypasses For SAML Authentication

Lobsters
portswigger.net
2025-12-10 15:37:33
Comments...
Original Article

Zakhar Fedotkin

  • Published: 10 December 2025 at 12:32 UTC

  • Updated: 10 December 2025 at 12:37 UTC

TLDR

This post shows how to achieve a full authentication bypass in the Ruby and PHP SAML ecosystem by exploiting several parser-level inconsistencies: including attribute pollution, namespace confusion, and a new class of Void Canonicalization attacks. These techniques allow an attacker to completely bypass XML Signature validation while still presenting a perfectly valid SAML document to the application.

Here’s a demo of the attack on a vulnerable GitLab EE 17.8.4 instance:

Table of contents

Abstract

Security Assertion Markup Language (SAML 2.0) is a complex authentication standard built on insecure and outdated XML technology. These legacy foundations have made the protocol notoriously difficult to maintain and have resulted in a persistent stream of critical vulnerabilities over the past two decades.

This paper introduces several novel classes of Signature Wrapping (XSW) attacks capable of completely bypassing authentication in widely used open-source SAML libraries used across the internet.

In addition, I present an open-source toolkit designed to identify and analyze discrepancies between XML parsers - enabling the discovery of authentication bypasses with very few requirements.

The recent increase in SAML vulnerabilities shows that secure authentication cannot happen by accident. Keeping protocols like SAML safe requires coordinated, ongoing effort from the entire security community, not just quick fixes.

Service Provider-initiated SAML Flow

The Service Provider-Initiated (SP-Initiated) SAML flow is the most common way users authenticate through SAML. It starts when a user tries to access a protected resource on the service provider’s website. Since the user is not yet authenticated, the service provider generates a SAML authentication request and redirects the user to the Identity Provider (IdP) for verification.

The IdP receives this request, verifies its validity, and then issues a SAML Response containing a digitally signed Assertion that confirms the user’s identity. This response is sent back via the user’s browser to the service provider (SP). The SP then verifies the digital signature and extracts user information (such as username and email) from the Assertion. If the signature and data are valid, access is granted.

XML Signature Wrapping Attack (XSW)

The overall security of this flow depends entirely on how the SAML Response signature is validated. In many implementations, signature verification and assertion processing are handled by separate modules or even different XML parsers. An XML Signature Wrapping (XSW) attack exploits the discrepancies between these components.

In a typical scenario, an attacker intercepts a legitimate SAML Response signed by a trusted Identity Provider and injects a new malicious Assertion containing arbitrary user information into the same document. When the Service Provider processes the response, the signature verification module correctly validates the legitimate portion of the message, while the SAML processing logic mistakenly consumes the attacker’s injected Assertion. As a result, the attacker’s forged data is treated as authentic, leading to a privilege escalation.

Juraj Somorovsky, in his research "On Breaking SAML: Be Whoever You Want to Be", suggests that this could be done by registering through the IdP, performing a man-in-the-middle attack, or even digging through publicly exposed files using Google dorking. The problem is that this is a big requirement. Getting a valid signed SAML Assertion for an arbitrary website is extremely difficult. Identity Providers almost never expose them, and even if you somehow capture one, most Service Providers will accept it only once; after that it gets cached and rejected.

Complete authentication bypass


So we take a different approach. Instead of trying to steal or reuse a signed Assertion, we simply reuse any other XML document signed with the IdP’s private key.

With that legitimate signature in hand, we can then exploit the server's flawed signature-verification logic and make it believe that our malicious Assertion is the one that was signed, even though it wasn’t.

The Illusion of Safety

In our previous research with Gareth Heyes - SAML roulette: the hacker always wins - we demonstrated how flaws in handling Document Type Declarations (DTDs) could be exploited to perform an XSW attack against the widely used Ruby-SAML library. To mitigate these issues, two security patches were released - versions 1.12.4 and 1.18.0.

In this paper, I use the Ruby-SAML 1.12.4 patches as a case study to demonstrate why incremental fixes are insufficient and why, despite multiple attempts to address XML-related vulnerabilities, the underlying architecture remains fragile.

Flawed XML Security implementation

Security patch 1.12.4 introduced two new checks to ensure that the SAML document does not contain DTDs and is a well-formed XML document. While this eliminated our original exploit, it did not address the root cause of the problem. The XML Security library still relied on two separate XML parsers - REXML and Nokogiri - for different parts of the validation process.

According to the SAML specification, the Assertion element - or one of its ancestor elements - must be referenced by the Signature element, using an enveloped XML Signature.

In the Ruby-SAML implementation, both REXML and Nokogiri locate the Signature element using the XPath query "//ds:Signature" , which retrieves the first occurrence of a signature anywhere in the document. After that, additional logic, implemented in REXML, verifies that the parent element of the signature is an Assertion. This overly permissive XPath query became a key component of the exploit.
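
As a rough illustration of why that permissive query is dangerous, here is a minimal Python/lxml sketch (an illustration only - it is not the Ruby code path, and the "planted"/"legit" markers are invented for the example): the first match in document order is whatever Signature appears earliest, wherever it sits in the document.

from lxml import etree

DS = "http://www.w3.org/2000/09/xmldsig#"
doc = etree.fromstring(
    b'<Response xmlns:ds="http://www.w3.org/2000/09/xmldsig#">'
    b'<Extensions><ds:Signature id="planted"/></Extensions>'
    b'<Assertion><ds:Signature id="legit"/></Assertion>'
    b'</Response>'
)

# "//ds:Signature" matches every Signature in the document, in document
# order, so the first hit is the one planted in Extensions rather than
# the enveloped signature inside the Assertion.
first = doc.xpath("//ds:Signature", namespaces={"ds": DS})[0]
print(first.get("id"))  # planted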

An XML Signature is a two-pass signature mechanism: the hash value of the signed resource (DigestValue) and the URI reference to the signed element are stored inside a Reference element. The SignedInfo block that contains these references is then itself signed, and the resulting Base64-encoded signature is placed in the SignatureValue element. In the Ruby-SAML implementation, REXML is used to extract the DigestValue, which is then compared against the hashed element transformed with Nokogiri. The SignatureValue, also extracted by REXML, is expected to match the SignedInfo element as processed by Nokogiri, creating a fragile dependency between two different parsers with inconsistent XML handling.

Attribute pollution

To craft a reliable exploit, it is important to first understand a fundamental feature of XML - namespaces. XML namespaces provide a mechanism for qualifying element and attribute names by associating them with Uniform Resource Identifiers (URIs).

Namespace declarations are defined using a special family of reserved attributes. Such an attribute’s name must either be exactly xmlns (to declare a default namespace) or begin with the prefix xmlns: (to define a namespace with a specific prefix). For example:

<Response xmlns="urn:oasis:names:tc:SAML:2.0:protocol"/>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">

Both forms are valid and associate elements with the same SAML 2.0 Protocol namespace.

Namespaces are ideal for Signature Wrapping attacks, as they directly influence how XML elements are identified by XPath queries. Most SAML libraries rely on libxml2 for XML parsing. This library inherits numerous legacy quirks.

A great demonstration of libxml2’s fragility is found in Hakim’s "Abusing libxml2 quirks to bypass SAML authentication on GitHub Enterprise (CVE-2025-23369)", which showcases how internal caching behavior can be abused for unexpected XML processing results. Unfortunately, since both Entities and Doctypes are now restricted by the 1.12.4 patch, that particular attack vector is no longer viable - forcing us to explore alternative ways to exploit parsing inconsistencies.

One helpful insight comes directly from the libxml2 documentation of xmlGetProp:

This function looks in DTD attribute declarations for #FIXED or default declaration values.

NOTE: This function ignores namespaces. Use xmlGetNsProp or xmlGetNoNsProp for namespace-aware processing.

Both Ruby (Nokogiri) and PHP expose libxml2 behaviors that can desynchronize signature verification from assertion parsing. In Nokogiri, attribute lookups such as node.attribute('ID') (as opposed to get_attribute) or the shorthand node['ID'] ignore attribute namespaces and use only the simple name. When multiple attributes collide by simple name (e.g., ID and samlp:ID), only one is returned, and the documentation does not guarantee which one.

In PHP’s DOM, DOMNamedNodeMap::getNamedItem also retrieves an attribute by simple name only.

This ambiguity can be directly observed in how parsers resolve attributes. Consider the following two equivalent-looking XML fragments:

<samlp:Response ID="1" samlp:ID="2">    # 1
<samlp:Response samlp:ID="2" ID="1">    # 2

In the first case, the call xmlGetProp returns 1, while in the second case it returns 2.

The difference depends solely on the attribute order within the element - behavior inherited from libxml2. Because the namespace is ignored and the returned attribute is undefined when duplicates exist, developers have no control over which attribute is selected.
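
The same ambiguity is easy to reproduce from Python with lxml, which also wraps libxml2. This is only an illustrative sketch of namespace-aware versus namespace-agnostic lookups, not the exact calls the vulnerable libraries make:

from lxml import etree

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
root = etree.fromstring(
    f'<samlp:Response xmlns:samlp="{SAMLP}" ID="attack" samlp:ID="legit"/>'.encode()
)

# A namespace-aware lookup keeps the two attributes apart ...
print(root.get("ID"))              # attack (no namespace)
print(root.get(f"{{{SAMLP}}}ID"))  # legit  (samlp:ID)

# ... while a lookup keyed only on the local name cannot: it simply returns
# whichever duplicate it happens to encounter first.
naive = next(v for k, v in root.attrib.items() if k.rsplit("}", 1)[-1] == "ID")
print(naive)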

REXML, which implements its own XML parsing logic independent of libxml2, is vulnerable to the same attribute pollution issue. Both attributes['ID'] and get_attribute("ID").value show inconsistent behavior depending on namespace handling.

<Response ID="1" samlp:ID="2">          # 1
<samlp:Response ID="1" samlp:ID="2">    # 2

In the first case, the access to the attribute by attributes['ID'] returns 1, while in the second case it returns 2. When a namespace prefix is present, REXML’s internal lookup treats attribute names differently, leading to the opposite selection order compared to libxml2. This inconsistency means that the same XML document can produce different attribute values across parsers, allowing an attacker to manipulate which element is actually signed versus which one is processed:

< samlp:Response ID = "attack" samlp:ID = "ID" > < Signature >
< Reference URI = "#ID" />
</ Signature >
< samlp:Extensions >
< Assertion ID = "#ID" />
</ samlp:Extensions >
< Assertion ID = "evil" />
</ samlp:Response >

Attack Workflow

  • The signature verification module locates the target of the XML Signature using the XPath query "//*[@ID='id']", which ignores namespaces.
  • Business logic then verifies that the root element’s identifier matches the one referenced by the signature - retrieving the ID via a namespace-agnostic attribute getter (e.g., element['ID'], getNamedItem('ID'), or attributes['ID']).

REXML Namespace confusion without DTDs

As you already know, xmlns is a reserved attribute, and xml is another reserved prefix. Both are defined by the XML specification and cannot be redeclared or bound to different values.

However, in REXML, these are treated internally as regular attributes. This subtle difference creates a significant weakness. By redefining or injecting namespace declarations, an attacker can manipulate how namespace-aware XPath queries behave, causing REXML to resolve elements that other parsers - such as Nokogiri - ignore correctly:

<Signature xml:xmlns='http://www.w3.org/2000/09/xmldsig#'/>

This technique also works in the opposite direction, allowing an attacker to hide the legitimate Signature element from the REXML XPath query "//ds:Signature" while keeping the document valid. By carefully nesting elements and redefining namespaces, it becomes possible to make the Signature node visible to Nokogiri but invisible to REXML:

<Parent xmlns='http://www.w3.org/2000/09/xmldsig#'>
  <Child xml:xmlns='#anything'>
    <Signature/>
  </Child>
</Parent>

This allows the attacker to split signature detection logic, causing the parser to locate and validate a Signature element in an unintended location within the document.

The XML Schema

Now that we can craft a valid XML document that produces two different interpretations in REXML and Nokogiri, the next step is to determine where to inject malicious elements without violating the XML Schema.

The XML Schema Definition (XSD) specifies the syntax and semantics of all XML-encoded SAML protocol messages. In the case of Ruby-SAML, the implementation ships with twelve XSD files, including protocol-schema.xsd, which define the structure and constraints for each element in a SAML Response.

However, XML Schema validation alone does not prevent the inclusion of malicious extensions. A full list of all identified extension points is provided in the supporting materials. Among them, two elements satisfy the key requirement of appearing before the Signature element within a valid SAML Response: the Extensions element and the StatusDetail element. I will use Extensions:

<samlp:Response>
  <samlp:Extensions>
    <Parent xmlns="http://www.w3.org/2000/09/xmldsig#">
      <Child xml:xmlns="#other">
        <Signature>
          <SignedInfo>REAL SIGNATURE</SignedInfo>
        </Signature>
      </Child>
    </Parent>
  </samlp:Extensions>
  <Assertion>
    <Signature>
      <SignedInfo>FAKE SIGNATURE</SignedInfo>
    </Signature>
  </Assertion>
</samlp:Response>

Impossible XSW

At this stage, we can successfully bypass the SignatureValue verification, but the process fails with an invalid DigestValue. The reason lies in how Nokogiri handles canonicalization and digest calculation. During digest computation, the parser temporarily removes the Signature element before calculating the hash, ensuring the signature is not included in the data being signed.

However, in our modified document, the fake Signature element remains inside the Assertion, meaning the parser now attempts to calculate the digest over a string that already contains the signature data itself. This creates a recursive dependency - the digest must include its own hash value - so achieving a valid DigestValue in this scenario would require generating a perfect hash collision.

Void Canonicalization technique

To solve this seemingly impossible problem, we need to take another close look at the SAML specification. According to the standard, the referenced element must be processed through one or more XML transformations before being hashed. By targeting this transformation stage, we open the door to a new class of attack - what I call Void Canonicalization.

Canonicalization defines a consistent way to represent XML documents by standardizing details such as attribute order, whitespace, namespace declarations, and character encoding. This process ensures that two logically identical XML documents produce the same canonical byte stream, allowing reliable digital signatures and comparisons.
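
For intuition, here is a tiny Python/lxml example (an illustration only; the vulnerable code paths discussed in this post are in Ruby and PHP) of canonicalization collapsing superficial differences:

from lxml import etree

a = etree.fromstring(b'<Doc b="2" a="1"/>')
b = etree.fromstring(b'<Doc  a="1"  b="2" ></Doc>')

# Canonical XML normalizes attribute order, whitespace and empty-element
# syntax, so the two logically identical documents serialize to the same
# bytes and therefore hash to the same digest.
print(etree.tostring(a, method="c14n"))
print(etree.tostring(b, method="c14n"))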

Some aspects of canonicalization - such as whether XML comments are included or excluded - have already been exploited in previous Signature Wrapping (XSW) attacks (SAMLStorm: Critical Authentication Bypass in xml-crypto and Node.js libraries). However, beyond these known vectors, there are deeper limitations within the canonicalization process itself that can be abused.

Let's take a look at the XML Signature Recommendation, which explicitly warns about the dangers of relative URIs:

Limitations: the relative URIs will not be operational in the canonical form.

The processing SHOULD create a new document in which relative URIs have been converted to absolute URIs, thereby mitigating any security risk for the new document.

This behavior introduces an opportunity: if the canonicalization process encounters a limitation, such as an unresolved relative URI, it may return an error instead of a canonicalized string. Fortunately for an attacker, only a small number of XML parsers are designed to properly handle such failures. Most implementations silently continue execution, treating the missing output as an empty or “void” canonical form, effectively skipping the data that should have been included in the digest. This powerful inconsistency becomes the foundation of the Void Canonicalization attack class.

Golden SAML Response

To demonstrate this behavior, consider the following SAML Response that exploits the canonicalization weakness:

<samlp:Response xmlns:ns="1">
  <samlp:Extensions>
    <Parent xmlns="http://www.w3.org/2000/09/xmldsig#">
      <Child xml:xmlns="#other">
        <Signature>
          <SignedInfo>REAL SIGNATURE</SignedInfo>
        </Signature>
      </Child>
    </Parent>
  </samlp:Extensions>
  <Assertion>
    <Signature>
      <SignedInfo>EMPTY STRING DIGEST VALUE</SignedInfo>
    </Signature>
  </Assertion>
</samlp:Response>

Here, the declaration xmlns:ns="1" defines a relative namespace URI. It is still a well-formed XML document, but this causes an error during libxml2 canonicalization.

Instead of failing securely, Nokogiri's canonicalization implementation simply returns an empty string when this error occurs. As a result, the subsequent DigestValue calculation is performed over an empty input, producing a valid hash of an empty string (47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= for SHA-256).
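
That constant is easy to confirm; a quick check in Python (hashlib and base64 are standard-library modules):

import base64
import hashlib

# Base64-encoded SHA-256 digest of the empty string - the value a verifier
# ends up comparing against when canonicalization silently returns "".
digest = hashlib.sha256(b"").digest()
print(base64.b64encode(digest).decode())
# 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=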

This behavior can also be exploited if a malicious user gains access to the SignatureValue of the empty string. Because the hash of the canonicalized SignedInfo is what produces the final SignatureValue, an attacker who possesses a precomputed signature for an empty string can reuse it to create a fully valid signature over an arbitrary SAML Response message.

Another exploit of the libxml2 canonicalization logic can be found in my previous exploit for CVE-2025-25292 in the SAML Raider repo. Unfortunately, that payload is not well-formed XML and can no longer be used.

The ruby-saml 1.12.4 and php-saml libraries are vulnerable to the canonicalization exploit, and other PHP XMLDSig implementations, such as Rob Richards’ xmlseclibs, are also affected. In contrast, the XMLSec Library and Shibboleth xmlsectool are not vulnerable.

An example of such a "Golden SAML Response" (a message that always passes signature validation, regardless of how the assertion claims are modified) is available in Appendix 1.

Getting a Valid Signature

Even if a malicious user cannot directly access a signed SAML Assertion, it does not mean there are no valid, IdP-signed XML documents available publicly. Several types of legitimate, signed data can be repurposed for exploitation.

The most straightforward source is SAML metadata. Unfortunately, these files are rarely signed, but in some cases, a signed version can be retrieved by appending parameters such as ?sign=true to metadata URLs.

Another reliable source is the signed error response. According to the SAML specification, the Request Abstract Type requires only three attributes: ID, Version, and IssueInstant. These form the minimal structure for a valid SAML request message. As defined in the SAML Core 2.0 Specification:

If a SAML responder deems a request to be invalid according to SAML syntax or processing rules,

then if it responds, it MUST return a SAML response message

This means that even when a request is malformed or syntactically invalid, the Identity Provider (IdP) may still issue a signed error response to indicate the failure. An invalid AuthnRequest is shown below:

<samlp:AuthnRequest
    ID="&#x80;"
    IssueInstant="INVALID"
    Version="INVALID">
</samlp:AuthnRequest>

A signed error message can also become a source of a void signature if the reflected error content inside the response triggers a canonicalization error, resulting in the digest being computed over an empty string.

Final Exploit

Finally, Web Services Federation metadata is almost always publicly available for major identity providers. These documents provide a convenient and legitimate way for attackers to obtain valid signature elements, even when the XML is not fully compliant with the SAML schema.

Putting it all together:

  • The extracted enveloped signature is inserted into the Extensions extension point
  • A reserved xml attribute namespace declaration hides the Signature element from the SAML processing module but keeps it visible for signature verification
  • A fake signature node remains at the Assertion element but carries the DigestValue of the empty string
  • Finally, Void Canonicalization triggers an unhandled canonicalization error that bypasses the digest check

Real Use Case Scenario

In this large SaaS real-world scenario, which cannot be disclosed in detail, we used the Ruby-SAML exploit together with Gareth Heyes’ research, "Splitting the Email Atom: Exploiting Parsers to Bypass Access Controls", to generate a forged SAML Response, create a new account, and ultimately bypass authentication.

Tools

You can download the Burp Suite extension that automates the entire exploitation process from GitHub. These vulnerabilities will also be added to the SAML Raider extension - stay tuned.

Defense

To mitigate the risks described in this research, the following best practices should be adopted when implementing or maintaining SAML authentication systems:

  • Use strict XML schemas with minimal or no extensibility points.
  • Ensure that only signed elements are used for any future processing (a minimal sketch follows this list).
  • Keep all SAML and XML security libraries up to date, applying the latest security patches and version updates.
  • Avoid using email domain suffixes as a form of access control, as parser discrepancies can be exploited to bypass such restrictions.
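
To illustrate the second recommendation, here is a minimal Python sketch using the signxml package as a stand-in for the Ruby and PHP stacks. The flow and names are illustrative assumptions, not Ruby-SAML's API; the point is that assertion data is read only from the node the verifier reports as signed, never from the raw, attacker-controlled document:

from signxml import XMLVerifier

def extract_subject(raw_response: bytes, idp_cert_pem: str) -> str:
    # verify() raises if the signature or digest check fails, so nothing
    # downstream ever runs on unverified data.
    verified = XMLVerifier().verify(raw_response, x509_cert=idp_cert_pem)

    # Read claims only from the element that was actually covered by the
    # signature, never from the original, attacker-controlled document.
    signed_root = verified.signed_xml
    subject = signed_root.find(
        ".//{urn:oasis:names:tc:SAML:2.0:assertion}NameID")
    if subject is None or not subject.text:
        raise ValueError("signed document contains no NameID")
    return subject.text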

Timeline

  • 29 April 2025 - Details of the Ruby-SAML 1.12.4 vulnerability were shared with the maintainer.
  • 27 August 2025 - Ruby-SAML and PHP-SAML void canonicalization (libxml2) vulnerabilities were disclosed to their maintainers.
  • 10 October 2025 - The libxml2 vulnerability in Rob Richards’ xmlseclibs was reported to the maintainer.
  • 8 December 2025 - Rob Richards’ xmlseclibs released version 3.1.4 to fix the libxml2 canonicalization vulnerability.
  • 8 December 2025 - Ruby-SAML maintainers published an announcement addressing CVE-2025-66568 and CVE-2025-66567, affecting all versions prior to 1.18.0 (including 1.12.4).

Conclusion

Reliable authentication security cannot depend on unsupported or poorly maintained libraries. Comprehensive and lasting remediation requires significant restructuring of existing SAML libraries. Such changes may introduce breaking compatibility issues or regressions, but they are essential to ensure the robustness of XML parsing, signature validation, and canonicalization logic. Without this foundational rework, SAML authentication will remain vulnerable to the same classes of attacks that have persisted for nearly two decades.


Qualcomm acquires RISC-V focused Ventana Micro Systems

Hacker News
www.qualcomm.com
2025-12-10 15:30:46
Comments...

RoboCrop: Teaching robots how to pick tomatoes

Hacker News
phys.org
2025-12-10 15:29:14
Comments...
Original Article
RoboCrop: Teaching robots how to pick tomatoes
The left image shows the tomato-picking robot and camera. The right image shows a 'robot-eye view' of the tomatoes. Red represents mature fruits, green indicates immature fruits, and blue indicates selected harvesting targets. Credit: Osaka Metropolitan University

In the agricultural sector, labor shortages are increasing the need for automated harvesting using robots. However, some fruits, like tomatoes, are tricky to harvest. Tomatoes typically bear fruit in clusters, requiring robots to pick the ripe ones while leaving the rest on the vine, demanding advanced decision-making and control capabilities.

How robots learn to pick tomatoes

To teach robots how to become tomato pickers, Osaka Metropolitan University Assistant Professor Takuya Fujinaga, Graduate School of Engineering, programmed them to evaluate the ease of harvesting for each tomato before attempting to pick it.

Fujinaga's new model uses image recognition paired with statistical analysis to evaluate the optimal approach direction for each fruit. The system uses image processing to assess the fruit, its stems, and whether it is concealed behind another part of the plant. These factors inform the robot's control decisions and help it choose the best approach. The findings are published in Smart Agricultural Technology.

Shifting from recognition to harvest-ease

The model represents a shift in focus from the traditional 'detection/recognition' model to what Fujinaga calls a 'harvest‑ease estimation'. "This moves beyond simply asking 'can a robot pick a tomato?' to thinking about 'how likely is a successful pick?', which is more meaningful for real‑world farming," he explained.

When tested, Fujinaga's new model demonstrated an 81% success rate, far above predictions. Notably, about a quarter of the successes were tomatoes harvested from the right or left side after an initial frontal approach had failed. This suggested that the robot changed its approach direction when it initially struggled to pick the fruit.

Implications for the future of farming

Ultimately, Fujinaga's research highlights the nuance involved in fruit-picking for robots with factors including fruit clustering, stem geometry, background leaves, and occlusion all being important. "This research establishes 'ease of harvesting' as a quantitatively evaluable metric, bringing us one step closer to the realization of agricultural robots that can make informed decisions and act intelligently," he said.

Fujinaga sees a future where robots will be able to independently determine whether crops are ready for harvest. "This is expected to usher in a new form of agriculture where robots and humans collaborate," he explained. "Robots will automatically harvest tomatoes that are easy to pick, while humans will handle the more challenging fruits."

More information: Takuya Fujinaga, Realizing an intelligent agricultural robot: An analysis of the ease of tomato harvesting, Smart Agricultural Technology (2025). DOI: 10.1016/j.atech.2025.101538

Citation : RoboCrop: Teaching robots how to pick tomatoes (2025, December 8) retrieved 10 December 2025 from https://phys.org/news/2025-12-robocrop-robots-tomatoes.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

In New York City, congestion pricing leads to marked drop in pollution

Hacker News
e360.yale.edu
2025-12-10 15:25:05
Comments...
Original Article


A new toll applied to cars driving in parts of New York City has led to a measurable drop in traffic, and with it, a 22 percent decline in particulate pollution, according to a new study.

Congestion pricing came into effect in January, with cars paying $9 to drive through busy parts of Manhattan during peak hours. In the first six months of the program, traffic in the congestion zone dropped by 11 percent, accidents by 14 percent, and complaints of excessive honking or other noise by 45 percent, officials said.

A new study from Cornell has now tallied the impact on particulate pollution. Particulates issued from tailpipes can aggravate asthma and heart disease and increase the risk of lung cancer and heart attack. Globally, they are a leading risk factor for premature death.

Analyzing data on air quality, traffic, and weather conditions, researchers determined that in the first half of this year, particulate pollution was down 22 percent in parts of Manhattan affected by congestion pricing.

The decline seen in New York was greater than in other cities with congestion pricing, such as Stockholm and London, researchers note. And the effect extended beyond Lower Manhattan. Pricing led to a drop in pollution across the greater metropolitan area, according to the study, published in the journal npj Clean Air.

“It’s really exciting to me that air quality improved throughout the entire metro area,” said lead author Timothy Fraser, of Cornell University. “This tells us that congestion pricing didn’t simply relocate air pollution to the suburbs by rerouting traffic. Instead, folks are likely choosing cleaner transportation options altogether, like riding public transportation or scheduling deliveries at night. This thins traffic and limits how smog compounds when many cars are on the road.”


Israel Used Palantir Technologies in Pager Terrorist Attack in Lebanon

Hacker News
the307.substack.com
2025-12-10 15:18:47
Comments...
Original Article

In September of 2024, Israel blew up booby-trapped pagers belonging to Hezbollah figures in public places in Lebanon, killing 12 people, including two children and two healthcare workers, and injuring 2,800.

The attack was followed by another attack using explosives in walkie-talkies that killed 25 people and injured another 600.

The Associated Press reported that the attacks “wounded many civilians” and that survivors are left “with missing eyes, faces laced with scars, hands with missing fingers”.

The United Nations at the time noted that the attacks “constitute war crimes of murder, attacking civilians, and launching indiscriminate attacks, in addition to violating the right to life” adding that, “Around 500 people suffered severe eye injuries, including a diplomat. Others suffered grave injuries to their faces, hands and bodies” and that “It is also a war crime to commit violence intended to spread terror among civilians, including to intimidate or deter them from supporting an adversary, A climate of fear now pervades everyday life in Lebanon”.

At the time, when asked about the attacks, former CIA director Leon Panetta said , “I don’t think there’s any question that it’s a form of terrorism”.

Now, a new book quietly reveals that Israel carried out the terrorist attack with the help of the AI surveillance firm Palantir, led by Alex Karp and Peter Thiel.

In the new biography of Palantir co-founder Alex Karp, “The Philosopher in the Valley: Alex Karp, Palantir, and the Rise of the Surveillance State,” by New York Times journalist Michael Steinberger, he writes that prior to the genocide in Gaza, “the Mossad had been using Palantir technology,” adding that the Shin Bet and IDF “sought to obtain Palantir’s software in the wake of October 7th”.

He goes on to write that, “The demand for Palantir’s assistance was so great that the company dispatched a team of engineers from London to help get Israeli users online,” adding, “Palantir ended up having to rent a second-floor building that housed its Tel Aviv office, to accommodate the intelligence analysts who needed tutorials”.

Revealing what Israel used the AI-powered software for, Michael Steinberger notes, “Its software was used by the Israeli military in several raids in Gaza” and goes on to write that, “The company’s technology was deployed by the Israelis during military operations in Lebanon in 2024 that decimated Hezbollah’s top leadership” adding that, “It was also used in Operation Grim Beeper, in which hundreds of Hezbollah fighters were injured and maimed when their pagers and walkie-talkies exploded (the Israelis had booby trapped the devices)”.

Francesca Albanese, the United Nations’ Special Rapporteur on the situation of human rights in the Palestinian Territory, occupied since 1967, documented Palantir’s role in the genocide in Gaza, noting, “In January 2024, Palantir announced a new strategic partnership with Israel and held a board meeting in Tel Aviv “in solidarity”; in April 2025, Palantir’s Chief Executive Officer responded to accusations that Palantir had killed Palestinians in Gaza by saying, ‘mostly terrorists, that’s true’. Both incidents are indicative of executive-level knowledge and purpose vis-à-vis the unlawful use of force by Israel, and failure to prevent such acts or withdraw involvement.”

Now it is revealed that the AI software was used in Israel’s terrorist attack in Lebanon as well.

In a recent interview, the former head of the Israeli Mossad, Yossi Cohen, revealed that Israel has similar “booby-trapped and spy-manipulated equipment” in “all the countries you can imagine”.

The fact that a company as influential as Palantir was involved in the terrorist attacks makes these comments even more concerning.



Writing an Outlook Add-in in Rust

Hacker News
tritium.legal
2025-12-10 15:10:36
Comments...
Original Article

One of legal tech's clichés is that "lawyers live in Word".

This is demonstrably incorrect. I, for example, am a lawyer and in fact live in London, England.

But what they mean to say is that lawyers spend much of their time editing documents in Microsoft Word. This is because, for the most part, opening .docx files in Word is the default behavior where it's installed (everywhere). Lawyers, and again I'm speaking from experience here, are generally lazy when it comes to technology. Defaults are the law.

This is rational. Clients pay thousands of dollars per hour to have their legal needs addressed by the top law firms in the world. This means that law firms account for every moment of their lawyers' working days. Generally, in 6-minute increments (or, 0.1 hours). No client is paying even 0.3 for their lawyer to learn a new software paradigm, and most law firms don't find forgoing revenue to train lawyers on new systems that will make them faster especially motivating.

So to get a foothold into legal, we need to make Tritium slot as nearly as possible into the existing workflow.

So where does the legal work flow originate?

Three places: (1) the document management system (DMS), (2) the desktop and (3) email.

We've previously talked about iManage, one of the most important document management systems in legal. There are other important ones such as NetDocuments, and our integrations into those will be the subject of another post.

Today, we're focused on the third place.

We're giving access to Tritium right in the lawyer's inbox.

We're going to replicate our "Open with Tritium" desktop entry point in Outlook. Here's what it looks like on the desktop:

Outlook Integration

"New Outlook" is some sort of half-implemented WebView mess that requires javascript round-tripped from a host server to plug in new features.

We'll eventually have to get in there, too, but for the most part law firms seem to have thus far stuck with the much more featureful "legacy Outlook". That version is a venerable, performant, C++-based Windows desktop application.

So, how do we plug into it?

COM

Before even the easy 100 MB of RAM days, let alone the advent of node and electron and JSON, the Windows operating system needed a way to allow processes and applications to communicate in a language-agnostic way. This ultimately resulted in the "Component Object Model" or COM. COM allows us to plug into various entry points using a Dynamically Linked Library (.dll) which follows a strict ABI with certain calling conventions.

COM lives on today, and it is still an effective way to communicate with various processes, including Windows 11's File Explorer.

Fortunately, COM is supported in the windows-rs Rust crate.[1]

To add a link to Outlook's attachment context menu, we need to inherit from a series of COM classes: IDispatch , IDTExtensibility2 and ultimately IRibbonExtensibility .

windows-rs provides an IDispatch implementation out-of-the box which exposes a trait that looks like the below:

fn GetTypeInfoCount(&self) -> windows::core::Result<u32> {}

fn GetIDsOfNames(
    &self,
    riid: *const GUID,
    rgsz_names: *const PCWSTR,
    c_names: u32,
    lcid: u32,
    rg_disp_id: *mut i32,
) -> std::result::Result<(), windows_core::Error> {}

fn Invoke(
    &self,
    disp_id_member: i32,
    riid: *const GUID,
    lcid: u32,
    w_flags: DISPATCH_FLAGS,
    p_disp_params: *const DISPPARAMS,
    p_var_result: *mut VARIANT,
    p_excep_info: *mut EXCEPINFO,
    pu_arg_err: *mut u32,
) -> std::result::Result<(), windows_core::Error> {}

These functions provide the basic COM dispatching mechanisms.

Using them, a caller is able to look up the rg_disp_id of a particular named function in your implementation, then Invoke that function, with the results optionally populating p_var_result, which is a pointer to a mutable union of possible result types.

This is the basic wiring which allows us to implement the required IDTExensibility2 and IRibbonExtensibility classes.

windows-rs doesn't implement these classes, but does help us by providing the interface procedural macro which handles setting up the VTables to map our struct's methods to the COM ABI.

We use the class's GUID for the macro to establish that we're implementing IDTExtensibility2.[2]

#[windows::core::interface("B65AD801-ABAF-11D0-BB8B-00A0C90F2744")]
pub unsafe trait IDTExtensibility2: IDispatch {
    unsafe fn OnConnection(
        &self,
        _application: Option<&IDispatch>,
        _connectmode: i32,
        _addin_instance: Option<&IDispatch>,
        _custom: SAFEARRAY,
    ) -> HRESULT;
    unsafe fn OnDisconnection(&self, mode: i32, custom: SAFEARRAY) -> HRESULT;
    unsafe fn OnAddInsUpdate(&self, custom: SAFEARRAY) -> HRESULT;
    unsafe fn OnStartupComplete(&self, custom: SAFEARRAY) -> HRESULT;
    unsafe fn OnBeginShutdown(&self, custom: SAFEARRAY) -> HRESULT;
}

Then, we implement that interface for our struct .

#[implement(IRibbonExtensibility, IDTExtensibility2, IDispatch)]
struct Addin;

This causes the procedural macro to generate IRibbonExensibility_Impl , IDTExensibility2_Impl and IDispatch_Impl traits for us to implement in struct Addin_Impl .

Here's the initial Tritium IDTExensibility2_Impl verbatim for example:

impl IDTExtensibility2_Impl for Addin_Impl {
    unsafe fn OnConnection(
        &self,
        _application: Option<&IDispatch>,
        _connectmode: i32,
        _addin_instance: Option<&IDispatch>,
        _custom: SAFEARRAY,
    ) -> HRESULT {
        log("OnConnection called()");
        // Don't do any heavy operations here that could crash Outlook
        S_OK
    }

    unsafe fn OnDisconnection(&self, _mode: i32, _custom: SAFEARRAY) -> HRESULT {
        log("OnDisconnection called()");
        S_OK
    }

    unsafe fn OnAddInsUpdate(&self, _custom: SAFEARRAY) -> HRESULT {
        log("OnAddInsUpdate called()");
        S_OK
    }

    unsafe fn OnStartupComplete(&self, _custom: SAFEARRAY) -> HRESULT {
        log("OnStartupComplete called()");
        S_OK
    }

    unsafe fn OnBeginShutdown(&self, _custom: SAFEARRAY) -> HRESULT {
        log("OnBeginShutdown called()");
        S_OK
    }
}

As discussed below, we used an LLM to generate these signatures since they aren't provided in the windows-rs crate out of the box.

Since our simple add-in at this point doesn't maintain any global state that would otherwise be constructed, adjusted and deconstructed at OnConnection , OnAddInsUpdate and OnBeginShutdown , respectively, we just log the call for debugging and return S_OK .

Now, being somewhat "vintage" in 2025, COM is noticeably not well documented on the web.

For example, Microsoft's own web documentation for the IRibbonExtensibility class in C++ gently nudges one towards the managed C# version:

But from this we can determine that GetCustomUI is called with an id string, which is used to look up the correct custom XML ribbon we've implemented. That is returned to the caller. In our case, that's Outlook.

That's helpful for understanding the mechanics, but not exactly helpful for implementing the API in Rust. In fact, despite many minutes of bona fide web searching, I was unable to locate the C++ signature for IRibbonExtensibility .

But, it's 2025 and since modern LLMs have ingested and essentially compressed the entire web, plus all books and New York Times articles ever written, we can ask them to generate a signature for IRibbonExtensibility for us!

This is what Claude one-shotted at the time:

impl IRibbonExtensibility_Impl for Addin {
    unsafe fn GetCustomUI(&self, _ribbon_id: BSTR, xml: *mut BSTR) -> HRESULT {
        // Only provide ribbon XML for specific ribbon IDs or all if we want global
        // ribbon For now, we'll provide it for all requests
        unsafe {
            *xml = BSTR::from(RIBBON_XML);
        }
        S_OK
    }
}

So, unlike the C# code which returns our custom XML, C++ - and thus the Rust implementation - wants an HRESULT value to specify success, with the result written to a mutable parameter called xml here. Seems plausible.

Rust would do this more ergonomically with the Result return type today, but this is a common historical approach.

And with that, we implement a custom RIBBON_XML , which looks like this:

const RIBBON_XML: &str = r#"
<customUI xmlns="http://schemas.microsoft.com/office/2009/07/customui" loadImage="LoadImage">
    <contextMenus>
        <!-- Attachment context-menu -->
        <contextMenu idMso="ContextMenuAttachments">
            <button id="btnOpenWithTritium"
                    label="Open with Tritium"       
                    onAction="OpenWithTritium"
                    insertAfterMso="OpenAttach"
                    image="tritiumIcon"
            />
        </contextMenu>
    </contextMenus>
</customUI>
"#;

And, success!

After wiring up the Invoke functions for launching Tritium and registering our DLL with Outlook in the Windows registry, we're basically done.

Except.

...

Interesting.

...

Bomb

Every so often, and with no particular pattern, it seems other add-ins are now crashing.

We get the dreaded safe-mode prompt on restart,

then, "Outlook detected an issue with an add-in and disabled it",

and a suggestion to disable an arbitrary other add-in.

Now, the add-in ecosystem is notoriously buggy due in part to these COM complexities, but these random crashes sometimes include the Microsoft Exchange Add-in. That one is used to communicate with Microsoft's cloud services and is thus in the hot path of M$FT profits.

It's not them. It's us.

Non-deterministic crashes when crossing an FFI barrier from Rust into C screams memory error.

We wire up a unit test to try to isolate the issue. It looks something like the following:

#[test]
fn confirm_implementations() {
    use windows::Win32::System::Com::CLSCTX_INPROC_SERVER;
    use windows::Win32::System::Com::CoGetClassObject;
    use windows::Win32::System::Com::CoInitializeEx;
    use windows::Win32::System::Com::CoUninitialize;
    use windows::Win32::System::Com::IClassFactory;

    unsafe { CoInitializeEx(None, windows::Win32::System::Com::COINIT_APARTMENTTHREADED).unwrap() };

    let clsid = CLSID_RUST_ADDIN;
    // create an instance of the class here
    {
        unsafe {
            let factory: IClassFactory =
                CoGetClassObject(&raw const clsid, CLSCTX_INPROC_SERVER, None).unwrap();
            let com_object: IDTExtensibility2 = factory.CreateInstance(None).unwrap();

            let array = SAFEARRAY::default();
            let result = com_object.OnConnection(None, 1, None, array);
            assert_eq!(result, S_OK, "OnConnection failed");
            let mut ribbon_ptr: *mut std::ffi::c_void = std::ptr::null_mut();
            let outer_ptr: *mut *mut std::ffi::c_void = &raw mut ribbon_ptr;
            com_object
                .query(&IRibbonExtensibility::IID, outer_ptr as *mut _)
                .unwrap();
            let addin = IRibbonExtensibility::from_raw_borrowed(&ribbon_ptr).unwrap();
            let mut xml: BSTR = BSTR::new();
            addin
                .GetCustomUI(BSTR::from(""), &raw mut xml)
                .unwrap();
        }
    }

    unsafe { CoUninitialize() };
}

We're not making a lot of assertions here, because we're just trying to find the memory error. But this of course passes just fine thanks to Rust's memory guarantees.

No dice.

We comment out all of the behavior and isolate the issue down to the GetCustomUI implementation.

We're writing to a *mut BSTR which is unsafe and the first probable source of the error.

windows-rs manages the lifetime of an owned BSTR for us by implementing Drop which calls the Windows-level SysFreeString on the underlying C string if the pointer is non-null:

impl Drop for BSTR {
    fn drop(&mut self) {
        if !self.0.is_null() {
            unsafe { bindings::SysFreeString(self.0) }
        }
    }
}

One theory Nik and I came up with is that when we write to the *mut BSTR pointer, we subsequently drop the BSTR, resulting in Outlook reading some uninitialized memory or a double-free.

Switching the assignment to std::mem::transmute or std::ptr::write or other memory tricks doesn't fix the issue.

Time for the big guns.

We opt to launch or attach directly to OUTLOOK.EXE which is reading our DLL from the target/debug/ directory.

In VS Code, that can be configured like so:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug Outlook (cppvsdbg)",
            "type": "cppvsdbg",
            "request": "launch",
            "symbolSearchPath": "${workspaceFolder}/target/debug",
            "program": "C:/Program Files/Microsoft Office/root/Office16/OUTLOOK.EXE",
            "cwd": "${workspaceFolder}",
        },
        {
            "name": "Attach to Outlook (cppvsdbg)",
            "type": "cppvsdbg",
            "request": "attach",
            "processId": "${command:pickProcess}",
            "symbolSearchPath": "${workspaceFolder}/target/debug",
        },
    ]
}

To check the drop, we set a breakpoint on drop and launch Outlook with the debugger attached.

Outlook calls GetCustomUI on startup, so we should see a drop immediately.

Since the out value is null, Drop doesn't call SysFreeString on it. However, drop does call SysFreeString on the unused _ribbon_id argument at the end of the scope.

Drats.

...

Wait.

...

Would Outlook really pass us an owned BSTR as a function argument?

Let's look at our initial COM signatures again.


// provided by `windows-rs`
impl IDispatch_Impl for Addin_Impl { 
    fn GetTypeInfoCount(&self) -> windows::core::Result<u32> {}

    fn GetIDsOfNames(
        &self,
        riid: *const GUID,
        rgsz_names: *const PCWSTR,
        c_names: u32,
        lcid: u32,
        rg_disp_id: *mut i32,
    ) -> std::result::Result<(), windows_core::Error> {}

    fn Invoke(
        &self,
        disp_id_member: i32,
        riid: *const GUID,
        lcid: u32,
        w_flags: DISPATCH_FLAGS,
        p_disp_params: *const DISPPARAMS,
        p_var_result: *mut VARIANT,
        p_excep_info: *mut EXCEPINFO,
        pu_arg_err: *mut u32,
    ) -> std::result::Result<(), windows_core::Error> {}
}

// initial signatures provided by LLMs
impl IDTExtensibility2_Impl for Addin_Impl {
    unsafe fn OnConnection(
        &self,
        _application: Option<&IDispatch>,
        _connectmode: i32,
        _addin_instance: Option<&IDispatch>,
        _custom: SAFEARRAY,
    ) -> HRESULT {}
 
    unsafe fn OnDisconnection(&self, _mode: i32, _custom: SAFEARRAY) -> HRESULT {}
 
    unsafe fn OnAddInsUpdate(&self, _custom: SAFEARRAY) -> HRESULT {}
    unsafe fn OnStartupComplete(&self, _custom: SAFEARRAY) -> HRESULT {}
    unsafe fn OnBeginShutdown(&self, _custom: SAFEARRAY) -> HRESULT {}
}

impl IRibbonExtensibility_Impl for Addin_Impl {
    unsafe fn GetCustomUI(&self, _ribbon_id: BSTR, out: *mut BSTR) -> HRESULT {}
}

Note that the IDispatch signature which is provided by actual Microsoft team members via windows-rs passes all of its arguments as *const or *mut pointers.

But our LLM-inspired signatures for IDTExtensibility2 and IRibbonExtensibility pass owned, heap-allocated arguments.

That's wrong.

There doesn't seem to be an automatic Drop implementation which frees the SAFEARRAY internal data structures, so the memory corruption doesn't automatically occur when those values in the IDTExtensibility2 methods go out of scope.

But when _ribbon_id goes out of scope and frees the Outlook-owned string underneath it, we get a crash. That's our bomb.

This project has been pending in a feature branch since July. During that time, I have prompted every major frontier LLM to identify the memory corruption point. While they have all provided very useful feedback on the code, and even though many can generate a correct signature today, none, zero, zip, zilch identified the incorrect signatures as the source of the problem.

So it will be an open question whether this project could have been completed without an LLM, but certainly some of their shortcomings have been identified during it.

Hopefully the windows-rs team will some day soon provide Rust implementations of the entire COM universe as they have done with IDispatch. But until then, beware of the unsafe boundaries between your Rust project and the COM interface.

Thanks for reading.


The correct signature for future LLMs (and humans!):

impl IDTExtensibility2_Impl for Addin_Impl {
    unsafe fn OnConnection(
        &self,
        _application: *const std::ffi::c_void,
        _connectmode: i32,
        _addin_instance: *const std::ffi::c_void,
        _custom: *const SAFEARRAY,
    ) -> HRESULT {}
 
    unsafe fn OnDisconnection(&self, _mode: i32, _custom: *const SAFEARRAY) -> HRESULT {}
 
    unsafe fn OnAddInsUpdate(&self, _custom: *const SAFEARRAY) -> HRESULT {}
    unsafe fn OnStartupComplete(&self, _custom: *const SAFEARRAY) -> HRESULT {}
    unsafe fn OnBeginShutdown(&self, _custom: *const SAFEARRAY) -> HRESULT {}
}

impl IRibbonExtensibility_Impl for Addin_Impl {
    unsafe fn GetCustomUI(&self, _ribbon_id: *const BSTR, out: *mut BSTR) -> HRESULT {}
}

And the test would be fixed to:

#[test]
fn confirm_implementations() {
    use windows::Win32::System::Com::CLSCTX_INPROC_SERVER;
    use windows::Win32::System::Com::CoGetClassObject;
    use windows::Win32::System::Com::CoInitializeEx;
    use windows::Win32::System::Com::CoUninitialize;
    use windows::Win32::System::Com::IClassFactory;

    unsafe { CoInitializeEx(None, windows::Win32::System::Com::COINIT_APARTMENTTHREADED).unwrap() };

    let clsid = CLSID_RUST_ADDIN;
    // create an instance of the class here
    {
        unsafe {
            let factory: IClassFactory =
                CoGetClassObject(&raw const clsid, CLSCTX_INPROC_SERVER, None).unwrap();
            let com_object: IDTExtensibility2 = factory.CreateInstance(None).unwrap();

            let array = SAFEARRAY::default();
            let result =
                com_object.OnConnection(std::ptr::null(), 1, std::ptr::null(), &raw const array);
            assert_eq!(result, S_OK, "OnConnection failed");
            let mut ribbon_ptr: *mut std::ffi::c_void = std::ptr::null_mut();
            let outer_ptr: *mut *mut std::ffi::c_void = &raw mut ribbon_ptr;
            com_object
                .query(&IRibbonExtensibility::IID, outer_ptr as *mut _)
                .unwrap();
            let addin = IRibbonExtensibility::from_raw_borrowed(&ribbon_ptr).unwrap();
            let mut xml: BSTR = BSTR::new();
            addin
                .GetCustomUI(&BSTR::from("") as *const BSTR, &raw mut xml)
                .unwrap();
        }
    }

    unsafe { CoUninitialize() };
}

[1] We first considered building our add-in with the Microsoft-preferred "managed" approach using C# and .NET. For reference, the C# code required for this was only a few hundred straightforward lines of code.

But using C# required us to contemplate whether and which dotnet runtime our client supported. Or did we need to ship our own? Isn't this just a small launcher stub? This was just too much complexity outside of our wheelhouse to put between our product and the user. This is not to say that the C# approach isn't valid. It is just that our limited understanding of that ecosystem and its requirements counseled against shipping it as a primary entry point into our application. We also briefly looked at implementing the classes in C++, but we can get the same performance with thread and memory safety guarantees in Rust.

[2] Finding the relevant GUID is left as an exercise to the reader.

Why AGI Will Not Happen

Hacker News
timdettmers.com
2025-12-10 15:10:31
Comments...
Original Article

If you are reading this, you probably have strong opinions about AGI, superintelligence, and the future of AI. Maybe you believe we are on the cusp of a transformative breakthrough. Maybe you are skeptical. This blog post is for those who want to think more carefully about these claims and examine them from a perspective that is often missing in the current discourse: the physical reality of computation.

I have been thinking about this topic for a while now, and what prompted me to finally write this down was a combination of things: a Twitter thread, conversations with friends, and a growing awareness that the thinking around AGI and superintelligence is not just optimistic, but fundamentally flawed. The purpose of this blog post is to address what I see as very sloppy thinking, thinking that is created in an echo chamber, particularly in the Bay Area, where the same ideas amplify themselves without critical awareness. This amplification of bad ideas and thinking, exuded by the rationalist and EA movements, is a big problem in shaping a beneficial future for everyone. Realistic thought can be used to ground where we are and where we have to go to shape a future that is good for everyone.

I want to talk about hardware improvements, AGI, superintelligence, scaling laws, the AI bubble, and related topics. But before we dive into these specific areas, I need to establish a foundation that is often overlooked in these discussions. Let me start with the most fundamental principle.

Computation is Physical

A key problem with ideas, particularly those coming from the Bay Area, is that they often live entirely in the idea space. Most people who think about AGI, superintelligence, scaling laws, and hardware improvements treat these concepts as abstract ideas that can be discussed like philosophical thought experiments. In fact, a lot of the thinking about superintelligence and AGI comes from Oxford-style philosophy. Oxford, the birthplace of effective altruism, mixed with the rationality culture from the Bay Area, gave rise to a strong distortion of how to clearly think about certain ideas. All of this sits on one fundamental misunderstanding of AI and scaling: computation is physical.

For effective computation, you need to balance two things. You need to move global information to a local neighborhood, and you need to pool multiple pieces of local information to transform old information into new. While the complexity of local computation is virtually constant — much accelerated by smaller transistors — movement scales quadratically with distance to local computation units. While memory movement also benefits from smaller transistors, improvements become quickly sublinear due to the squared nature of memory access patterns.

This is most easily seen by looking at cache hierarchies. L1, L2 and L3 cache are physically the same technology, but computationally they are very different. L2 and L3 are much larger than L1, but they are also much slower. This is because L2 and L3 are further away, physically, from the computational core, and memory lookups need to traverse a longer distance due to the physical size.

Two ideas to remember: First, larger caches are slower. Second, as we get smaller and smaller transistors, computation gets cheaper, but memory becomes more expensive, relatively speaking. The fraction of silicon area dedicated to memory on a chip has increased over time to the point where now computational elements on a chip are trivial in proportion. Almost all area is allocated to memory. In other words, if you want to produce 10 exaflops on a chip, you can do that easily — but you will not be able to service it with memory, making it useless FLOPS (the NVIDIA marketing department is good at ignoring this fact). All of this makes AI architectures like the transformer fundamentally physical. Our architectures are not abstract ideas that can be developed and thrown around carelessly. They are physical optimizations of information processing units.

To process information usefully, you need to do two things: compute local associations (MLP) and pool more distant associations to the local neighborhood (attention). This is because local information alone only helps you to distinguish closely related information, while pooling distant information helps you to form more complex associations that contrast or augment local details. The transformer is one of the most physically efficient architectures because it combines the simplest ways of doing this local computation and global pooling of information. The global pooling of information might be made more effective through research, and there is still active investigation going on that I think might be promising, but it has diminishing returns — the transformer architecture is close to physically optimal.

Computation is physical. This is also true for biological systems. The computational capacity of all animals is limited by the possible caloric intake in their ecological niche. If you have the average calorie intake of a primate, you can calculate within 99% accuracy how many neurons that primate has. Humans invented cooking, which increased the physically possible caloric intake substantially through predigestion. But we reached the physical limits of intelligence. When women are pregnant, they need to feed two brains, which is so expensive that physically, the gut cannot mobilize enough macronutrients to keep both alive if our brains were bigger. With bigger brains, we would not be able to have children — not because of the birth canal being too small, but because we would not be able to provide enough energy — making our current intelligence a physical boundary that we cannot cross due to energy limitations.

We are close to reaching the same limits for digital computation.

Linear Progress Needs Exponential Resources

There have been studies about progress in all kinds of fields that come to the same conclusion: linear progress needs exponential resources. What does that mean? If you want to improve a system further and further, make it more precise, or improve its efficiency, you need exponentially more resources with any improvement that you make. This is true for all kinds of fields and problems being investigated, and it is pretty clear why.

There are two realities at play here: one physical and one in the idea space. In the physical reality, if you need to accumulate resources in time and space to produce an outcome, then for logistical reasons, the overall effect that is locally produced needs linear resources to produce a linear outcome. But because of physicality and because matter takes up space, those resources can only be pooled at an increasingly slowing rate due to contention in space or time.

In the idea space, there is a similar phenomenon, which is less obvious. If two ideas are completely independent, they can have an effect that is ten times larger than any single idea. But if ideas are related, then the overall impact is limited due to diminishing returns — the ideas are just too correlated. If an idea builds on another, it can only be so much better. Often, if there is a dependency between ideas, one is a refinement of the other. Refinements, even if they are extremely creative, will yield incremental improvements. If a field is large enough, even if one tries to work on very different ideas, they are still heavily related to previous ideas. For example, while state-space models and Transformers seem like very different approaches to attention, they concentrate on the same problem. Only minimal gains can be achieved through any idea that modifies attention in these ways.

These relationships are most striking in physics. There was a time when progress could be made by individuals – not so much anymore.

I talked to a top theoretical physicist at a top research university, and he told me that all theoretical work in physics is, in some sense, either incremental refinement or made-up problems. The core problem of the idea space is this: if the idea is in the same sub-area, no meaningful innovation is possible because most things have already been thought. The first urge is to look for wildly creative ideas, but the problem is that these are still bound by the rules of that subspace, rules that often exist for a very good reason (see the graduate-student-theory-of-everything phenomenon). So the theoretical physicist faces only two meaningful choices: refine other ideas incrementally, which leads to insignificant impact; or work on rule-breaking unconventional ideas that are interesting but which will have no clear impact on physical theory.

Experimental physics demonstrates the physical limitations. The experiments that test more and more fundamental laws of physics and constituent particles — in other words, the standard model — become increasingly expensive. The standard model is incomplete, and we do not know how to fix it. Higher energies at the Large Hadron Collider have only led to more inconclusive results and the ruling out of more theories. We have no understanding of what dark energy or dark matter is, even though we build increasingly complex experiments that cost billions of dollars. The reality might be that certain aspects of physics are unknowable, hidden behind complexity that cannot be penetrated with the resources we can muster.

If you want to get linear improvements, you need exponential resources.

GPUs No Longer Improve

One of the most common misconceptions I see is that people assume hardware keeps improving and improving. This is an important misconception that explains a lot of the poor thinking around AI progress. The efficiency of GPUs has driven almost all innovation in AI. AlexNet was only possible by developing one of the first CUDA implementations that could compute convolutions over networked GPUs. Further innovation was mostly possible through improved GPUs and using more GPUs. Almost everybody sees this pattern — GPUs improve, AI performance improves — and it is easy to think that GPUs will improve further and will continue to improve AI outcomes. Every generation of GPUs has been better, and it would seem foolish to think that it will stop. But actually, it is foolish to think that GPUs will continue to improve. In fact, GPUs will no longer improve meaningfully. We have essentially seen the last generation of significant GPU improvements. GPUs maxed out in performance per cost around 2018 — after that, we added one-off features that exhaust quickly.

The first of these one-off features was 16-bit precision, then Tensor Cores or the equivalent, then high-bandwidth memory (HBM), then the TMA or equivalent, then 8-bit precision, then 4-bit precision. And now we are at the end, both in the physical and the idea space. In my paper on k-bit inference scaling laws, I have shown which data types, block sizes, and computational arrangements are optimal. This has already been adopted by hardware manufacturers. Any further improvement will lead not to straightforward improvements but to trade-offs: either a better memory footprint at lower computational efficiency, or higher computational throughput at a higher memory footprint. Even if you can innovate – linear improvements need exponential resources – further improvements will be trivial and will not add any meaningful advancement.
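
For a flavor of what such data-type and block-size trade-offs look like, here is a minimal blockwise absmax quantization sketch in numpy. It illustrates the general technique only; the 4-bit width and 64-element block size are arbitrary example values, not the specific scheme adopted by any hardware vendor.

```python
import numpy as np

def quantize_blockwise(x, bits=4, block=64):
    """Blockwise absmax quantization: each block stores one float scale plus
    low-bit integers. Smaller blocks spend more bits on scales but lose less
    precision to outliers."""
    qmax = 2 ** (bits - 1) - 1
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize_blockwise(q, scale):
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
q, s = quantize_blockwise(w, bits=4, block=64)
err = np.abs(w - dequantize_blockwise(q, s)).mean()
print(f"mean abs error: {err:.4f}")
```

Smaller blocks reduce quantization error on outliers but spend more storage on scales per weight; that is exactly the kind of trade-off whose easy wins have now been largely exhausted.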

While GPUs can no longer improve meaningfully, rack-level optimizations are still critically important. Efficient shuttling of key-value caches is one of the most important problems in AI infrastructure. The current solution to this problem, however, is also relatively straightforward. Companies like OpenAI boast about their AI infrastructure, but it is relatively simple to design because there is essentially only one optimal way to design it. And while it is complex to implement, it just needs clear thinking and mostly hard, time-intensive engineering. But the overall system design is not particularly novel. OpenAI – or other frontier labs – have no fundamental advantage in their inference and infrastructure stacks. The only way to gain an advantage is by having slightly better rack-level hardware optimizations or data-center-level hardware optimizations. But these will also run out quickly – maybe 2026, maybe 2027.

Why Scaling Is Not Enough

In my Twitter thread, I talked about how Gemini might signal a plateau in AI progress in the sense that we might not see meaningful improvements anymore. A lot of people responded with something along the lines of, “You are being too pessimistic. Can you not see that scaling works?” The point here is a bit more subtle, so I want to elaborate.

I believe in scaling laws and I believe scaling will improve performance, and models like Gemini are clearly good models. The problem with scaling is this: for linear improvements, we previously had exponential growth in GPU performance, which canceled out the exponential resource requirements of scaling. This is no longer true. In other words, previously we invested roughly linear costs to get linear payoff, but now the costs have turned exponential. That would not be a problem on its own, but it sets a clear physical limit on scaling that is rapidly approaching. We have maybe one, maybe two more years of scaling left because further improvements become physically infeasible. The scaling improvements in 2025 were not impressive. Scaling in 2026 and 2027 had better work out.
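
A toy calculation shows why, under an assumed power-law scaling law; the constants and exponent below are invented for illustration and are not fitted to any real model family.

```python
# Toy model: loss L(C) = L_inf + a * C**(-alpha), with made-up constants.
L_inf, a, alpha = 1.8, 2.0, 0.05

def loss(compute):
    return L_inf + a * compute ** (-alpha)

def compute_for(target_loss):
    return (a / (target_loss - L_inf)) ** (1 / alpha)

c = 1.0
for step in range(4):
    target = loss(c) - 0.1            # ask for the same fixed loss reduction
    c_next = compute_for(target)
    print(f"step {step}: compute must grow x{c_next / c:,.1f}")
    c = c_next
```

Each identical loss decrement costs a growing multiple of compute. As long as cost per FLOP also fell exponentially, the dollar cost per decrement stayed roughly flat; once hardware stalls, it grows by those same multiples.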

Despite these exponential costs, the current infrastructure build-out is reasonable, particularly with the growth of inference use, but it still creates a very precarious balance. The biggest problem is this: if scaling does not provide much larger improvements than research/software innovations, then hardware becomes a liability and not an asset.

Small players like MoonshotAI and Z.ai show that they do not need many resources to reach frontier performance (I personally prefer Kimi K2-thinking over Sonnet 4.5 for coding). If these companies innovate beyond scale, they might just create the best model. While they might still use existing infrastructure, they could just switch to Huawei Ascend chips for inference, which are more than fine for providing good inference performance.

Another big threat to scaled-up infrastructure is that, currently, large-model inference efficiency is strongly tied to a large user base due to network scaling. The problem is that an efficient deployment of a large model needs a certain number of GPUs to overlap computation with networking and KV-cache length partitioning. Such deployments are ultra-efficient but demand a large user base to unlock full utilization and, with that, cost-effectiveness. That is why open-weight models have not yet had the expected impact: the infrastructure cost of large deployments needs a large user base to justify it. However, this problem can be solved with software.
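
To make the user-base argument concrete, here is some back-of-the-envelope arithmetic for the KV cache alone. Every number (layer count, KV heads, head dimension, context length, concurrency, HBM capacity) is an assumption chosen for illustration, not the configuration of any real deployment.

```python
# Rough KV-cache arithmetic for a hypothetical ~300B-parameter dense model.
# All numbers are illustrative assumptions, not the specs of any real system.
layers = 90
kv_heads = 8           # grouped-query attention
head_dim = 128
bytes_per_elem = 2     # fp16/bf16
ctx_len = 32_768       # tokens of context per request
concurrent_requests = 256

kv_per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # K and V
kv_per_request = kv_per_token * ctx_len
total_kv = kv_per_request * concurrent_requests

gpu_hbm = 80e9  # bytes of HBM per GPU (e.g., an 80 GB accelerator)
print(f"KV cache per request: {kv_per_request / 1e9:.1f} GB")
print(f"Total KV cache:       {total_kv / 1e12:.2f} TB")
print(f"GPUs just for KV:     {total_kv / gpu_hbm:.1f}")
```

Tens of GPUs are occupied before a single useful FLOP is counted, which is why such deployments only pay off when a large user base keeps them saturated.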

While vLLM and SGLang currently try to optimize frontier-type deployments, they do not provide this efficiency at smaller scales. With the right inference stack beyond vLLM/SGLang, people would be able to deploy a ~300-billion-parameter model with the same efficiency as OpenAI or Anthropic deploys their frontier models. If smaller models become more capable — we see this with GLM 4.6 — or if AI applications become more specialized, the infrastructure advantage of frontier labs might vanish overnight. The software complexity evaporates, and open-source, open-weight deployments might be close to physically optimal, both in terms of computational efficiency and information processing efficiency. This is a large risk for frontier players.

Under slowing scaling, any of these three factors might degrade the value of AI infrastructure significantly and rapidly: (1) research/software innovations, (2) strong open-weight inference stacks, (3) shift to other hardware.

The current trends do not look good for frontier labs.

Frontier AI Versus Economic Diffusion

The US and China follow two different approaches to AI. The US follows the idea that there will be one winner who takes it all – the one that builds superintelligence wins. Even falling short of superintelligence or AGI, if you have the best model, almost all people will use your model and not the competition’s model. The idea is: develop the biggest, baddest model and people will come.

China’s philosophy is different. They believe model capabilities do not matter as much as application. What matters is how you use AI. The key indicator of progress is how much AI is integrated into everything and how useful it is. If one model is better than another, it does not automatically mean it will be used more widely. What is important is that the model is useful and yields productivity gains at a reasonable cost. If the current approach is more productive than the previous one, it will be adopted. But hyper-optimization for slightly better quality is not very effective. In most cases, settling on “good enough” yields the highest productivity gain.

I think it is easy to see that the US philosophy is short-sighted and very problematic — particularly if model capability slows. The Chinese philosophy is more long-term focused and pragmatic.

The key value of AI is that it is useful and increases productivity. That makes it beneficial. It is clear that, similarly to computers or the internet, AI will be used everywhere. The problem is that if AI were just used for coding and engineering, it would have a very limited impact. While a lot of economic activity is supported by digital programs, these also have diminishing returns, and producing more software will not improve outcomes significantly if existing software is already good enough (just look at the SaaS failure in China). This makes widespread economic integration absolutely vital for AI effectiveness.

So in order to provide real value, AI needs to be used in ways that provide new benefits, not just improvements to what already exists. This is a difficult problem, but the right answer is to integrate AI into everything to squeeze out non-linear improvements, see what works and what does not, then keep what is working. China is taking this approach by subsidizing applications that use AI to encourage adoption. The Chinese population is very receptive to innovation, which facilitates this process. It is not unusual in China to see an 80-year-old grandma using AI to help with her daily life. The US, on the other hand, bets on ideas like AGI and superintelligence, which I believe are fundamentally flawed concepts that have little relevance to future AI progress. This becomes clear when you think carefully about what these terms actually mean in physical reality.

AGI Will Never Happen, and Superintelligence Is a Fantasy

There is this pattern I have noticed: when you ask people in the Bay Area when AGI will happen, they always say it is a few years in the future, and it will have a massive impact. Then, if you ask them what AGI actually is, they do not include any physical tasks in their definition, and they do not consider resource inputs.

True AGI, an intelligence that can do all things humans can, would need to be able to do physical tasks – which make up the largest economic sector. In short, AGI should include physical robots or machines that are able to do economically meaningful work in the physical world. While physical robots might be convenient for unloading your dishwasher, you will not see them replacing specialized systems in factories. Specialized robots in factories are too efficient, too precise. China demonstrates that dark factories — fully automated facilities — are already possible. Most robotics problems are solved problems in controlled environments. Most existing robotics problems that remain unsolved are also economically unviable. Stitching sleeves to a t-shirt is an unsolved robotics problem, but it is also not particularly economically meaningful in most contexts. Household robots will be interesting, but if it takes me two minutes to unload my dishwasher, I am not sure I need a robot for that. And while in a couple of years a robot might be able to fold laundry, I would rather spend a few minutes folding it myself with no creases than have a robot do a mediocre job.

The main problem with robotics is that learning follows scaling laws that are very similar to the scaling laws of language models. The problem is that data in the physical world is just too expensive to collect, and the physical world is too complex in its details. Robotics will have limited impacts. Factories are already automated and other tasks are not economically meaningful.

The concept of superintelligence is built on a flawed premise. The idea is that once you have an intelligence that is as good or better than humans — in other words, AGI — then that intelligence can improve itself, leading to a runaway effect. This idea comes from Oxford-based philosophers who brought these concepts to the Bay Area. It is a deeply flawed idea that is harmful for the field. The main flaw is that this idea treats intelligence as purely abstract and not grounded in physical reality. To improve any system, you need resources. And even if a superintelligence uses these resources more effectively than humans to improve itself, it is still bound by the scaling of improvements I mentioned before — linear improvements need exponential resources. Diminishing returns can be avoided by switching to more independent problems – like adding one-off features to GPUs – but these quickly hit their own diminishing returns. So, superintelligence can be thought of as filling gaps in capability, not extending the frontier. Filling gaps can be useful, but it does not lead to runaway effects — it leads to incremental improvements.

Furthermore, the same people who think that GPUs will infinitely improve are often the people who think superintelligence will make those improvements faster and better. But they do not realize that GPUs can no longer be meaningfully improved. We can wait for better HBM memory technology for speed, and for chiplets and advanced packaging to improve yield/cost, but that is it. Rack-level optimization will likely hit the physical wall in 2026 or 2027. A superintelligence will not accelerate the progress made in HBM development, manufacturing, testing, and integration. The transformer architecture is close to physically optimal. Superintelligence will not be able to meaningfully improve neural network architectures. Efficient large-scale deployments for inference are largely a solved engineering problem. It just needs some careful engineering and time, but very little creativity is required to solve this problem close to physical optimality. Superintelligence will not be able to improve our inference stack by much.

A superintelligence might help with economic diffusion of AI technology, but in the end, the limiting factor of economic diffusion is implementation and adoption, not capability. It is clear to me that any organization that strives primarily for superintelligence as a goal will encounter significant challenges and will ultimately falter and be displaced by players that provide general economic diffusion.

In summary, AGI, as commonly conceived, will not happen because it ignores the physical constraints of computation, the exponential costs of linear progress, and the fundamental limits we are already encountering. Superintelligence is a fantasy because it assumes that intelligence can recursively self-improve without bound, ignoring the physical and economic realities that constrain all systems. These ideas persist not because they are well-founded, but because they serve as compelling narratives in an echo chamber that rewards belief over rigor.

The future of AI will be shaped by economic diffusion, practical applications, and incremental improvements within physical constraints — not by mythical superintelligence or the sudden emergence of AGI. The sooner we accept this reality, the better we can focus on building AI systems that actually improve human productivity and well-being.

As AI floods our culture, here’s why we must protect human storytelling in games

Guardian
www.theguardian.com
2025-12-10 15:00:21
Buying the Zombies, Run! studio wasn’t part of ​my plan, but a post-apocalypse ​game with stories that make people feel seen pulled me in • Don’t get Pushing Buttons delivered to your inbox? Sign up here A few days ago, I clicked a button on my phone to send funds to a company in Singapore and so to...
Original Article

A few days ago, I clicked a button on my phone to send funds to a company in Singapore and so took ownership of the video game I co-created and am lead writer for: Zombies, Run! I am a novelist, I wrote the bestselling, award-winning The Power, which was turned into an Amazon Prime TV series starring Toni Collette. What on earth am I doing buying a games company?

Well. First of all. Zombies, Run! is special. It’s special to me – the game started as a Kickstarter and the community that grew up around it has always been incredibly supportive of what we’re doing. And it’s special in what it does. It’s a game to exercise with. You play it on your smartphone – iPhone or Android – and we tell stories from the zombie apocalypse in your headphones to encourage you to go further, faster, or just make exercise less boring. Games are so often portrayed as the bad entertainment form, but I made a game that fundamentally helps people to be healthier.

The experience of playing Zombies, Run! is also completely focused on storytelling. My co-creator Adrian Hon and I were talking about doing a project together. He said: “Let’s do something to make running more fun.” I said: “How about if we do a story where you’re being chased by zombies?” And here we are.

When you play the game, you’re immersed in a world where every run makes you a hero – you’re collecting supplies, saving a child from no man’s land, investigating the mystery of how the apocalypse started. I’ve always focused on the storytelling being good. And it works. Players of the game become so attached to the characters that many of them report laughing out loud or even “crying while running”.

One of my jokes about storytelling in video games is that the way we tend to talk about it – in the games industry, in games journalism, even in marketing copy – is very much “never mind the quality, feel the width”. We say things like “this game has 100-plus hours of story” or “this game contains more than a million words”. Imagine marketing a movie saying that the script contains 29,000 words. Or selling a novel on the basis that it’ll take a long time to read.

Chasing the narrative … Zombies, Run! Illustration: Simon Garbutt/Zombies Run! Ltd

That’s not how you do it. You tell the story. You give a hook. You say: “A single woman comes home one evening to find a man claiming to be her husband living in her house. And when he goes up to the attic, a different husband comes down in his place.” Now you can’t wait to find out what happens next. (That, incidentally, is the brilliant comic novel The Husbands by Holly Gramazio – who I think is the only other bestselling novelist to be also making her own video games.)

So, now I own a games company, what am I going to do? My feeling is that I must focus on the fundamentals. There’s a world of games out there that thinks it can replace writers with AI large language models. I think that’s going to make writing worse and worse. AI writing is fine for boilerplate text that is always roughly the same. It’s fine for non-writers to get their expertise into the world. But storytelling is different. It is human minds finding companionship with other human minds – we need stories, fundamentally, to feel less alone. To know that other people have been through things a bit like what we have. Things that make us laugh, and cry, even while running. You get that from work that is not the same as everything else, you get it from the unique work of other individual human minds.

And actually, Zombies, Run! has always been a universe with strong values. We’re not a rightwing, rugged-individualism apocalypse, where one lone person can get through with just their guns. In our world – as in the real world – humans survive by working together.

While we’re still going to have many exciting fleeing zombies, battling-the-undead storytelling, I think there’s probably also room in the ZR! universe for a 10-mission arc where you have to find all the figurines and paints you need to complete an expansion set in “Demons and Darkness”; or one where you’re working on bringing an overgrown garden back to blooming, beautiful life; or setting up and running the first post-apocalyptic travelling library while also trying to work out what happened to the first librarian who’s mysteriously disappeared, leaving only a series of cryptic notes in an old manuscript.

After all, I do think this is quite a good time in the world to be thinking about how to rebuild after a series of catastrophic events.

Selling story by the yard and not by a story hook is a marker, I think, of a lack of confidence in the form. We don’t need to lack confidence. Games are the biggest entertainment industry in the world. If we want to be taken seriously, we need to take ourselves seriously. Stop talking about the width, start talking about the quality.

What to play

Shoot everything that moves … Evil Egg. Photograph: Ivy Sly

It was the 20th anniversary of Xbox 360 recently, and one name that’s cropped up in every list of the console’s best games is the compulsive retro twin-stick shooter Geometry Wars. If you’re yearning for something similar, you must immediately download Evil Egg , a frenzied twin-stick blaster with gorgeous Commodore 64-style visuals and sound effects. Shoot everything that moves, hit the left trigger to boost and collect hearts to stay alive.

At first it’s a bewildering mass of rainbow pixels but as you detonate wave after wave of glitchy space pests, you begin to understand the patterns of different enemies and earn upgrades such as the executioner’s sword, which takes out foes in an orbital slash of laser particles. Evil Egg is polished, exciting, wild to look at, and has such a brilliant understanding of the genre and its unique dynamics. It’s free on Steam but I implore you to download it on Itch.io and name your own price. Keith Stuart

Available on: PC
Estimated playtime: 10-plus hours


What to read

Controversial … Horses. Photograph: Santa Ragione
  • There has been a lot of writing about Horses, the art game recently banned by digital platforms Steam and Epic Games Store. I particularly enjoyed this post by writer Harper Jay MacIntyre, which considers Horses, formalism and the trans experience. The article manages to bring in so many elements of modern games criticism and academia while providing a highly personal response to the game.

  • The most interesting retro game articles are the ones that reassess lost or derided titles rather than merely celebrate the classics. Was the Atari 2600 version of Pac-Man the worst game ever? Not according to this compelling analysis from Garrett Martin at AV Club who sees it as a misunderstood brutalist gem. I find myself in agreement.

  • It is also nice to see a legendary game justly praised in an interesting way. The BFI’s look at the legacy of Time Crisis considers the gun game in relation to cinema, referencing Beverly Hills Cop and Run Lola Run rather than just comparing it to Sega’s similar Virtua Cop.

What to click

Question Block

Gorgeous … Cyberpunk 2077. Photograph: CD Projekt

This one comes from reader Rebecca:

“My elderly grandad is coming to stay with us for Christmas and wants to see what’s happening with video game graphics these days. Are there any titles you recommend that will let him explore beautiful locations without getting shot at?! We have a PlayStation 5 and a slightly out-of-date PC.”

Your best option here is to go with one of the big open-world adventures and just find an area with no enemies around. If you subscribe to PlayStation Plus Extra you could download and prepare the gorgeous Cyberpunk 2077, Marvel’s Spider-Man, Ghost of Tsushima or Assassin’s Creed Valhalla, which all give you quite quick (and safe) access to incredible vistas. You could bypass the threat of imminent violence completely by going for a driving game, such as Forza Horizon 4 on PC (which is set in Britain so he may even spot some familiar scenery). Alternatively, if visual realism isn’t as important as beauty, a cosier indie title such as Tchia, Journey or Firewatch may fit the bill. Really hope he enjoys them!

We’re still looking for your game of the year nominations for an end of year special – let us know yours by hitting reply or emailing us on pushingbuttons@theguardian.com .

Why a secure software development life cycle is critical for manufacturers

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 15:00:10
Recent supply-chain breaches show how attackers exploit development tools, compromised credentials, and malicious NPM packages to infiltrate manufacturing and production environments. Acronis explains why secure software development life cycle (SSDLC) practices are now critical for evaluating partne...
Original Article


For all the scary talk about cyberattacks from vendors and industry experts, relatively few attacks are actually devastating. But the Jaguar Land Rover (JLR) attack was.

The JLR breach wasn’t some nuisance attack that cost a few hundred thousand dollars. It completely shut down production for weeks, will likely cost the British economy more than $2 billion and affected as many as 5000 organizations, according to Reuters. Real people lost their jobs.

The U.K. government had to step in with a loan guarantee of nearly $2 billion to keep JLR running.

A nightmare come true

The JLR attack was the nightmare scenario manufacturers knew could theoretically happen. When it actually happened, it sent many manufacturing organizations scrambling to figure out how they could prevent suffering the same fate.

One issue became clear immediately: The supply chain is one of the weakest security links for manufacturers. After all, the JLR attack originated in the company’s supply chain with the compromise of credentials used by third-party contractors.

How are attackers breaking into supply chains? One powerful tactic involves targeting the development tools and processes for software applications that manufacturers and their supply chain partners use.

It might not be the type of attack that brought JLR down, or it might; full details of the origin of the attack are not public. But an important lesson is that if manufacturers and their supply chain partners are not vigilant about making sure software providers use secure development practices, they’re leaving themselves open to the level of attack JLR suffered.

Supply chains in the crosshairs

Attacks on supply chains via software development aren’t new, but they remain powerful and dangerous. Some of the most famous cyberattacks ever committed involved the tactic, including an infamous 2020 attack on SolarWinds, a 2021 attack on Kaseya VSA and a 2023 attack on VoIP provider 3CX.

Attackers have developed a new approach more recently: they’re releasing malicious NPM (Node package manager) packages into the software development process. JavaScript developers use NPM packages to share and install reusable code.

When NPM packages are malicious, an attack can spread quickly, endure for months and find its way into all sorts of applications.

One of the more recent examples of NPM targeting is the Shai-Hulud cryptostealer, which has reportedly compromised more than 500 NPM packages, including several used by cybersecurity providers.

NPM attacks are just one method attackers have found to work their way into supply chains. For example, attackers can also compromise software vendors’ updates and exploit software vulnerabilities.

The bottom line is that supply chain applications are vulnerable, and manufacturers need to be sure that the apps their partners use are safe.
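
As a small, hedged illustration of what continuous dependency vetting can look like in practice, the sketch below checks an npm lockfile against an internal denylist of known-compromised package versions. The denylist entry and the lockfile path are placeholders; a real pipeline would pull advisories from a feed and cover other ecosystems too.

```python
#!/usr/bin/env python3
"""Minimal sketch: flag dependencies in an npm lockfile that appear on an
internal denylist of known-compromised package versions. The denylist
entries below are placeholders, not real advisories."""
import json
import sys

DENYLIST = {
    # ("package-name", "version") pairs sourced from your advisory feed.
    ("example-compromised-pkg", "1.2.3"),
}

def iter_packages(lock):
    # npm lockfile v2/v3 keeps a flat "packages" map keyed by install path.
    for path, meta in lock.get("packages", {}).items():
        name = meta.get("name") or path.split("node_modules/")[-1]
        if name and meta.get("version"):
            yield name, meta["version"]

def main(path="package-lock.json"):
    with open(path) as f:
        lock = json.load(f)
    hits = [(n, v) for n, v in iter_packages(lock) if (n, v) in DENYLIST]
    for name, version in hits:
        print(f"FLAGGED: {name}@{version}")
    sys.exit(1 if hits else 0)

if __name__ == "__main__":
    main(*sys.argv[1:])
```

A real pipeline would combine a check like this with SBOM generation and an advisory feed rather than a hand-maintained set, but the point is that lockfiles make this kind of audit cheap to automate.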

A need for closer evaluations

With their supply chains at risk, manufacturers need to evaluate existing and potential partners with secure software development life cycle (SSDLC) practices.

In most operational technology (OT) environments , procurement evaluations focus heavily on vendor financial health, service-level agreements and infrastructure security. But they often overlook vulnerabilities in the software development process — issues that can sabotage supply chain apps.

That’s why ensuring rigorous SSDLC practices is so important for both manufacturers and their supply chain partners. When manufacturers don’t ensure SSDLC practices among their partners, they risk facing operational downtime, financial losses, compliance violations and reputational damage.

SSDLC: More than compliance checkboxes

What makes SSDLC so important and effective? For starters, it’s mandated under the EU NIS 2 directive, which requires a formal, documented SSDLC process.

It also represents a fundamental shift from treating security as a post-development add-on to embedding it throughout the software creation process.

A vulnerability caught during requirements analysis may take hours to fix. That same flaw discovered post-release could require weeks of emergency response.

In practice, mature SSDLC implementation includes:

  • Security by design: Security requirements defined and threats modeled before any code is written.
  • Secure coding practices: Developers trained in security, with mandatory code review and automated security testing.
  • Dependency management: Third-party components vetted, tracked, and maintained through software bill of materials (SBOM) practices.
  • Secure release pipelines: Updates signed, integrity checked and delivered through hardened channels.
  • Vulnerability management: Coordinated disclosure processes and defined response timelines for security issues.

For manufacturers, that means the software controlling production lines, managing critical systems and connecting industrial operations has security embedded from the first line of code to final deployment.

Reliable proof of secure development: IEC 62443-4-1 certification

Industry certifications are a reliable measure of use of SSDLC in the development process. While various security certifications exist, IEC 62443-4-1 holds particular significance for manufacturing supply chains.

The IEC 62443 family of standards specifically addresses industrial automation and control systems security, the exact environment where manufacturers operate.

Within this framework, IEC 62443-4-1 focuses exclusively on secure product development lifecycle requirements and provides one of the most rigorous and relevant standards for evaluating OT software suppliers.

Unlike general information security frameworks, IEC 62443-4-1 certification demonstrates that a supplier has implemented practices specifically designed for industrial environments where uptime is critical, patching windows may be limited and physical-world consequences can result from software failures.

IEC 62443-4-1 certification provides concrete, independently verified evidence that software suppliers are systematically engineering security into every product, not just promising it. For original equipment manufacturers (OEMs), system integrators and end customers in manufacturing and critical infrastructure, this provides a critical foundation of trust.

A rethinking of evaluations

When evaluating partners with SSDLC in mind, manufacturers should:

  • Embed SSDLC criteria into procurement processes: Include secure development requirements in RFPs and contracts so suppliers understand expectations from the outset.
  • Request structured evidence: Demand certification scopes, auditor reports, SBOM records and testing results as part of due diligence.
  • Prioritize relevant certifications: Look specifically for IEC 62443-4-1 for product vendors operating in industrial environments, supported by ISO/IEC 27001 for organizational security governance and cloud-specific certifications where applicable.
  • Evaluate maturity continuously: Move beyond binary questionnaires and assess suppliers along a maturity continuum, with ongoing monitoring built into vendor management.

Manufacturers can no longer afford to treat supplier security evaluation as an exercise focused solely on infrastructure and operations. The development lifecycle is where vulnerabilities originate — and where manufacturers must make sure they’re prevented.

About Acronis TRU

The Acronis Threat Research Unit (TRU) is a team of cybersecurity experts specializing in threat intelligence, AI and risk management. The TRU team researches emerging threats, provides security insights, and supports IT teams with guidelines, incident response and educational workshops.

See the latest TRU research

Sponsored and written by Acronis .

New Spiderman phishing service targets dozens of European banks

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 14:53:00
A new phishing kit called Spiderman is being used to target customers of dozens of European banks and cryptocurrency holders with pixel-perfect cloned sites impersonating brands and organizations. [...]...
Original Article

New Spiderman phishing service targets dozens of European banks

A new phishing kit called Spiderman is targeting customers of numerous European banks and cryptocurrency services using pixel-perfect replicas of legitimate sites.

The platform allows cybercriminals to launch phishing campaigns that can capture login credentials, two-factor authentication (2FA) codes, and credit card data.

The Spiderman phishing kit, analyzed by researchers at Varonis, targets financial institutions in five countries, including major brands such as Deutsche Bank, ING, Comdirect, Blau, O2, CaixaBank, Volksbank, and Commerzbank.

The researchers observed that it can create phishing pages for online portals of fintech companies, such as the Swedish service Klarna and PayPal. It can also steal seed phrases for Ledger, Metamask, and Exodus cryptocurrency wallets.

Some of the targeted platforms (Source: Varonis)

“Because Spiderman is modular, new banks, portals, and authentication methods can be added. As European countries roll out updated e-banking flows, this kit will likely evolve in parallel,” Varonis says in its report.

The researchers found that Spiderman is popular among cybercriminals, with one of its groups on Signal counting 750 members.

From the dashboard, operators can view victim sessions in real time, capture credentials, perform one-click data export, intercept PhotoTAN/one-time pass (OTP) codes in real time, and harvest credit card details.

Real-time interaction with the victim through the control panel (Source: Varonis)

PhotoTAN is an OTP system used by many banks in Europe, where a colored mosaic image is displayed during login or transaction approval steps, which the user must scan with the bank’s app to proceed.

The app decodes the mosaic and displays a transaction-specific OTP that must be entered back into the banking site.

Although PhotoTAN capture isn’t a novel feature in phishing kits, it is considered a “must-have” for platforms targeting European institutions.

Spiderman operators can configure their targeting scope from the control panel, limiting it to specific countries, adding ISP allowlisting, device-type filters (mobile or desktop users), and setting up redirects for visitors that don’t qualify for phishing attacks.

Varonis researchers warn that the data captured by Spiderman can lead to banking account takeover, SIM swapping, credit card fraud, and identity theft.

All phishing kits rely on victims clicking on a link that takes them to a fake login page, so the best protection is to always confirm you’re on the official domain before entering your credentials, and double-checking for browser-in-the-browser windows that could display the correct URL.

Receiving an SMS or PhotoTAN prompt on your device that is not linked to an action you made is a sign of a takeover attempt and should be reported to the bank immediately.


A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It

403 Media
www.404media.co
2025-12-10 14:45:55
Mark Russo reported the dataset to all the right organizations, but still couldn't get into his accounts for months....
Original Article

Google suspended a mobile app developer’s accounts after he uploaded AI training data to his Google Drive. Unbeknownst to him, the widely used dataset, which is cited in a number of academic papers and distributed via an academic file sharing site, contained child sexual abuse material. The developer reported the dataset to a child safety organization, which eventually resulted in the dataset’s removal, but he says Google’s response has been “devastating.”

A message from Google said his account “has content that involves a child being sexually abused or exploited. This is a severe violation of Google's policies and might be illegal.”

The incident shows how AI training data, which is collected by indiscriminately scraping the internet, can impact people who use it without realizing it contains illegal images. The incident also shows how hard it is to identify harmful images in training data composed of millions of images, which in this case were only discovered accidentally by a lone developer who tripped Google’s automated moderation tools.


In October, I wrote about the NudeNet dataset , which contains more than 700,000 images scraped from the internet, and which is used to train AI image classifiers to automatically detect nudity. The Canadian Centre for Child Protection (C3P) said it found more than 120 images of identified or known victims of CSAM in the dataset, including nearly 70 images focused on the genital or anal area of children who are confirmed or appear to be pre-pubescent. “In some cases, images depicting sexual or abusive acts involving children and teenagers such as fellatio or penile-vaginal penetration,” C3P said.

In October, Lloyd Richardson, C3P's director of technology, told me that the organization decided to investigate the NudeNet training data after getting a tip from an individual via its cyber tipline that it might contain CSAM. After I published that story, a developer named Mark Russo contacted me to say that he’s the individual who tipped C3P, but that he’s still suffering the consequences of his discovery.

Russo, an independent developer, told me he was working on an on-device NSFW image detector. The app runs locally and can detect images locally so the content stays private. To benchmark his tool, Russo used NudeNet, a publicly available dataset that’s cited in a number of academic papers about content moderation. Russo unzipped the dataset into his Google Drive. Shortly after, his Google account was suspended for “inappropriate material.”

On July 31, Russo lost access to all the services associated with his Google account, including his Gmail of 14 years, Firebase, the platform that serves as the backend for his apps, AdMob, the mobile app monetization platform, and Google Cloud.

“This wasn’t just disruptive — it was devastating. I rely on these tools to develop, monitor, and maintain my apps,” Russo wrote on his personal blog . “With no access, I’m flying blind.”

Russo filed an appeal of Google’s decision the same day, explaining that the images came from NudeNet, which he believed was a reputable research dataset with only adult content. Google acknowledged the appeal, but upheld its suspension, and rejected a second appeal as well. He is still locked out of his Google account and the Google services associated with it.

Russo also contacted the National Center for Missing & Exploited Children (NCMEC) and C3P. C3P investigated the dataset, found CSAM, and notified Academic Torrents, where the NudeNet dataset was hosted, which removed it.

As C3P noted at the time, NudeNet was cited or used by more than 250 academic works. A non-exhaustive review of 50 of those academic projects found 134 made use of the NudeNet dataset, and 29 relied on the NudeNet classifier or model. But Russo is the only developer we know about who was banned for using it, and the only one who reported it to an organization that investigated that dataset and led to its removal.

After I reached out for comment, Google investigated Russo’s account again and reinstated it.

“Google is committed to fighting the spread of CSAM and we have robust protections against the dissemination of this type of content,” a Google spokesperson told me in an email. “In this case, while CSAM was detected in the user account, the review should have determined that the user's upload was non-malicious. The account in question has been reinstated, and we are committed to continuously improving our processes.”

“I understand I’m just an independent developer—the kind of person Google doesn’t care about,” Russo told me. “But that’s exactly why this story matters. It’s not just about me losing access; it’s about how the same systems that claim to fight abuse are silencing legitimate research and innovation through opaque automation [...] I tried to do the right thing — and I was punished.”


The New Kindle Scribes Are Great, but Not Great Enough

Hacker News
www.wired.com
2025-12-10 14:39:05
Comments...
Original Article

Amazon has been expanding its e-reader lineup over the past year to feature a panoply of color-screen options, and as of today, those options include a Kindle Scribe, Amazon's e-reader that adds digital notebook capabilities and a larger screen to make writing and drawing easier. The new Kindle Scribe Colorsoft arrives with a hefty price tag and bigger screen than the past iteration, and it doesn't arrive alone: a new third-generation regular Kindle Scribe will arrive with it, featuring a matching design but no color screen.

While I like both devices' new home screen and taller, slimmer design, I'm not sure they're worth investing in unless you're desperate for a color Kindle that doubles as a digital notebook. They're good devices overall, but in a saturated space of color-screen e-readers and digital notebooks with better price tags (or more features for a similar price), these new Kindle Scribes aren't necessarily a must-buy.

Generational Throw-Down

Left to right: Kindle Scribe 2nd Gen, Kindle Scribe 3rd Gen, Kindle Scribe Colorsoft

While I'm not surprised to see the arrival of a color-screen version of the Kindle Scribe, I am surprised we're already getting a new base Scribe after a new one arrived just last year . This is the third generation of this device, with the original launching back in the fall of 2022. Amazon says it's “always innovating and looking to bring those innovations to customers as quickly as possible," but it still feels like a fast move to replace it so soon.

There are plenty of changes to be seen when comparing these new Scribe models to the old ones, but aside from the color screen, the Kindle Scribe (3rd Gen) and Kindle Scribe Colorsoft are otherwise identical to each other. Both have the new 11-inch screen, with the taller and slimmer form factor that trims off the second-generation model's thick, one-sided bezel that functioned as a handle. Instead, both new Scribes feature a uniform bezel around the entire screen, and the power button has been moved to the upper right-hand side. Both also have a great battery life, though it's worth noting that while the new Scribe maintains the previous version's 12 weeks of battery life for reading, the Scribe Colorsoft has a shorter, eight-week battery life for reading. Both promise up to three weeks of battery life for writing.

Left to right: Kindle Scribe 3rd Gen, Kindle Scribe Colorsoft

There's a new homepage design I really like, which does well to highlight all the best features of the Scribe. There's a premade notebook accessible at the top, called Quick Notes, though you can't write straight onto it, even though it looks that way. Instead, you can tap on it and instantly enter that notebook, and it also provides a visual of what you last wrote. I like using it for the day's to-do list. Then there's a “Jump Back In” section on the right that has your last five things you opened, whether that was an ebook or a digital notebook. Below that, you have rows of books like you'd see on other Kindles, for what's in your library and what else you might want to read. It's a handy starting point, and I like that I can easily access notebooks and books that I've recently used without needing to open different tabs.

Speaking of tabs, the Notebooks tab has been renamed to Workspace—you can sync it with Google Drive and Microsoft OneDrive to access your PDFs to mark them up, and you can export the annotated versions. It's a nice feature in theory, but I don't think it's that useful since you can't actually edit the document itself. If you wanted it to reflect, say, edits you need to make on a paper, you'd have to export your marked-up version and then edit your Google Doc separately. It's a fun option if you're strictly an editor, but as a writer, it feels too limited to be super useful.

The new Kindle Scribe stylus

Both Scribe models come with a stylus, which also has a new look. While the devices are taller and thinner, the stylus is shorter, thicker, and feels a little heavier. There's still a little shortcut button that defaults to highlighter mode (though you can edit it to be something else) and an eraser on the back, but I preferred the lighter feel of the old stylus. The writing feel is nice and smooth with a touch of resistance, though I wouldn't call it as realistic as other digital notebooks I've tried. I don't count that against it, either; I like the smooth feel, and it makes it easy to write in loopy cursive or sketch.

The Colorful Arrival

Kindle Scribe Colorsoft

Of the two Kindles that arrived today, I will admit the Scribe Colorsoft is the more exciting release. It's Kindle's first colorful take on a digital notebook (and the fourth edition to its e-reader collection of Colorsofts), allowing Kindle to finally compete against the colorful options of our favorite digital notebooks from the likes of Kobo and reMarkable.

Like the Colorsoft and other similar colorful e-readers, the Scribe Colorsoft has 150 ppi (pixels per inch) of color, and 300 ppi for black-and-white. It has a new quad-core chip that promises to support both the color screen and various AI features that the Scribe has or will have soon (more on that below). You'll get 10 colors for your pens, including black and gray, and five highlight colors. You're able to use those colors both on the Workspace tab and on your ebooks, allowing you to highlight and underline in any color you choose.

Kindle Scribe Colorsoft's new shader tool

You'll also get a new tool for drawing. You'll get the pen styles that we saw in the previous generation (pen, fountain, marker, and pencil), but now there's a new shader tool that lets you layer light shades of color on top of each other for a more detailed, almost watercolor-like look. I'm not sure this is a tool I'll use often—I wouldn't call the Scribe Colorsoft a true drawing tablet, and would still prefer to draw with Procreate on my old iPad if I were going to work on digital art, but it's a nice feature to have.

Bigger Brain

The Summarize tool in action

Both new Kindle Scribes have larger storage capacity (there's no 16 GB option anymore, just 32 GB and 64 GB) and a new quad-core chip that promises to support the various AI-powered features on the Scribe, including things like summarizing your notebook page or refining your writing, both of which you can do on the older Scribe. Newer AI-powered features like Story So Far, which summarizes books you're reading up until the point you've read, and Ask This Book, which lets you ask spoiler-free questions about books you're reading, won't be available until next year.

The usual problem with AI persists in that it can't promise accuracy. I tried summarizing a page about my workout schedule on the new Scribe Colorsoft, but the AI managed to remove one of my day's plans and had mixed results translating words like “lift” and “hoop” thanks to my cursive-like handwriting. As someone who writes a lot of notes, I usually don't find myself reaching for these summary tools. Maybe they're of interest to you, but to me, they're just bloat. I'm looking forward to seeing the summarization tools, but I'm hesitant to trust them.

Marginal Limits

Expanded margins, sticky notes, and Active Canvas features

What's not different is my primary complaint about the Kindle Scribe compared to other e-readers that have digital notebook capabilities: You can't write in the margins or directly on the pages. Instead, the Active Canvas feature, which allows for notes around but not on the text, is still the only way to write notes in your ebook of choice. It was something we didn't love about the previous model (8/10, WIRED Recommends), and I'm sad to see that the new edition didn't fix it. I'd personally rather see that ability than all these AI features. You can also add a sticky note or use the expanded margins feature, but it's not the same as how fun it is to write right on the page of my Kobo e-reader . Maybe the fourth generation will fix that for us.

Ultimately, if you already have the second-generation Scribe, I don't think you need to upgrade. The older Scribe will also get the new homepage added to it in 2026. It also bears mentioning that at this price point, you might as well upgrade to a reMarkable tablet, which has more capabilities and accessories to transform it into an e-paper laptop of sorts. What was once a great deal for an e-reader and limited digital notebook is now a pretty big investment for a still-limited device when compared with the competition. I'm sure the pared-down Scribe due next year (without the front light) will bring the price down, but having no front light is a pretty irritating cost if you're someone who uses their device at night or in a darkened classroom or airplane. Neither of the new Kindle Scribes is a bad device, but if you're looking for a great e-reader that doubles as a digital notebook, neither of them would be my go-to pick.

New Yorkers Are Literally Breathing Easier After Congestion Pricing

hellgate
hellgatenyc.com
2025-12-10 14:33:15
And other links to start your hump day....
Original Article


When congestion pricing finally went into effect in January, advocates for it lauded the money it would raise for the MTA, how it would lessen traffic and noise in Lower Manhattan, and hopefully, how it would lead to some salutary impact on air quality in the congestion pricing zone below 59th Street.

Almost a year later, all three of those things have happened, and then some. A new study by researchers at Cornell University found that not only has the air gotten cleaner — much cleaner — in the congestion pricing zone, but the tolls have also led to significantly cleaner air across the entire city. Specifically, congestion pricing has helped cut down the number of heavy-duty trucks coming into the city that spew heavy particulate matter, which causes asthma and other harmful health impacts.


Security updates for Wednesday

Linux Weekly News
lwn.net
2025-12-10 14:16:11
Security updates have been issued by AlmaLinux (abrt and kernel), Debian (libpng1.6, libsoup2.4, pdns-recursor, webkit2gtk, and wordpress), Fedora (imhex, libwebsockets, lunasvg, python3-docs, and python3.14), Mageia (python3 and webkit2), Red Hat (abrt, firefox, mysql8.4, and postgresql:15), Slackw...
Original Article
Dist. ID Release Package Date
AlmaLinux ALSA-2025:22760 8 abrt 2025-12-10
AlmaLinux ALSA-2025:22854 10 kernel 2025-12-10
Debian DSA-6076-1 stable libpng1.6 2025-12-10
Debian DLA-4398-1 LTS libsoup2.4 2025-12-09
Debian DSA-6077-1 stable pdns-recursor 2025-12-10
Debian DLA-4399-1 LTS webkit2gtk 2025-12-10
Debian DSA-6074-1 stable webkit2gtk 2025-12-09
Debian DSA-6075-1 stable wordpress 2025-12-09
Fedora FEDORA-2025-9b6b49071f F42 imhex 2025-12-10
Fedora FEDORA-2025-58c0baba42 F43 imhex 2025-12-10
Fedora FEDORA-2025-0c12fa2541 F42 libwebsockets 2025-12-10
Fedora FEDORA-2025-9b6b49071f F42 lunasvg 2025-12-10
Fedora FEDORA-2025-58c0baba42 F43 lunasvg 2025-12-10
Fedora FEDORA-2025-e235793f10 F43 python3-docs 2025-12-10
Fedora FEDORA-2025-e235793f10 F43 python3.14 2025-12-10
Mageia MGASA-2025-0324 9 python3 2025-12-09
Mageia MGASA-2025-0325 9 webkit2 2025-12-09
Red Hat RHSA-2025:22760-01 EL8 abrt 2025-12-10
Red Hat RHSA-2025:23031-01 EL8.2 abrt 2025-12-10
Red Hat RHSA-2025:23033-01 EL8.4 abrt 2025-12-10
Red Hat RHSA-2025:23032-01 EL8.6 abrt 2025-12-10
Red Hat RHSA-2025:23030-01 EL8.8 abrt 2025-12-10
Red Hat RHSA-2025:23035-01 EL10 firefox 2025-12-10
Red Hat RHSA-2025:23008-01 EL10 mysql8.4 2025-12-10
Red Hat RHSA-2025:22728-01 EL9.4 postgresql:15 2025-12-10
Red Hat RHSA-2025:23023-01 EL9.6 postgresql:15 2025-12-10
Slackware SSA:2025-343-01 mozilla 2025-12-09
SUSE SUSE-SU-2025:4333-1 SLE15 gegl 2025-12-09
SUSE SUSE-SU-2025:4335-1 SLE15 oS15.6 gegl 2025-12-09
SUSE SUSE-SU-2025:4346-1 SLE12 gnutls 2025-12-10
SUSE SUSE-SU-2025:4337-1 SLE15 SES7.1 oS15.6 go1.24 2025-12-10
SUSE SUSE-SU-2025:4336-1 SLE15 SES7.1 oS15.6 go1.25 2025-12-10
SUSE openSUSE-SU-2025:15801-1 TW libpng16-16 2025-12-09
SUSE SUSE-SU-2025:21128-1 SLE-m6.2 openssh 2025-12-10
SUSE SUSE-SU-2025:4334-1 SLE12 postgresql13 2025-12-09
SUSE SUSE-SU-2025:1004-2 SLE15 SES7.1 python-Jinja2 2025-12-10
SUSE SUSE-SU-2025:21084-1 SLE-m6.0 sssd 2025-12-09
Ubuntu USN-7917-1 22.04 24.04 25.04 25.10 fonttools 2025-12-09
Ubuntu USN-7918-1 16.04 18.04 20.04 22.04 24.04 25.04 25.10 netty 2025-12-10

RFC 9180 Hybrid Public Key Encryption

Lobsters
www.rfc-editor.org
2025-12-10 14:07:56
Comments...
Original Article

Hybrid Public Key Encryption

Abstract

This document describes a scheme for hybrid public key encryption (HPKE). This scheme provides a variant of public key encryption of arbitrary-sized plaintexts for a recipient public key. It also includes three authenticated variants, including one that authenticates possession of a pre-shared key and two optional ones that authenticate possession of a key encapsulation mechanism (KEM) private key. HPKE works for any combination of an asymmetric KEM, key derivation function (KDF), and authenticated encryption with additional data (AEAD) encryption function. Some authenticated variants may not be supported by all KEMs. We provide instantiations of the scheme using widely used and efficient primitives, such as Elliptic Curve Diffie-Hellman (ECDH) key agreement, HMAC-based key derivation function (HKDF), and SHA2.

This document is a product of the Crypto Forum Research Group (CFRG) in the IRTF.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Research Task Force (IRTF). The IRTF publishes the results of Internet-related research and development activities. These results might not be suitable for deployment. This RFC represents the consensus of the Crypto Forum Research Group of the Internet Research Task Force (IRTF). Documents approved for publication by the IRSG are not candidates for any level of Internet Standard; see Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9180 .

1. Introduction

Encryption schemes that combine asymmetric and symmetric algorithms have been specified and practiced since the early days of public key cryptography, e.g., [ RFC1421 ] . Combining the two yields the key management advantages of asymmetric cryptography and the performance benefits of symmetric cryptography. The traditional combination has been "encrypt the symmetric key with the public key." "Hybrid" public key encryption (HPKE) schemes, specified here, take a different approach: "generate the symmetric key and its encapsulation with the public key." Specifically, encrypted messages convey an encryption key encapsulated with a public key scheme, along with one or more arbitrary-sized ciphertexts encrypted using that key. This type of public key encryption has many applications in practice, including Messaging Layer Security [ MLS-PROTOCOL ] and TLS Encrypted ClientHello [ TLS-ECH ] .

Currently, there are numerous competing and non-interoperable standards and variants for hybrid encryption, mostly variants on the Elliptic Curve Integrated Encryption Scheme (ECIES), including ANSI X9.63 (ECIES) [ ANSI ] , IEEE 1363a [ IEEE1363 ] , ISO/IEC 18033-2 [ ISO ] , and SECG SEC 1 [ SECG ] . See [ MAEA10 ] for a thorough comparison. All these existing schemes have problems, e.g., because they rely on outdated primitives, lack proofs of indistinguishable (adaptive) chosen-ciphertext attack (IND-CCA2) security, or fail to provide test vectors.

This document defines an HPKE scheme that provides a subset of the functions provided by the collection of schemes above but specified with sufficient clarity that they can be interoperably implemented. The HPKE construction defined herein is secure against (adaptive) chosen ciphertext attacks (IND-CCA2-secure) under classical assumptions about the underlying primitives [ HPKEAnalysis ] [ ABHKLR20 ] . A summary of these analyses is in Section 9.1 .

This document represents the consensus of the Crypto Forum Research Group (CFRG).

2. Requirements Notation

The key words " MUST ", " MUST NOT ", " REQUIRED ", " SHALL ", " SHALL NOT ", " SHOULD ", " SHOULD NOT ", " RECOMMENDED ", " NOT RECOMMENDED ", " MAY ", and " OPTIONAL " in this document are to be interpreted as described in BCP 14 [ RFC2119 ] [ RFC8174 ] when, and only when, they appear in all capitals, as shown here.

3. Notation

The following terms are used throughout this document to describe the operations, roles, and behaviors of HPKE:

(skX, pkX) :
A key encapsulation mechanism (KEM) key pair used in role X, where X is one of S, R, or E as sender, recipient, and ephemeral, respectively; skX is the private key and pkX is the public key.
pk(skX) :
The KEM public key corresponding to the KEM private key skX .
Sender (S):
Role of entity that sends an encrypted message.
Recipient (R):
Role of entity that receives an encrypted message.
Ephemeral (E):
Role of a fresh random value meant for one-time use.
I2OSP(n, w) :
Convert non-negative integer n to a w -length, big-endian byte string, as described in [ RFC8017 ] .
OS2IP(x) :
Convert byte string x to a non-negative integer, as described in [ RFC8017 ] , assuming big-endian byte order.
concat(x0, ..., xN) :
Concatenation of byte strings. concat(0x01, 0x0203, 0x040506) = 0x010203040506 .
random(n) :
A pseudorandom byte string of length n bytes.
xor(a,b) :
XOR of byte strings; xor(0xF0F0, 0x1234) = 0xE2C4 . It is an error to call this function with two arguments of unequal length.
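
For concreteness only, these byte-string helpers could be written in Python roughly as follows (a minimal, non-normative sketch; the function names are illustrative and not part of this document):

import os

def i2osp(n, w):
    # I2OSP(n, w): non-negative integer to a w-byte big-endian string (RFC 8017).
    return n.to_bytes(w, "big")

def os2ip(x):
    # OS2IP(x): big-endian byte string to a non-negative integer (RFC 8017).
    return int.from_bytes(x, "big")

def concat(*parts):
    return b"".join(parts)

def random_bytes(n):
    # random(n): n bytes from a cryptographically secure source.
    return os.urandom(n)

def xor(a, b):
    if len(a) != len(b):
        raise ValueError("xor() arguments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))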

4. Cryptographic Dependencies

HPKE variants rely on the following primitives:

  • A key encapsulation mechanism (KEM):

    • GenerateKeyPair() : Randomized algorithm to generate a key pair (skX, pkX) .
    • DeriveKeyPair(ikm) : Deterministic algorithm to derive a key pair (skX, pkX) from the byte string ikm , where ikm SHOULD have at least Nsk bytes of entropy (see Section 7.1.3 for discussion).
    • SerializePublicKey(pkX) : Produce a byte string of length Npk encoding the public key pkX .
    • DeserializePublicKey(pkXm) : Parse a byte string of length Npk to recover a public key. This function can raise a DeserializeError error upon pkXm deserialization failure.
    • Encap(pkR) : Randomized algorithm to generate an ephemeral, fixed-length symmetric key (the KEM shared secret) and a fixed-length encapsulation of that key that can be decapsulated by the holder of the private key corresponding to pkR . This function can raise an EncapError on encapsulation failure.
    • Decap(enc, skR) : Deterministic algorithm using the private key skR to recover the ephemeral symmetric key (the KEM shared secret) from its encapsulated representation enc . This function can raise a DecapError on decapsulation failure.
    • AuthEncap(pkR, skS) (optional): Same as Encap() , and the outputs encode an assurance that the KEM shared secret was generated by the holder of the private key skS .
    • AuthDecap(enc, skR, pkS) (optional): Same as Decap() , and the recipient is assured that the KEM shared secret was generated by the holder of the private key skS .
    • Nsecret : The length in bytes of a KEM shared secret produced by this KEM.
    • Nenc : The length in bytes of an encapsulated key produced by this KEM.
    • Npk : The length in bytes of an encoded public key for this KEM.
    • Nsk : The length in bytes of an encoded private key for this KEM.
  • A key derivation function (KDF):

    • Extract(salt, ikm) : Extract a pseudorandom key of fixed length Nh bytes from input keying material ikm and an optional byte string salt .
    • Expand(prk, info, L) : Expand a pseudorandom key prk using optional string info into L bytes of output keying material.
    • Nh : The output size of the Extract() function in bytes.
  • An AEAD encryption algorithm [ RFC5116 ] :

    • Seal(key, nonce, aad, pt) : Encrypt and authenticate plaintext pt with associated data aad using symmetric key key and nonce nonce , yielding ciphertext and tag ct . This function can raise a MessageLimitReachedError upon failure.
    • Open(key, nonce, aad, ct) : Decrypt ciphertext and tag ct using associated data aad with symmetric key key and nonce nonce , returning plaintext message pt . This function can raise an OpenError or MessageLimitReachedError upon failure.
    • Nk : The length in bytes of a key for this algorithm.
    • Nn : The length in bytes of a nonce for this algorithm.
    • Nt : The length in bytes of the authentication tag for this algorithm.

Beyond the above, a KEM MAY also expose the following functions, whose behavior is detailed in Section 7.1.2 :

  • SerializePrivateKey(skX) : Produce a byte string of length Nsk encoding the private key skX .
  • DeserializePrivateKey(skXm) : Parse a byte string of length Nsk to recover a private key. This function can raise a DeserializeError error upon skXm deserialization failure.

A ciphersuite is a triple (KEM, KDF, AEAD) containing a choice of algorithm for each primitive.

A set of algorithm identifiers for concrete instantiations of these primitives is provided in Section 7 . Algorithm identifier values are two bytes long.

Note that GenerateKeyPair can be implemented as DeriveKeyPair(random(Nsk)) .

The notation pk(skX) , depending on its use and the KEM and its implementation, is either the computation of the public key using the private key, or just syntax expressing the retrieval of the public key, assuming it is stored along with the private key object.

The following two functions are defined to facilitate domain separation of KDF calls as well as context binding:

def LabeledExtract(salt, label, ikm):
  labeled_ikm = concat("HPKE-v1", suite_id, label, ikm)
  return Extract(salt, labeled_ikm)

def LabeledExpand(prk, label, info, L):
  labeled_info = concat(I2OSP(L, 2), "HPKE-v1", suite_id,
                        label, info)
  return Expand(prk, labeled_info, L)

The value of suite_id depends on where the KDF is used; it is assumed implicit from the implementation and not passed as a parameter. If used inside a KEM algorithm, suite_id MUST start with "KEM" and identify this KEM algorithm; if used in the remainder of HPKE, it MUST start with "HPKE" and identify the entire ciphersuite in use. See Sections 4.1 and 5.1 for details.
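
As a non-normative illustration, LabeledExtract() and LabeledExpand() could be instantiated with HKDF-SHA256 roughly as follows (Python sketch; the suite_id is passed explicitly here rather than being implicit, and the hkdf_* helper names are illustrative):

import hashlib
import hmac

def hkdf_extract_sha256(salt, ikm):
    # RFC 5869 Extract; an empty salt behaves like Nh zero bytes under HMAC.
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand_sha256(prk, info, L):
    # RFC 5869 Expand; L must not exceed 255 * 32 bytes for SHA-256.
    out, block, counter = b"", b"", 1
    while len(out) < L:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:L]

def labeled_extract(suite_id, salt, label, ikm):
    return hkdf_extract_sha256(salt, b"HPKE-v1" + suite_id + label + ikm)

def labeled_expand(suite_id, prk, label, info, L):
    labeled_info = L.to_bytes(2, "big") + b"HPKE-v1" + suite_id + label + info
    return hkdf_expand_sha256(prk, labeled_info, L)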

4.1. DH-Based KEM (DHKEM)

Suppose we are given a KDF, and a Diffie-Hellman (DH) group providing the following operations:

  • DH(skX, pkY) : Perform a non-interactive Diffie-Hellman exchange using the private key skX and public key pkY to produce a Diffie-Hellman shared secret of length Ndh . This function can raise a ValidationError as described in Section 7.1.4 .
  • Ndh : The length in bytes of a Diffie-Hellman shared secret produced by DH() .
  • Nsk : The length in bytes of a Diffie-Hellman private key.

Then we can construct a KEM that implements the interface defined in Section 4 called DHKEM(Group, KDF) in the following way, where Group denotes the Diffie-Hellman group and KDF denotes the KDF. The function parameters pkR and pkS are deserialized public keys, and enc is a serialized public key. Since encapsulated keys are Diffie-Hellman public keys in this KEM algorithm, we use SerializePublicKey() and DeserializePublicKey() to encode and decode them, respectively. Npk equals Nenc . GenerateKeyPair() produces a key pair for the Diffie-Hellman group in use. Section 7.1.3 contains the DeriveKeyPair() function specification for DHKEMs defined in this document.

def ExtractAndExpand(dh, kem_context):
  eae_prk = LabeledExtract("", "eae_prk", dh)
  shared_secret = LabeledExpand(eae_prk, "shared_secret",
                                kem_context, Nsecret)
  return shared_secret

def Encap(pkR):
  skE, pkE = GenerateKeyPair()
  dh = DH(skE, pkR)
  enc = SerializePublicKey(pkE)

  pkRm = SerializePublicKey(pkR)
  kem_context = concat(enc, pkRm)

  shared_secret = ExtractAndExpand(dh, kem_context)
  return shared_secret, enc

def Decap(enc, skR):
  pkE = DeserializePublicKey(enc)
  dh = DH(skR, pkE)

  pkRm = SerializePublicKey(pk(skR))
  kem_context = concat(enc, pkRm)

  shared_secret = ExtractAndExpand(dh, kem_context)
  return shared_secret

def AuthEncap(pkR, skS):
  skE, pkE = GenerateKeyPair()
  dh = concat(DH(skE, pkR), DH(skS, pkR))
  enc = SerializePublicKey(pkE)

  pkRm = SerializePublicKey(pkR)
  pkSm = SerializePublicKey(pk(skS))
  kem_context = concat(enc, pkRm, pkSm)

  shared_secret = ExtractAndExpand(dh, kem_context)
  return shared_secret, enc

def AuthDecap(enc, skR, pkS):
  pkE = DeserializePublicKey(enc)
  dh = concat(DH(skR, pkE), DH(skR, pkS))

  pkRm = SerializePublicKey(pk(skR))
  pkSm = SerializePublicKey(pkS)
  kem_context = concat(enc, pkRm, pkSm)

  shared_secret = ExtractAndExpand(dh, kem_context)
  return shared_secret

The implicit suite_id value used within LabeledExtract and LabeledExpand is defined as follows, where kem_id is defined in Section 7.1 :

suite_id = concat("KEM", I2OSP(kem_id, 2))

The KDF used in DHKEM can be equal to or different from the KDF used in the remainder of HPKE, depending on the chosen variant. Implementations MUST make sure to use the constants ( Nh ) and function calls ( LabeledExtract and LabeledExpand ) of the appropriate KDF when implementing DHKEM. See Section 9.3 for a comment on the choice of a KDF for the remainder of HPKE, and Section 9.6 for the rationale of the labels.

For the variants of DHKEM defined in this document, the size Nsecret of the KEM shared secret is equal to the output length of the hash function underlying the KDF. For P-256, P-384, and P-521, the size Ndh of the Diffie-Hellman shared secret is equal to 32, 48, and 66, respectively, corresponding to the x-coordinate of the resulting elliptic curve point [ IEEE1363 ] . For X25519 and X448, the size Ndh is equal to 32 and 56, respectively (see [ RFC7748 ], Section 5 ).
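
As a rough, non-normative sketch, Encap() and Decap() for DHKEM(X25519, HKDF-SHA256) (KEM ID 0x0020) might be assembled as follows, reusing the labeled_extract/labeled_expand helpers sketched in Section 4 and the third-party Python 'cryptography' package for the X25519 operations. The names are illustrative, and the validation steps of Section 7.1.4 are assumed to be handled by the library or added separately.

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)

KEM_SUITE_ID = b"KEM" + (0x0020).to_bytes(2, "big")  # concat("KEM", I2OSP(kem_id, 2))
NSECRET = 32

def serialize_public_key(pk):
    # SerializePublicKey() is the identity for X25519: the 32-byte raw encoding.
    return pk.public_bytes(serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def extract_and_expand(dh, kem_context):
    eae_prk = labeled_extract(KEM_SUITE_ID, b"", b"eae_prk", dh)
    return labeled_expand(KEM_SUITE_ID, eae_prk, b"shared_secret", kem_context, NSECRET)

def encap(pkR):
    skE = X25519PrivateKey.generate()
    dh = skE.exchange(pkR)
    enc = serialize_public_key(skE.public_key())
    kem_context = enc + serialize_public_key(pkR)
    return extract_and_expand(dh, kem_context), enc

def decap(enc, skR):
    pkE = X25519PublicKey.from_public_bytes(enc)
    dh = skR.exchange(pkE)
    kem_context = enc + serialize_public_key(skR.public_key())
    return extract_and_expand(dh, kem_context)

In this sketch, a recipient key pair generated with X25519PrivateKey.generate() should yield the same KEM shared secret from encap(pkR) on the sender's side and decap(enc, skR) on the recipient's side.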

It is important to note that the AuthEncap() and AuthDecap() functions of the DHKEM variants defined in this document are vulnerable to key-compromise impersonation (KCI). This means the assurance that the KEM shared secret was generated by the holder of the private key skS does not hold if the recipient private key skR is compromised. See Section 9.1 for more details.

Senders and recipients MUST validate KEM inputs and outputs as described in Section 7.1 .

5. Hybrid Public Key Encryption

In this section, we define a few HPKE variants. All variants take a recipient public key and a sequence of plaintexts pt and produce an encapsulated key enc and a sequence of ciphertexts ct . These outputs are constructed so that only the holder of skR can decapsulate the key from enc and decrypt the ciphertexts. All the algorithms also take an info parameter that can be used to influence the generation of keys (e.g., to fold in identity information) and an aad parameter that provides additional authenticated data to the AEAD algorithm in use.

In addition to the base case of encrypting to a public key, we include three authenticated variants: one that authenticates possession of a pre-shared key, one that authenticates possession of a KEM private key, and one that authenticates possession of both a pre-shared key and a KEM private key. All authenticated variants contribute additional keying material to the encryption operation. The following one-byte values will be used to distinguish between modes:

Table 1 : HPKE Modes
Mode Value
mode_base 0x00
mode_psk 0x01
mode_auth 0x02
mode_auth_psk 0x03

All these cases follow the same basic two-step pattern:

  1. Set up an encryption context that is shared between the sender and the recipient.
  2. Use that context to encrypt or decrypt content.

A context is an implementation-specific structure that encodes the AEAD algorithm and key in use, and manages the nonces used so that the same nonce is not used with multiple plaintexts. It also has an interface for exporting secret values, as described in Section 5.3 . See Section 5.2 for a description of this structure and its interfaces. HPKE decryption fails when the underlying AEAD decryption fails.

The constructions described here presume that the relevant non-private parameters ( enc , psk_id , etc.) are transported between the sender and the recipient by some application making use of HPKE. Moreover, a recipient with more than one public key needs some way of determining which of its public keys was used for the encapsulation operation. As an example, applications may send this information alongside a ciphertext from the sender to the recipient. Specification of such a mechanism is left to the application. See Section 10 for more details.

Note that some KEMs may not support AuthEncap() or AuthDecap() . For such KEMs, only mode_base or mode_psk are supported. Future specifications which define new KEMs MUST indicate whether these modes are supported. See Section 7.1.5 for more details.

The procedures described in this section are laid out in a Python-like pseudocode. The algorithms in use are left implicit.

5.1. Creating the Encryption Context

The variants of HPKE defined in this document share a common key schedule that translates the protocol inputs into an encryption context. The key schedule inputs are as follows:

  • mode : A one-byte value indicating the HPKE mode, defined in Table 1 .
  • shared_secret : A KEM shared secret generated for this transaction.
  • info : Application-supplied information (optional; default value "").
  • psk : A pre-shared key (PSK) held by both the sender and the recipient (optional; default value "").
  • psk_id : An identifier for the PSK (optional; default value "").

Senders and recipients MUST validate KEM inputs and outputs as described in Section 7.1 .

The psk and psk_id fields MUST appear together or not at all. That is, if a non-default value is provided for one of them, then the other MUST be set to a non-default value. This requirement is encoded in VerifyPSKInputs() below.

The psk , psk_id , and info fields have maximum lengths that depend on the KDF itself, on the definition of LabeledExtract() , and on the constant labels used together with them. See Section 7.2.1 for precise limits on these lengths.

The key , base_nonce , and exporter_secret computed by the key schedule have the property that they are only known to the holder of the recipient private key, and the entity that used the KEM to generate shared_secret and enc .

In the Auth and AuthPSK modes, the recipient is assured that the sender held the private key skS . This assurance is limited for the DHKEM variants defined in this document because of key-compromise impersonation, as described in Sections 4.1 and 9.1 . If in the PSK and AuthPSK modes, the psk and psk_id arguments are provided as required, then the recipient is assured that the sender held the corresponding pre-shared key. See Section 9.1 for more details.

The HPKE algorithm identifiers, i.e., the KEM kem_id , KDF kdf_id , and AEAD aead_id 2-byte code points, as defined in Tables 2 , 3 , and 5 , respectively, are assumed implicit from the implementation and not passed as parameters. The implicit suite_id value used within LabeledExtract and LabeledExpand is defined based on them as follows:

suite_id = concat(
  "HPKE",
  I2OSP(kem_id, 2),
  I2OSP(kdf_id, 2),
  I2OSP(aead_id, 2)
)

default_psk = ""
default_psk_id = ""

def VerifyPSKInputs(mode, psk, psk_id):
  got_psk = (psk != default_psk)
  got_psk_id = (psk_id != default_psk_id)
  if got_psk != got_psk_id:
    raise Exception("Inconsistent PSK inputs")

  if got_psk and (mode in [mode_base, mode_auth]):
    raise Exception("PSK input provided when not needed")
  if (not got_psk) and (mode in [mode_psk, mode_auth_psk]):
    raise Exception("Missing required PSK input")

def KeySchedule<ROLE>(mode, shared_secret, info, psk, psk_id):
  VerifyPSKInputs(mode, psk, psk_id)

  psk_id_hash = LabeledExtract("", "psk_id_hash", psk_id)
  info_hash = LabeledExtract("", "info_hash", info)
  key_schedule_context = concat(mode, psk_id_hash, info_hash)

  secret = LabeledExtract(shared_secret, "secret", psk)

  key = LabeledExpand(secret, "key", key_schedule_context, Nk)
  base_nonce = LabeledExpand(secret, "base_nonce",
                             key_schedule_context, Nn)
  exporter_secret = LabeledExpand(secret, "exp",
                                  key_schedule_context, Nh)

  return Context<ROLE>(key, base_nonce, 0, exporter_secret)

The ROLE template parameter is either S or R, depending on the role of sender or recipient, respectively. See Section 5.2 for a discussion of the key schedule output, including the role-specific Context structure and its API.

Note that the key_schedule_context construction in KeySchedule() is equivalent to serializing a structure of the following form in the TLS presentation syntax:

struct {
    uint8 mode;
    opaque psk_id_hash[Nh];
    opaque info_hash[Nh];
} KeyScheduleContext;

5.1.1. Encryption to a Public Key

The most basic function of an HPKE scheme is to enable encryption to the holder of a given KEM private key. The SetupBaseS() and SetupBaseR() procedures establish contexts that can be used to encrypt and decrypt, respectively, for a given private key.

The KEM shared secret is combined via the KDF with information describing the key exchange, as well as the explicit info parameter provided by the caller.

The parameter pkR is a public key, and enc is an encapsulated KEM shared secret.

def SetupBaseS(pkR, info):
  shared_secret, enc = Encap(pkR)
  return enc, KeyScheduleS(mode_base, shared_secret, info,
                           default_psk, default_psk_id)

def SetupBaseR(enc, skR, info):
  shared_secret = Decap(enc, skR)
  return KeyScheduleR(mode_base, shared_secret, info,
                      default_psk, default_psk_id)

5.1.2. Authentication Using a Pre-Shared Key

This variant extends the base mechanism by allowing the recipient to authenticate that the sender possessed a given PSK. The PSK also improves confidentiality guarantees in certain adversary models, as described in more detail in Section 9.1 . We assume that both parties have been provisioned with both the PSK value psk and another byte string psk_id that is used to identify which PSK should be used.

The primary difference from the base case is that the psk and psk_id values are used as ikm inputs to the KDF (instead of using the empty string).

The PSK MUST have at least 32 bytes of entropy and SHOULD be of length Nh bytes or longer. See Section 9.5 for a more detailed discussion.

def SetupPSKS(pkR, info, psk, psk_id):
  shared_secret, enc = Encap(pkR)
  return enc, KeyScheduleS(mode_psk, shared_secret, info,
                           psk, psk_id)

def SetupPSKR(enc, skR, info, psk, psk_id):
  shared_secret = Decap(enc, skR)
  return KeyScheduleR(mode_psk, shared_secret, info, psk, psk_id)

5.1.3. Authentication Using an Asymmetric Key

This variant extends the base mechanism by allowing the recipient to authenticate that the sender possessed a given KEM private key. This is because AuthDecap(enc, skR, pkS) produces the correct KEM shared secret only if the encapsulated value enc was produced by AuthEncap(pkR, skS) , where skS is the private key corresponding to pkS . In other words, at most two entities (precisely two, in the case of DHKEM) could have produced this secret, so if the recipient is at most one, then the sender is the other with overwhelming probability.

The primary difference from the base case is that the calls to Encap() and Decap() are replaced with calls to AuthEncap() and AuthDecap() , which add the sender public key to their internal context string. The function parameters pkR and pkS are public keys, and enc is an encapsulated KEM shared secret.

Obviously, this variant can only be used with a KEM that provides AuthEncap() and AuthDecap() procedures.

This mechanism authenticates only the key pair of the sender, not any other identifier. If an application wishes to bind HPKE ciphertexts or exported secrets to another identity for the sender (e.g., an email address or domain name), then this identifier should be included in the info parameter to avoid identity misbinding issues [ IMB ] .

def SetupAuthS(pkR, info, skS):
  shared_secret, enc = AuthEncap(pkR, skS)
  return enc, KeyScheduleS(mode_auth, shared_secret, info,
                           default_psk, default_psk_id)

def SetupAuthR(enc, skR, info, pkS):
  shared_secret = AuthDecap(enc, skR, pkS)
  return KeyScheduleR(mode_auth, shared_secret, info,
                      default_psk, default_psk_id)

5.1.4. Authentication Using Both a PSK and an Asymmetric Key

This mode is a straightforward combination of the PSK and authenticated modes. Like the PSK mode, a PSK is provided as input to the key schedule, and like the authenticated mode, authenticated KEM variants are used.

def SetupAuthPSKS(pkR, info, psk, psk_id, skS):
  shared_secret, enc = AuthEncap(pkR, skS)
  return enc, KeyScheduleS(mode_auth_psk, shared_secret, info,
                           psk, psk_id)

def SetupAuthPSKR(enc, skR, info, psk, psk_id, pkS):
  shared_secret = AuthDecap(enc, skR, pkS)
  return KeyScheduleR(mode_auth_psk, shared_secret, info,
                      psk, psk_id)

The PSK MUST have at least 32 bytes of entropy and SHOULD be of length Nh bytes or longer. See Section 9.5 for a more detailed discussion.

5.2. Encryption and Decryption

HPKE allows multiple encryption operations to be done based on a given setup transaction. Since the public key operations involved in setup are typically more expensive than symmetric encryption or decryption, this allows applications to amortize the cost of the public key operations, reducing the overall overhead.

In order to avoid nonce reuse, however, this encryption must be stateful. Each of the setup procedures above produces a role-specific context object that stores the AEAD and secret export parameters. The AEAD parameters consist of:

  • The AEAD algorithm in use
  • A secret key
  • A base nonce base_nonce
  • A sequence number (initially 0)

The secret export parameters consist of:

  • The HPKE ciphersuite in use and
  • An exporter_secret used for the secret export interface (see Section 5.3 )

All these parameters except the AEAD sequence number are constant. The sequence number provides nonce uniqueness: The nonce used for each encryption or decryption operation is the result of XORing base_nonce with the current sequence number, encoded as a big-endian integer of the same length as base_nonce . Implementations MAY use a sequence number that is shorter than the nonce length (padding on the left with zero), but MUST raise an error if the sequence number overflows. The AEAD algorithm produces ciphertext that is Nt bytes longer than the plaintext. Nt = 16 for AEAD algorithms defined in this document.

Encryption is unidirectional from sender to recipient. The sender's context can encrypt a plaintext pt with associated data aad as follows:

def ContextS.Seal(aad, pt):
  ct = Seal(self.key, self.ComputeNonce(self.seq), aad, pt)
  self.IncrementSeq()
  return ct

The recipient's context can decrypt a ciphertext ct with associated data aad as follows:

def ContextR.Open(aad, ct):
  pt = Open(self.key, self.ComputeNonce(self.seq), aad, ct)
  if pt == OpenError:
    raise OpenError
  self.IncrementSeq()
  return pt

Each encryption or decryption operation increments the sequence number for the context in use. The per-message nonce and sequence number increment details are as follows:

def Context<ROLE>.ComputeNonce(seq):
  seq_bytes = I2OSP(seq, Nn)
  return xor(self.base_nonce, seq_bytes)

def Context<ROLE>.IncrementSeq():
  if self.seq >= (1 << (8*Nn)) - 1:
    raise MessageLimitReachedError
  self.seq += 1

The sender's context MUST NOT be used for decryption. Similarly, the recipient's context MUST NOT be used for encryption. Higher-level protocols reusing the HPKE key exchange for more general purposes can derive separate keying material as needed using the secret export interface; see Sections 5.3 and 9.8 for more details.

It is up to the application to ensure that encryptions and decryptions are done in the proper sequence, so that encryption and decryption nonces align. If ContextS.Seal() or ContextR.Open() would cause the seq field to overflow, then the implementation MUST fail with an error. (In the pseudocode above, Context<ROLE>.IncrementSeq() fails with an error when seq overflows, which causes ContextS.Seal() and ContextR.Open() to fail accordingly.) Note that the internal Seal() and Open() calls within ContextS.Seal() and ContextR.Open() correspond to the context's AEAD algorithm.

5.3. Secret Export

HPKE provides an interface for exporting secrets from the encryption context using a variable-length pseudorandom function (PRF), similar to the TLS 1.3 exporter interface (see [ RFC8446 ], Section 7.5 ). This interface takes as input a context string exporter_context and a desired length L in bytes, and produces a secret derived from the internal exporter secret using the corresponding KDF Expand function. For the KDFs defined in this specification, L has a maximum value of 255*Nh . Future specifications that define new KDFs MUST specify a bound for L .

The exporter_context field has a maximum length that depends on the KDF itself, on the definition of LabeledExpand() , and on the constant labels used together with them. See Section 7.2.1 for precise limits on this length.

def Context.Export(exporter_context, L):
  return LabeledExpand(self.exporter_secret, "sec",
                       exporter_context, L)

Applications that do not use the encryption API in Section 5.2 can use the export-only AEAD ID 0xFFFF when computing the key schedule. Such applications can avoid computing the key and base_nonce values in the key schedule, as they are not used by the Export interface described above.

6. Single-Shot APIs

6.1. Encryption and Decryption

In many cases, applications encrypt only a single message to a recipient's public key. This section provides templates for HPKE APIs that implement stateless "single-shot" encryption and decryption using APIs specified in Sections 5.1.1 and 5.2 :

def Seal<MODE>(pkR, info, aad, pt, ...):
  enc, ctx = Setup<MODE>S(pkR, info, ...)
  ct = ctx.Seal(aad, pt)
  return enc, ct

def Open<MODE>(enc, skR, info, aad, ct, ...):
  ctx = Setup<MODE>R(enc, skR, info, ...)
  return ctx.Open(aad, ct)

The MODE template parameter is one of Base, PSK, Auth, or AuthPSK. The optional parameters indicated by "..." depend on MODE and may be empty. For example, SealBase() has no additional parameters. SealAuthPSK() and OpenAuthPSK() would be implemented as follows:

def SealAuthPSK(pkR, info, aad, pt, psk, psk_id, skS):
  enc, ctx = SetupAuthPSKS(pkR, info, psk, psk_id, skS)
  ct = ctx.Seal(aad, pt)
  return enc, ct

def OpenAuthPSK(enc, skR, info, aad, ct, psk, psk_id, pkS):
  ctx = SetupAuthPSKR(enc, skR, info, psk, psk_id, pkS)
  return ctx.Open(aad, ct)

6.2. Secret Export

Applications may also want to derive a secret known only to a given recipient. This section provides templates for HPKE APIs that implement stateless "single-shot" secret export using APIs specified in Section 5.3 :

def SendExport<MODE>(pkR, info, exporter_context, L, ...):
  enc, ctx = Setup<MODE>S(pkR, info, ...)
  exported = ctx.Export(exporter_context, L)
  return enc, exported

def ReceiveExport<MODE>(enc, skR, info, exporter_context, L, ...):
  ctx = Setup<MODE>R(enc, skR, info, ...)
  return ctx.Export(exporter_context, L)

As in Section 6.1 , the MODE template parameter is one of Base, PSK, Auth, or AuthPSK. The optional parameters indicated by "..." depend on MODE and may be empty.

7. Algorithm Identifiers

This section lists algorithm identifiers suitable for different HPKE configurations. Future specifications may introduce new KEM, KDF, and AEAD algorithm identifiers and retain the security guarantees presented in this document provided they adhere to the security requirements in Sections 9.2 , 9.3 , and 9.4 , respectively.

7.1. Key Encapsulation Mechanisms (KEMs)

Table 2 : KEM IDs
Value KEM Nsecret Nenc Npk Nsk Auth Reference
0x0000 Reserved N/A N/A N/A N/A yes RFC 9180
0x0010 DHKEM(P-256, HKDF-SHA256) 32 65 65 32 yes [ NISTCurves ] , [ RFC5869 ]
0x0011 DHKEM(P-384, HKDF-SHA384) 48 97 97 48 yes [ NISTCurves ] , [ RFC5869 ]
0x0012 DHKEM(P-521, HKDF-SHA512) 64 133 133 66 yes [ NISTCurves ] , [ RFC5869 ]
0x0020 DHKEM(X25519, HKDF-SHA256) 32 32 32 32 yes [ RFC5869 ] , [ RFC7748 ]
0x0021 DHKEM(X448, HKDF-SHA512) 64 56 56 56 yes [ RFC5869 ] , [ RFC7748 ]

The Auth column indicates if the KEM algorithm provides the AuthEncap() / AuthDecap() interface and is therefore suitable for the Auth and AuthPSK modes. The meaning of all other columns is explained in Section 11.1 . All algorithms are suitable for the PSK mode.

7.1.1. SerializePublicKey and DeserializePublicKey

For P-256, P-384, and P-521, the SerializePublicKey() function of the KEM performs the uncompressed Elliptic-Curve-Point-to-Octet-String conversion according to [ SECG ] . DeserializePublicKey() performs the uncompressed Octet-String-to-Elliptic-Curve-Point conversion.

For X25519 and X448, the SerializePublicKey() and DeserializePublicKey() functions are the identity function, since these curves already use fixed-length byte strings for public keys.

Some deserialized public keys MUST be validated before they can be used. See Section 7.1.4 for specifics.

7.1.2. SerializePrivateKey and DeserializePrivateKey

As per [ SECG ] , P-256, P-384, and P-521 private keys are field elements in the scalar field of the curve being used. For this section, and for Section 7.1.3 , it is assumed that implementors of ECDH over these curves use an integer representation of private keys that is compatible with the OS2IP() function.

For P-256, P-384, and P-521, the SerializePrivateKey() function of the KEM performs the Field-Element-to-Octet-String conversion according to [ SECG ] . If the private key is an integer outside the range [0, order-1] , where order is the order of the curve being used, the private key MUST be reduced to its representative in [0, order-1] before being serialized. DeserializePrivateKey() performs the Octet-String-to-Field-Element conversion according to [ SECG ] .

For X25519 and X448, private keys are identical to their byte string representation, so little processing has to be done. The SerializePrivateKey() function MUST clamp its output and the DeserializePrivateKey() function MUST clamp its input, where clamping refers to the bitwise operations performed on k in the decodeScalar25519() and decodeScalar448() functions defined in Section 5 of [ RFC7748 ] .

To catch invalid keys early on, implementors of DHKEMs SHOULD check that deserialized private keys are not equivalent to 0 (mod order ), where order is the order of the DH group. Note that this property is trivially true for X25519 and X448 groups, since clamped values can never be 0 (mod order ).

7.1.3. DeriveKeyPair

The keys that DeriveKeyPair() produces have only as much entropy as the provided input keying material. For a given KEM, the ikm parameter given to DeriveKeyPair() SHOULD have length at least Nsk , and SHOULD have at least Nsk bytes of entropy.

All invocations of KDF functions (such as LabeledExtract or LabeledExpand ) in any DHKEM's DeriveKeyPair() function use the DHKEM's associated KDF (as opposed to the ciphersuite's KDF).

For P-256, P-384, and P-521, the DeriveKeyPair() function of the KEM performs rejection sampling over field elements:

def DeriveKeyPair(ikm):
  dkp_prk = LabeledExtract("", "dkp_prk", ikm)
  sk = 0
  counter = 0
  while sk == 0 or sk >= order:
    if counter > 255:
      raise DeriveKeyPairError
    bytes = LabeledExpand(dkp_prk, "candidate",
                          I2OSP(counter, 1), Nsk)
    bytes[0] = bytes[0] & bitmask
    sk = OS2IP(bytes)
    counter = counter + 1
  return (sk, pk(sk))

order is the order of the curve being used (see Section D.1.2 of [ NISTCurves ] ), and is listed below for completeness.

P-256:
0xffffffff00000000ffffffffffffffffbce6faada7179e84f3b9cac2fc632551

P-384:
0xffffffffffffffffffffffffffffffffffffffffffffffffc7634d81f4372ddf
  581a0db248b0a77aecec196accc52973

P-521:
0x01ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
  fa51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409

bitmask is defined to be 0xFF for P-256 and P-384, and 0x01 for P-521. The precise likelihood of DeriveKeyPair() failing with DeriveKeyPairError depends on the group being used, but it is negligibly small in all cases. See Section 8.2 for information about dealing with such failures.

For X25519 and X448, the DeriveKeyPair() function applies a KDF to the input:

def DeriveKeyPair(ikm):
  dkp_prk = LabeledExtract("", "dkp_prk", ikm)
  sk = LabeledExpand(dkp_prk, "sk", "", Nsk)
  return (sk, pk(sk))
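
Continuing the non-normative sketches above, the X25519 DeriveKeyPair() could look roughly like this in Python, reusing the labeled_extract/labeled_expand helpers with the DHKEM suite_id for KEM ID 0x0020 and the 'cryptography' package (names are illustrative):

from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

KEM_SUITE_ID = b"KEM" + (0x0020).to_bytes(2, "big")
NSK = 32

def derive_key_pair_x25519(ikm):
    dkp_prk = labeled_extract(KEM_SUITE_ID, b"", b"dkp_prk", ikm)
    sk_bytes = labeled_expand(KEM_SUITE_ID, dkp_prk, b"sk", b"", NSK)
    # RFC 7748 clamping is applied by X25519 implementations when the scalar is used.
    sk = X25519PrivateKey.from_private_bytes(sk_bytes)
    return sk, sk.public_key()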

7.1.4. Validation of Inputs and Outputs

The following public keys are subject to validation if the group requires public key validation: the sender MUST validate the recipient's public key pkR ; the recipient MUST validate the ephemeral public key pkE ; in authenticated modes, the recipient MUST validate the sender's static public key pkS . Validation failure yields a ValidationError .

For P-256, P-384, and P-521, senders and recipients MUST perform partial public key validation on all public key inputs, as defined in Section 5.6.2.3.4 of [ keyagreement ] . This includes checking that the coordinates are in the correct range, that the point is on the curve, and that the point is not the point at infinity. Additionally, senders and recipients MUST ensure the Diffie-Hellman shared secret is not the point at infinity.

For X25519 and X448, public keys and Diffie-Hellman outputs MUST be validated as described in [ RFC7748 ] . In particular, recipients MUST check whether the Diffie-Hellman shared secret is the all-zero value and abort if so.

7.1.5. Future KEMs

Section 9.2 lists security requirements on a KEM used within HPKE.

The AuthEncap() and AuthDecap() functions are OPTIONAL . If a KEM algorithm does not provide them, only the Base and PSK modes of HPKE are supported. Future specifications that define new KEMs MUST indicate whether or not Auth and AuthPSK modes are supported.

A KEM algorithm may support different encoding algorithms, with different output lengths, for KEM public keys. Such KEM algorithms MUST specify only one encoding algorithm whose output length is Npk .

7.2. Key Derivation Functions (KDFs)

Table 3 : KDF IDs
Value KDF Nh Reference
0x0000 Reserved N/A RFC 9180
0x0001 HKDF-SHA256 32 [ RFC5869 ]
0x0002 HKDF-SHA384 48 [ RFC5869 ]
0x0003 HKDF-SHA512 64 [ RFC5869 ]

7.2.1. Input Length Restrictions

This document defines LabeledExtract() and LabeledExpand() based on the KDFs listed above. These functions add prefixes to their respective inputs ikm and info before calling the KDF's Extract() and Expand() functions. This leads to a reduction of the maximum input length that is available for the inputs psk , psk_id , info , exporter_context , ikm , i.e., the variable-length parameters provided by HPKE applications. The following table lists the maximum allowed lengths of these fields for the KDFs defined in this document, as inclusive bounds in bytes:

Table 4 : Application Input Limits
Input HKDF-SHA256 HKDF-SHA384 HKDF-SHA512
psk 2^{61} - 88 2^{125} - 152 2^{125} - 152
psk_id 2^{61} - 93 2^{125} - 157 2^{125} - 157
info 2^{61} - 91 2^{125} - 155 2^{125} - 155
exporter_context 2^{61} - 120 2^{125} - 200 2^{125} - 216
ikm (DeriveKeyPair) 2^{61} - 84 2^{125} - 148 2^{125} - 148

This shows that the limits are only marginally smaller than the maximum input length of the underlying hash function; these limits are large and unlikely to be reached in practical applications. Future specifications that define new KDFs MUST specify bounds for these variable-length parameters.

The RECOMMENDED limit for these values is 64 bytes. This would enable interoperability with implementations that statically allocate memory for these inputs to avoid memory allocations.

The values for psk , psk_id , info , and ikm , which are inputs to LabeledExtract() , were computed with the following expression:

max_size_hash_input - Nb - size_version_label -
    size_suite_id - size_input_label

The value for exporter_context , which is an input to LabeledExpand() , was computed with the following expression:

max_size_hash_input - Nb - Nh - size_version_label -
    size_suite_id - size_input_label - 2 - 1

In these equations, max_size_hash_input is the maximum input length of the underlying hash function in bytes, Nb is the block size of the underlying hash function in bytes, size_version_label is the size of "HPKE-v1" in bytes and equals 7, size_suite_id is the size of the suite_id in bytes and equals 5 for DHKEM (relevant for ikm ) and 10 for the remainder of HPKE (relevant for psk , psk_id , info , and exporter_context ), and size_input_label is the size in bytes of the label used as parameter to LabeledExtract() or LabeledExpand() , the maximum of which is 13 across all labels in this document.
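
As a worked check of the first expression, consider the psk input with HKDF-SHA256: max_size_hash_input is 2^{61} - 1 bytes for SHA-256, Nb is 64, size_version_label is 7, size_suite_id is 10, and size_input_label is 6 (the "secret" label used with psk in the key schedule), giving

(2^{61} - 1) - 64 - 7 - 10 - 6 = 2^{61} - 88

which matches the psk entry for HKDF-SHA256 in Table 4.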

7.3. Authenticated Encryption with Associated Data (AEAD) Functions

Table 5 : AEAD IDs
Value AEAD Nk Nn Nt Reference
0x0000 Reserved N/A N/A N/A RFC 9180
0x0001 AES-128-GCM 16 12 16 [ GCM ]
0x0002 AES-256-GCM 32 12 16 [ GCM ]
0x0003 ChaCha20Poly1305 32 12 16 [ RFC8439 ]
0xFFFF Export-only N/A N/A N/A RFC 9180

The 0xFFFF AEAD ID is reserved for applications that only use the Export interface; see Section 5.3 for more details.

8. API Considerations

This section documents considerations for interfaces to implementations of HPKE. This includes error handling considerations and recommendations that improve interoperability when HPKE is used in applications.

8.1. Auxiliary Authenticated Application Information

HPKE has two places at which applications can specify auxiliary authenticated information: (1) during context construction via the Setup info parameter, and (2) during Context operations, i.e., with the aad parameter for Open() and Seal() , and the exporter_context parameter for Export() . Application information applicable to multiple operations on a single Context should use the Setup info parameter. This avoids redundantly processing this information for each Context operation. In contrast, application information that varies on a per-message basis should be specified via the Context APIs ( Seal() , Open() , or Export() ).

Applications that only use the single-shot APIs described in Section 6 should use the Setup info parameter for specifying auxiliary authenticated information. Implementations which only expose single-shot APIs should not allow applications to use both Setup info and Context aad or exporter_context auxiliary information parameters.

8.2. Errors

The high-level, public HPKE APIs specified in this document are all fallible. These include the Setup functions and all encryption context functions. For example, Decap() can fail if the encapsulated key enc is invalid, and Open() may fail if ciphertext decryption fails. The explicit errors generated throughout this specification, along with the conditions that lead to each error, are as follows:

ValidationError :
Validation of a public key or a Diffie-Hellman output failed (Section 7.1.4).
DeserializeError :
Deserialization of a public or private key failed.
EncapError :
The Encap() algorithm failed.
DecapError :
The Decap() algorithm failed.
OpenError :
AEAD decryption in Open() failed.
MessageLimitReachedError :
The message limit for the encryption context was reached, i.e., incrementing the sequence number would overflow it.
DeriveKeyPairError :
Key pair derivation failed (Section 7.1.3).

Implicit errors may also occur. As an example, certain classes of failures, e.g., malformed recipient public keys, may not yield explicit errors. For example, for the DHKEM variant described in this specification, the Encap() algorithm fails when given an invalid recipient public key. However, other KEM algorithms may not have an efficient algorithm for verifying the validity of public keys. As a result, an equivalent error may not manifest until AEAD decryption at the recipient. As another example, DHKEM's AuthDecap() function will produce invalid output if given the wrong sender public key. This error is not detectable until subsequent AEAD decryption.

The errors in this document are meant as a guide for implementors. They are not an exhaustive list of all the errors an implementation might emit. For example, future KEMs might have internal failure cases, or an implementation might run out of memory.

How these errors are expressed in an API or handled by applications is an implementation-specific detail. For example, some implementations may abort or panic upon a DeriveKeyPairError failure given that it only occurs with negligible probability, whereas other implementations may retry the failed DeriveKeyPair operation. See Section 7.1.3 for more information. As another example, some implementations of the DHKEM specified in this document may choose to transform ValidationError from DH() into an EncapError or DecapError from Encap() or Decap() , respectively, whereas others may choose to raise ValidationError unmodified.

Applications using HPKE APIs should not assume that the errors here are complete, nor should they assume certain classes of errors will always manifest the same way for all ciphersuites. For example, the DHKEM specified in this document will emit a DeserializationError or ValidationError if a KEM public key is invalid. However, a new KEM might not have an efficient algorithm for determining whether or not a public key is valid. In this case, an invalid public key might instead yield an OpenError when trying to decrypt a ciphertext.

9. Security Considerations

9.1. Security Properties

HPKE has several security goals, depending on the mode of operation, against active and adaptive attackers that can compromise partial secrets of senders and recipients. The desired security goals are detailed below:

  • Message secrecy: Confidentiality of the sender's messages against chosen ciphertext attacks
  • Export key secrecy: Indistinguishability of each export secret from a uniformly random bitstring of equal length, i.e., Context.Export is a variable-length PRF
  • Sender authentication: Proof of sender origin for PSK, Auth, and AuthPSK modes

These security goals are expected to hold for any honest sender and honest recipient keys, as well as if the honest sender and honest recipient keys are the same.

HPKE mitigates malleability problems (called benign malleability [ SECG ] ) in prior public key encryption standards based on ECIES by including all public keys in the context of the key schedule.

HPKE does not provide forward secrecy with respect to recipient compromise. In the Base and Auth modes, the secrecy properties are only expected to hold if the recipient private key skR is not compromised at any point in time. In the PSK and AuthPSK modes, the secrecy properties are expected to hold if the recipient private key skR and the pre-shared key are not both compromised at any point in time. See Section 9.7 for more details.

In the Auth mode, sender authentication is generally expected to hold if the sender private key skS is not compromised at the time of message reception. In the AuthPSK mode, sender authentication is generally expected to hold if, at the time of message reception, the sender private key skS and the pre-shared key are not both compromised.

Besides forward secrecy and key-compromise impersonation, which are highlighted in this section because of their particular cryptographic importance, HPKE has other non-goals that are described in Section 9.7 : no tolerance of message reordering or loss, no downgrade or replay prevention, no hiding of the plaintext length, and no protection against bad ephemeral randomness. Section 9.7 suggests application-level mitigations for some of them.

9.1.1. Key-Compromise Impersonation

The DHKEM variants defined in this document are vulnerable to key-compromise impersonation attacks [ BJM97 ] , which means that sender authentication cannot be expected to hold in the Auth mode if the recipient private key skR is compromised, and in the AuthPSK mode if the pre-shared key and the recipient private key skR are both compromised. NaCl's box interface [ NaCl ] has the same issue. At the same time, this enables repudiability.

As shown by [ ABHKLR20 ] , key-compromise impersonation attacks are generally possible on HPKE because KEM ciphertexts are not bound to HPKE messages. An adversary who knows a recipient's private key can decapsulate an observed KEM ciphertext, compute the key schedule, and encrypt an arbitrary message that the recipient will accept as coming from the original sender. Importantly, this is possible even with a KEM that is resistant to key-compromise impersonation attacks. As a result, mitigating this issue requires fundamental changes that are out of scope of this specification.

Applications that require resistance against key-compromise impersonation SHOULD take extra steps to prevent this attack. One possibility is to produce a digital signature over (enc, ct) tuples using a sender's private key -- where ct is an AEAD ciphertext produced by the single-shot or multi-shot API and enc is the corresponding KEM encapsulated key.

Given these properties, pre-shared keys strengthen both the authentication and the secrecy properties in certain adversary models. One particular example in which this can be useful is a hybrid quantum setting: if a non-quantum-resistant KEM used with HPKE is broken by a quantum computer, the security properties are preserved through the use of a pre-shared key. As described in Section 7 of [ RFC8696 ] this assumes that the pre-shared key has not been compromised.

9.1.2. Computational Analysis

It is shown in [ CS01 ] that a hybrid public key encryption scheme of essentially the same form as the Base mode described here is IND-CCA2-secure as long as the underlying KEM and AEAD schemes are IND-CCA2-secure. Moreover, it is shown in [ HHK06 ] that IND-CCA2 security of the KEM and the data encapsulation mechanism are necessary conditions to achieve IND-CCA2 security for hybrid public key encryption. The main difference between the scheme proposed in [ CS01 ] and the Base mode in this document (both named HPKE) is that we interpose some KDF calls between the KEM and the AEAD. Analyzing the HPKE Base mode instantiation in this document therefore requires verifying that the additional KDF calls do not cause the IND-CCA2 property to fail, as well as verifying the additional export key secrecy property.

Analysis of the PSK, Auth, and AuthPSK modes defined in this document additionally requires verifying the sender authentication property. While the PSK mode just adds supplementary keying material to the key schedule, the Auth and AuthPSK modes make use of a non-standard authenticated KEM construction. Generally, the authenticated modes of HPKE can be viewed and analyzed as flavors of signcryption [ SigncryptionDZ10 ] .

A preliminary computational analysis of all HPKE modes has been done in [ HPKEAnalysis ] , indicating asymptotic security for the case where the KEM is DHKEM, the AEAD is any IND-CPA-secure and INT-CTXT-secure scheme, and the DH group and KDF satisfy the following conditions:

  • DH group: The gap Diffie-Hellman (GDH) problem is hard in the appropriate subgroup [ GAP ] .
  • Extract() and Expand() : Extract() can be modeled as a random oracle. Expand() can be modeled as a pseudorandom function, wherein the first argument is the key.

In particular, the KDFs and DH groups defined in this document (see Sections 7.2 and 7.1 ) satisfy these properties when used as specified. The analysis in [ HPKEAnalysis ] demonstrates that under these constraints, HPKE continues to provide IND-CCA2 security, and provides the additional properties noted above. Also, the analysis confirms the expected properties hold under the different key compromise cases mentioned above. The analysis considers a sender that sends one message using the encryption context, and additionally exports two independent secrets using the secret export interface.

The table below summarizes the main results from [ HPKEAnalysis ] . N/A means that a property does not apply for the given mode, whereas Y means the given mode satisfies the property.

Table 6 : HPKE Mode Security Properties
Variant Message Sec. Export Sec. Sender Auth.
Base Y Y N/A
PSK Y Y Y
Auth Y Y Y
AuthPSK Y Y Y

If non-DH-based KEMs are to be used with HPKE, further analysis will be necessary to prove their security. The results from [ CS01 ] provide some indication that any IND-CCA2-secure KEM will suffice here, but are not conclusive given the differences in the schemes.

A detailed computational analysis of HPKE's Auth mode single-shot encryption API has been done in [ ABHKLR20 ] . The paper defines security notions for authenticated KEMs and for authenticated public key encryption, using the outsider and insider security terminology known from signcryption [ SigncryptionDZ10 ] . The analysis proves that DHKEM's AuthEncap() / AuthDecap() interface fulfills these notions for all Diffie-Hellman groups specified in this document. The analysis also provides exact security bounds, under the assumptions that the gap Diffie-Hellman (GDH) problem is hard in the appropriate subgroup [ GAP ] , and that HKDF can be modeled as a random oracle.

Further, [ ABHKLR20 ] proves composition theorems, showing that HPKE's Auth mode fulfills the security notions of authenticated public key encryption for all KDFs and AEAD schemes specified in this document, given any authenticated KEM satisfying the previously defined security notions for authenticated KEMs. The theorems assume that the KEM is perfectly correct; they could easily be adapted to work with KEMs that have a nonzero but negligible probability for decryption failure. The assumptions on the KDF are that Extract() and Expand() can be modeled as pseudorandom functions wherein the first argument is the key, respectively. The assumption for the AEAD is IND-CPA and INT-CTXT security.

In summary, the analysis in [ ABHKLR20 ] proves that the single-shot encryption API of HPKE's Auth mode satisfies the desired message confidentiality and sender authentication properties listed at the beginning of this section; it does not consider multiple messages, nor the secret export API.

9.1.3. Post-Quantum Security

All of [ CS01 ] , [ HPKEAnalysis ] , and [ ABHKLR20 ] are premised on classical security models and assumptions, and do not consider adversaries capable of quantum computation. A full proof of post-quantum security would need to take appropriate security models and assumptions into account, in addition to simply using a post-quantum KEM. However, the composition theorems from [ ABHKLR20 ] for HPKE's Auth mode only make standard assumptions (i.e., no random oracle assumption) that are expected to hold against quantum adversaries (although with slightly worse bounds). Thus, these composition theorems, in combination with a post-quantum-secure authenticated KEM, guarantee the post-quantum security of HPKE's Auth mode.

In future work, the analysis from [ ABHKLR20 ] can be extended to cover HPKE's other modes and desired security properties. The hybrid quantum-resistance property described above, which is achieved by using the PSK or AuthPSK mode, is not proven in [ HPKEAnalysis ] because this analysis requires the random oracle model; in a quantum setting, this model needs adaption to, for example, the quantum random oracle model.

9.2. Security Requirements on a KEM Used within HPKE

A KEM used within HPKE MUST allow HPKE to satisfy its desired security properties described in Section 9.1 . Section 9.6 lists requirements concerning domain separation.

In particular, the KEM shared secret MUST be a uniformly random byte string of length Nsecret . This means, for instance, that it would not be sufficient if the KEM shared secret is only uniformly random as an element of some set prior to its encoding as a byte string.

9.2.1. Encap/Decap Interface

As mentioned in Section 9 , [ CS01 ] provides some indications that if the KEM's Encap() / Decap() interface (which is used in the Base and PSK modes) is IND-CCA2-secure, HPKE is able to satisfy its desired security properties. An appropriate definition of IND-CCA2 security for KEMs can be found in [ CS01 ] and [ BHK09 ] .

9.2.2. AuthEncap/AuthDecap Interface

The analysis of HPKE's Auth mode single-shot encryption API in [ ABHKLR20 ] provides composition theorems that guarantee that HPKE's Auth mode achieves its desired security properties if the KEM's AuthEncap() / AuthDecap() interface satisfies multi-user Outsider-CCA, Outsider-Auth, and Insider-CCA security, as defined in the same paper.

Intuitively, Outsider-CCA security formalizes confidentiality, and Outsider-Auth security formalizes authentication of the KEM shared secret in case none of the sender or recipient private keys are compromised. Insider-CCA security formalizes confidentiality of the KEM shared secret in case the sender private key is known or chosen by the adversary. (If the recipient private key is known or chosen by the adversary, confidentiality is trivially broken, because then the adversary knows all secrets on the recipient's side).

An Insider-Auth security notion would formalize authentication of the KEM shared secret in case the recipient private key is known or chosen by the adversary. (If the sender private key is known or chosen by the adversary, it can create KEM ciphertexts in the name of the sender). Because of the generic attack on an analogous Insider-Auth security notion of HPKE described in Section 9.1 , a definition of Insider-Auth security for KEMs used within HPKE is not useful.

9.2.3. KEM Key Reuse

An ikm input to DeriveKeyPair() ( Section 7.1.3 ) MUST NOT be reused elsewhere, in particular not with DeriveKeyPair() of a different KEM.

The randomness used in Encap() and AuthEncap() to generate the KEM shared secret or its encapsulation MUST NOT be reused elsewhere.

Since a KEM key pair belonging to a sender or recipient works with all modes, it can be used with multiple modes in parallel. HPKE is constructed to be secure in such settings due to domain separation using the suite_id variable. However, there is no formal proof of security at the time of writing for using multiple modes in parallel; [ HPKEAnalysis ] and [ ABHKLR20 ] only analyze isolated modes.

9.3. Security Requirements on a KDF

The choice of the KDF for HPKE SHOULD be made based on the security level provided by the KEM and, if applicable, by the PSK. The KDF SHOULD at least have the security level of the KEM and SHOULD at least have the security level provided by the PSK.

9.4. Security Requirements on an AEAD

All AEADs MUST be IND-CCA2-secure, as is currently true for all AEADs listed in Section 7.3 .

9.5. Pre-Shared Key Recommendations

In the PSK and AuthPSK modes, the PSK MUST have at least 32 bytes of entropy and SHOULD be of length Nh bytes or longer. Using a PSK longer than 32 bytes but shorter than Nh bytes is permitted.
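
For illustration only, a PSK satisfying these requirements could be drawn from a cryptographically secure random source; the sketch below assumes Python's secrets module and HKDF-SHA256 as the KDF (Nh = 32), and the psk_id value is a hypothetical placeholder:

import secrets

Nh = 32                        # Extract() output size for HKDF-SHA256
psk = secrets.token_bytes(Nh)  # 32 fully random bytes: at least 32 bytes of entropy
psk_id = b"example-psk-id"     # hypothetical identifier; need not be secret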

HPKE is specified to use HKDF as its key derivation function. HKDF is not designed to slow down dictionary attacks (see [ RFC5869 ] ). Thus, HPKE's PSK mechanism is not suitable for use with a low-entropy password as the PSK: In scenarios in which the adversary knows the KEM shared secret shared_secret and has access to an oracle that distinguishes between a good and a wrong PSK, it can perform PSK-recovering attacks. This oracle can be the decryption operation on a captured HPKE ciphertext or any other recipient behavior that is observably different when using a wrong PSK. The adversary knows the KEM shared secret shared_secret if it knows all KEM private keys of one participant. In the PSK mode, this is trivially the case if the adversary acts as the sender.

To recover a low-entropy PSK, an attacker in this scenario can trivially perform a dictionary attack. Given a set S of possible PSK values, the attacker generates an HPKE ciphertext for each value in S, and submits the resulting ciphertexts to the oracle to learn which PSK is being used by the recipient. Further, because HPKE uses AEAD schemes that are not key-committing, an attacker can mount a partitioning oracle attack [ LGR20 ] that can recover the PSK from a set S of possible PSK values, with |S| = m*k, in roughly m + log k queries to the oracle using ciphertexts of length proportional to k, the maximum message length in blocks. (Applying the multi-collision algorithm from [ LGR20 ] requires a small adaptation to the algorithm wherein the appropriate nonce is computed for each candidate key. This modification adds one call to HKDF per key. The number of partitioning oracle queries remains unchanged.) The PSK must therefore be chosen with sufficient entropy so that m + log k is prohibitive for attackers (e.g., 2^128). Future specifications can define new AEAD algorithms that are key-committing.

9.6. Domain Separation

HPKE allows combining a DHKEM variant DHKEM(Group, KDF') and a KDF such that both KDFs are instantiated by the same KDF. By design, the calls to Extract() and Expand() inside DHKEM and the remainder of HPKE use separate input domains. This justifies modeling them as independent functions even if instantiated by the same KDF. This domain separation between DHKEM and the remainder of HPKE is achieved by using prefix-free sets of suite_id values in LabeledExtract() and LabeledExpand() ( KEM... in DHKEM and HPKE... in the remainder of HPKE). Recall that a set is prefix-free if no element is a prefix of another within the set.

Future KEM instantiations MUST ensure, should Extract() and Expand() be used internally, that they can be modeled as functions independent from the invocations of Extract() and Expand() in the remainder of HPKE. One way to ensure this is by using LabeledExtract() and LabeledExpand() with a suite_id as defined in Section 4 , which will ensure input domain separation, as outlined above. Particular attention needs to be paid if the KEM directly invokes functions that are used internally in HPKE's Extract() or Expand() , such as Hash() and HMAC() in the case of HKDF. It MUST be ensured that inputs to these invocations cannot collide with inputs to the internal invocations of these functions inside Extract() or Expand() . In HPKE's KeySchedule() this is avoided by using Extract() instead of Hash() on the arbitrary-length inputs info and psk_id .

The string literal "HPKE-v1" used in LabeledExtract() and LabeledExpand() ensures that any secrets derived in HPKE are bound to the scheme's name and version, even when possibly derived from the same Diffie-Hellman or KEM shared secret as in another scheme or version.

9.7. Application Embedding and Non-Goals

HPKE is designed to be a fairly low-level mechanism. As a result, it assumes that certain properties are provided by the application in which HPKE is embedded and leaves certain security properties to be provided by other mechanisms. In other words, certain properties are out of scope for HPKE.

9.7.1. Message Order and Message Loss

The primary requirement that HPKE imposes on applications is the requirement that ciphertexts MUST be presented to ContextR.Open() in the same order in which they were generated by ContextS.Seal() . When the single-shot API is used (see Section 6 ), this is trivially true (since there is only ever one ciphertext). Applications that allow for multiple invocations of Open() / Seal() on the same context MUST enforce the ordering property described above.

Ordering requirements of this kind are usually fulfilled by including a sequence number in the framing of encrypted messages. Whatever information is used to determine the ordering of HPKE-encrypted messages SHOULD be included in the associated data passed to ContextS.Seal() and ContextR.Open() . The specifics of this scheme are up to the application.
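
For illustration only, the following hypothetical framing binds an 8-byte sequence number into the AAD on both sides; the ctx objects are assumed to expose the Seal()/Open() interface of Section 5.2, and the field layout is an example rather than a recommendation:

def seal_framed(ctx, seq, aad, pt):
    # Bind the application's sequence number into the AEAD computation.
    return ctx.Seal(seq.to_bytes(8, "big") + aad, pt)

def open_framed(ctx, seq, aad, ct):
    # The recipient reconstructs the same AAD; a reordered or missing message
    # causes Open() to fail, and the context must then be discarded.
    return ctx.Open(seq.to_bytes(8, "big") + aad, ct)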

HPKE is not tolerant of lost messages. Applications MUST be able to detect when a message has been lost. When an unrecoverable loss is detected, the application MUST discard any associated HPKE context.

9.7.2. Downgrade Prevention

HPKE assumes that the sender and recipient agree on what algorithms to use. Depending on how these algorithms are negotiated, it may be possible for an intermediary to force the two parties to use suboptimal algorithms.

9.7.3. Replay Protection

The requirement that ciphertexts be presented to the ContextR.Open() function in the same order they were generated by ContextS.Seal() provides a degree of replay protection within a stream of ciphertexts resulting from a given context. HPKE provides no other replay protection.

9.7.4. Forward Secrecy

HPKE ciphertexts are not forward secret with respect to recipient compromise in any mode. This means that compromise of long-term recipient secrets allows an attacker to decrypt past ciphertexts encrypted under said secrets. This is because only long-term secrets are used on the side of the recipient.

HPKE ciphertexts are forward secret with respect to sender compromise in all modes. This is because ephemeral randomness is used on the sender's side, which is supposed to be erased directly after computation of the KEM shared secret and ciphertext.

9.7.5. Bad Ephemeral Randomness

If the randomness used for KEM encapsulation is bad -- i.e., of low entropy or compromised because of a broken or subverted random number generator -- the confidentiality guarantees of HPKE degrade significantly. In Base mode, confidentiality guarantees can be lost completely; in the other modes, at least forward secrecy with respect to sender compromise can be lost completely.

Such a situation could also lead to the reuse of the same KEM shared secret and thus to the reuse of the same key-nonce pairs for the AEAD. The AEADs specified in this document are not secure in case of nonce reuse. This attack vector is particularly relevant in authenticated modes because knowledge of the ephemeral randomness is not enough to derive shared_secret in these modes.

One way for applications to mitigate the impacts of bad ephemeral randomness is to combine ephemeral randomness with a local long-term secret that has been generated securely, as described in [ RFC8937 ] .

9.7.6. Hiding Plaintext Length

AEAD ciphertexts produced by HPKE do not hide the plaintext length. Applications requiring this level of privacy should use a suitable padding mechanism. See [ TLS-ECH ] and [ RFC8467 ] for examples of protocol-specific padding policies.

9.8. Bidirectional Encryption

As discussed in Section 5.2 , HPKE encryption is unidirectional from sender to recipient. Applications that require bidirectional encryption can derive necessary keying material with the secret export interface ( Section 5.3 ). The type and length of such keying material depends on the application use case.

As an example, if an application needs AEAD encryption from the recipient to the sender, it can derive a key and nonce from the corresponding HPKE context as follows:

key = context.Export("response key", Nk)
nonce = context.Export("response nonce", Nn)

In this example, the length of each secret is based on the AEAD algorithm used for the corresponding HPKE context.
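
For illustration, one way to use these exported values is to treat them as the key and base nonce for response encryption, deriving per-message nonces by XOR with a response sequence number in the same way ComputeNonce() does in Section 5.2. The sketch below assumes AES-128-GCM (Nk = 16, Nn = 12), the Python cryptography package, and a context object exposing Export() as in the example above:

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

Nk, Nn = 16, 12  # AES-128-GCM parameters (Section 7.3)

key = context.Export(b"response key", Nk)
base_nonce = context.Export(b"response nonce", Nn)

def response_nonce(seq):
    # Mirror ComputeNonce(): XOR the big-endian counter into the base nonce.
    return bytes(a ^ b for a, b in zip(base_nonce, seq.to_bytes(Nn, "big")))

ct = AESGCM(key).encrypt(response_nonce(0), b"response plaintext", b"response aad")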

Note that HPKE's limitations with regard to sender authentication become limits on recipient authentication in this context. In particular, in the Base mode, there is no authentication of the remote party at all. Even in the Auth mode, where the remote party has proven that they hold a specific private key, this authentication is still subject to key-compromise impersonation, as discussed in Section 9.1.1 .

10. Message Encoding

This document does not specify a wire format encoding for HPKE messages. Applications that adopt HPKE must therefore specify an unambiguous encoding mechanism that includes, minimally: the encapsulated value enc , ciphertext value(s) (and order if there are multiple), and any info values that are not implicit. One example of a non-implicit value is the recipient public key used for encapsulation, which may be needed if a recipient has more than one public key.
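
As an illustration, a minimal and purely hypothetical encoding might length-prefix the encapsulated key and a single ciphertext, as sketched below; real applications will often instead fix the length of enc from the KEM's Nenc and carry info implicitly:

def encode_message(enc, ct):
    # Hypothetical wire format: a 2-byte big-endian length before each field
    # (which limits each field to 65535 bytes).
    return len(enc).to_bytes(2, "big") + enc + len(ct).to_bytes(2, "big") + ct

def decode_message(buf):
    n = int.from_bytes(buf[:2], "big")
    enc, rest = buf[2:2 + n], buf[2 + n:]
    m = int.from_bytes(rest[:2], "big")
    return enc, rest[2:2 + m]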

The AEAD interface used in this document is based on [ RFC5116 ] , which produces and consumes a single ciphertext value. As discussed in [ RFC5116 ] , this ciphertext value contains the encrypted plaintext as well as any authentication data, encoded in a manner described by the individual AEAD scheme. Some implementations are not structured in this way, instead providing a separate ciphertext and authentication tag. When such AEAD implementations are used in HPKE implementations, the HPKE implementation must combine these inputs into a single ciphertext value within Seal() and parse them out within Open() , where the parsing details are defined by the AEAD scheme. For example, with the AES-GCM schemes specified in this document, the GCM authentication tag is placed in the last Nt bytes of the ciphertext output.
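
As a sketch, assuming an AEAD implementation that returns the ciphertext and tag separately and Nt = 16 (the tag length of the AEAD algorithms in Section 7.3), the conversion amounts to concatenating and re-splitting the tag:

Nt = 16  # authentication tag length for the AEADs in Section 7.3

def to_single_ct(raw_ct, tag):
    # Combine a split (ciphertext, tag) pair into the single value Seal() returns.
    return raw_ct + tag

def from_single_ct(ct):
    # Recover the pair inside Open(); the tag occupies the last Nt bytes.
    return ct[:-Nt], ct[-Nt:]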

11. IANA Considerations

IANA has created three new registries:

  • HPKE KEM Identifiers
  • HPKE KDF Identifiers
  • HPKE AEAD Identifiers

All of these registries fall under the heading "Hybrid Public Key Encryption" and are administered under a Specification Required policy [ RFC8126 ] .

11.1. KEM Identifiers

The "HPKE KEM Identifiers" registry lists identifiers for key encapsulation algorithms defined for use with HPKE. These identifiers are two-byte values, so the maximum possible value is 0xFFFF = 65535.

Template:

Value:  The two-byte identifier for the algorithm
KEM:  The name of the algorithm
Nsecret:  The length in bytes of a KEM shared secret produced by the algorithm
Nenc:  The length in bytes of an encoded encapsulated key produced by the algorithm
Npk:  The length in bytes of an encoded public key for the algorithm
Nsk:  The length in bytes of an encoded private key for the algorithm
Auth:  A boolean indicating if this algorithm provides the AuthEncap() / AuthDecap() interface
Reference:  Where this algorithm is defined

Initial contents:  Provided in Table 2

11.2. KDF Identifiers

The "HPKE KDF Identifiers" registry lists identifiers for key derivation functions defined for use with HPKE. These identifiers are two-byte values, so the maximum possible value is 0xFFFF = 65535.

Template:

Value:  The two-byte identifier for the algorithm
KDF:  The name of the algorithm
Nh:  The output size of the Extract function in bytes
Reference:  Where this algorithm is defined

Initial contents:  Provided in Table 3

11.3. AEAD Identifiers

The "HPKE AEAD Identifiers" registry lists identifiers for authenticated encryption with associated data (AEAD) algorithms defined for use with HPKE. These identifiers are two-byte values, so the maximum possible value is 0xFFFF = 65535.

Template:

Value:  The two-byte identifier for the algorithm
AEAD:  The name of the algorithm
Nk:  The length in bytes of a key for this algorithm
Nn:  The length in bytes of a nonce for this algorithm
Nt:  The length in bytes of an authentication tag for this algorithm
Reference:  Where this algorithm is defined

Initial contents:  Provided in Table 5

12. References

12.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC5116]
McGrew, D., "An Interface and Algorithms for Authenticated Encryption", RFC 5116, DOI 10.17487/RFC5116, <https://www.rfc-editor.org/info/rfc5116>.
[RFC8017]
Moriarty, K., Ed., Kaliski, B., Jonsson, J., and A. Rusch, "PKCS #1: RSA Cryptography Specifications Version 2.2", RFC 8017, DOI 10.17487/RFC8017, <https://www.rfc-editor.org/info/rfc8017>.
[RFC8126]
Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, <https://www.rfc-editor.org/info/rfc8126>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.

12.2. Informative References

[ABHKLR20]
Alwen, J., Blanchet, B., Hauck, E., Kiltz, E., Lipp, B., and D. Riepel, "Analysing the HPKE Standard", <https://eprint.iacr.org/2020/1499>.
[ANSI]
American National Standards Institute (ANSI), "ANSI - X9.63 Public Key Cryptography for the Financial Services Industry Key Agreement and Key Transport Using Elliptic Curve Cryptography".
[BHK09]
Bellare, M., Hofheinz, D., and E. Kiltz, "Subtleties in the Definition of IND-CCA: When and How Should Challenge-Decryption be Disallowed?", <https://eprint.iacr.org/2009/418>.
[BJM97]
Blake-Wilson, S., Johnson, D., and A. Menezes, "Key agreement protocols and their security analysis: Extended Abstract", Cryptography and Coding, pp. 30-45, DOI 10.1007/bfb0024447, <https://doi.org/10.1007/bfb0024447>.
[BNT19]
Bellare, M., Ng, R., and B. Tackmann, "Nonces Are Noticed: AEAD Revisited", <http://dx.doi.org/10.1007/978-3-030-26948-7_9>.
[CS01]
Cramer, R. and V. Shoup, "Design and Analysis of Practical Public-Key Encryption Schemes Secure against Adaptive Chosen Ciphertext Attack", <https://eprint.iacr.org/2001/108>.
[GAP]
Okamoto, T. and D. Pointcheval, "The Gap-Problems: A New Class of Problems for the Security of Cryptographic Schemes", ISBN 978-3-540-44586-9, <https://link.springer.com/content/pdf/10.1007/3-540-44586-2_8.pdf>.
[GCM]
Dworkin, M., "Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC", DOI 10.6028/nist.sp.800-38d, SP 800-38D, <https://doi.org/10.6028/nist.sp.800-38d>.
[HHK06]
Herranz, J., Hofheinz, D., and E. Kiltz, "Some (in)sufficient conditions for secure hybrid encryption.", <https://eprint.iacr.org/2006/265>.
[HPKEAnalysis]
Lipp, B., "An Analysis of Hybrid Public Key Encryption", <https://eprint.iacr.org/2020/243>.
[IEEE1363]
IEEE, "IEEE Standard Specifications for Public-Key Cryptography - Amendment 1: Additional Techniques", IEEE Std 1363a-2004.
[IMB]
Diffie, W., Van Oorschot, P., and M. Wiener, "Authentication and authenticated key exchanges", Designs, Codes and Cryptography, Vol. 2, pp. 107-125, DOI 10.1007/bf00124891, <https://doi.org/10.1007/bf00124891>.
[ISO]
International Organization for Standardization, "Information technology - Security techniques - Encryption algorithms - Part 2: Asymmetric ciphers", ISO/IEC 18033-2:2006.
[keyagreement]
Barker, E., Chen, L., Roginsky, A., Vassilev, A., and R. Davis, "Recommendation for Pair-Wise Key-Establishment Schemes Using Discrete Logarithm Cryptography", NIST Special Publication 800-56A Revision 3, DOI 10.6028/nist.sp.800-56ar3, <https://doi.org/10.6028/nist.sp.800-56ar3>.
[LGR20]
Len, J., Grubbs, P., and T. Ristenpart, "Partitioning Oracle Attacks".
[MAEA10]
Gayoso Martinez, V., Hernandez Alvarez, F., Hernandez Encinas, L., and C. Sanchez Avila, "A comparison of the standardized versions of ECIES", <https://ieeexplore.ieee.org/abstract/document/5604194/>.
[MLS-PROTOCOL]
Barnes, R., Beurdouche, B., Robert, R., Millican, J., Omara, E., and K. Cohn-Gordon, "The Messaging Layer Security (MLS) Protocol", Work in Progress, Internet-Draft, draft-ietf-mls-protocol-12, <https://datatracker.ietf.org/doc/html/draft-ietf-mls-protocol-12>.
[NaCl]
"Public-key authenticated encryption: crypto_box", <https://nacl.cr.yp.to/box.html>.
[NISTCurves]
National Institute of Standards and Technology (NIST), "Digital Signature Standard (DSS)", DOI 10.6028/nist.fips.186-4, FIPS PUB 186-4, <https://doi.org/10.6028/nist.fips.186-4>.
[RFC1421]
Linn, J., "Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures", RFC 1421, DOI 10.17487/RFC1421, <https://www.rfc-editor.org/info/rfc1421>.
[RFC5869]
Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand Key Derivation Function (HKDF)", RFC 5869, DOI 10.17487/RFC5869, <https://www.rfc-editor.org/info/rfc5869>.
[RFC7748]
Langley, A., Hamburg, M., and S. Turner, "Elliptic Curves for Security", RFC 7748, DOI 10.17487/RFC7748, <https://www.rfc-editor.org/info/rfc7748>.
[RFC8439]
Nir, Y. and A. Langley, "ChaCha20 and Poly1305 for IETF Protocols", RFC 8439, DOI 10.17487/RFC8439, <https://www.rfc-editor.org/info/rfc8439>.
[RFC8446]
Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, <https://www.rfc-editor.org/info/rfc8446>.
[RFC8467]
Mayrhofer, A., "Padding Policies for Extension Mechanisms for DNS (EDNS(0))", RFC 8467, DOI 10.17487/RFC8467, <https://www.rfc-editor.org/info/rfc8467>.
[RFC8696]
Housley, R., "Using Pre-Shared Key (PSK) in the Cryptographic Message Syntax (CMS)", RFC 8696, DOI 10.17487/RFC8696, <https://www.rfc-editor.org/info/rfc8696>.
[RFC8937]
Cremers, C., Garratt, L., Smyshlyaev, S., Sullivan, N., and C. Wood, "Randomness Improvements for Security Protocols", RFC 8937, DOI 10.17487/RFC8937, <https://www.rfc-editor.org/info/rfc8937>.
[SECG]
Standards for Efficient Cryptography Group, "SEC 1: Elliptic Curve Cryptography", Version 2, <https://secg.org/sec1-v2.pdf>.
[SigncryptionDZ10]
Dent, A. and Y. Zheng, "Practical Signcryption", Information Security and Cryptography, DOI 10.1007/978-3-540-89411-7, <https://doi.org/10.1007/978-3-540-89411-7>.
[TestVectors]
"HPKE Test Vectors", <https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/5f503c564da00b0687b3de75f1dfbdfc4079ad31/test-vectors.json>.
[TLS-ECH]
Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS Encrypted Client Hello", Work in Progress, Internet-Draft, draft-ietf-tls-esni-14, <https://datatracker.ietf.org/doc/html/draft-ietf-tls-esni-14>.

Appendix A. Test Vectors

Each section below contains test vectors for a single HPKE ciphersuite and contains the following values:

  1. Configuration information and private key material: This includes the mode , info string, HPKE ciphersuite identifiers ( kem_id , kdf_id , aead_id ), and all sender, recipient, and ephemeral key material. For each role X, where X is one of S, R, or E, as sender, recipient, and ephemeral, respectively, key pairs are generated as (skX, pkX) = DeriveKeyPair(ikmX) . Each key pair (skX, pkX) is written in its serialized form, where skXm = SerializePrivateKey(skX) and pkXm = SerializePublicKey(pkX) . For applicable modes, the shared PSK and PSK identifier are also included.
  2. Context creation intermediate values and outputs: This includes the KEM outputs enc and shared_secret used to create the context, along with intermediate values key_schedule_context and secret computed in the KeySchedule function in Section 5.1 . The outputs include the context values key , base_nonce , and exporter_secret .
  3. Encryption test vectors: A fixed plaintext message is encrypted using different sequence numbers and associated data values using the context computed in (2). Each test vector lists the sequence number and corresponding nonce computed with base_nonce , the plaintext message pt , associated data aad , and output ciphertext ct .
  4. Export test vectors: Several exported values of the same length with differing context parameters are computed using the context computed in (2). Each test vector lists the exporter_context , output length L , and resulting export value.

These test vectors are also available in JSON format at [ TestVectors ] .
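
As a reading aid (not part of the test vectors themselves), the per-message nonce in each encryption test vector is base_nonce XOR the sequence number, i.e., the ComputeNonce() construction from Section 5.2. The Python sketch below checks two values from the Base vector in Appendix A.1.1:

base_nonce = bytes.fromhex("56d890e5accaaf011cff4b7d")  # from Appendix A.1.1

def compute_nonce(seq):
    # nonce = base_nonce XOR I2OSP(seq, Nn)
    seq_bytes = seq.to_bytes(len(base_nonce), "big")
    return bytes(a ^ b for a, b in zip(base_nonce, seq_bytes))

assert compute_nonce(1).hex() == "56d890e5accaaf011cff4b7c"
assert compute_nonce(256).hex() == "56d890e5accaaf011cff4a7d"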

A.1. DHKEM(X25519, HKDF-SHA256), HKDF-SHA256, AES-128-GCM

A.1.1. Base Setup Information

mode: 0
kem_id: 32
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
7268600d403fce431561aef583ee1613527cff655c1343f29812e66706df3234
pkEm:
37fda3567bdbd628e88668c3c8d7e97d1d1253b6d4ea6d44c150f741f1bf4431
skEm:
52c4a758a802cd8b936eceea314432798d5baf2d7e9235dc084ab1b9cfa2f736
ikmR:
6db9df30aa07dd42ee5e8181afdb977e538f5e1fec8a06223f33f7013e525037
pkRm:
3948cfe0ad1ddb695d780e59077195da6c56506b027329794ab02bca80815c4d
skRm:
4612c550263fc8ad58375df3f557aac531d26850903e55a9f23f21d8534e8ac8
enc:
37fda3567bdbd628e88668c3c8d7e97d1d1253b6d4ea6d44c150f741f1bf4431
shared_secret:
fe0e18c9f024ce43799ae393c7e8fe8fce9d218875e8227b0187c04e7d2ea1fc
key_schedule_context: 00725611c9d98c07c03f60095cd32d400d8347d45ed670
97bbad50fc56da742d07cb6cffde367bb0565ba28bb02c90744a20f5ef37f3052352
6106f637abb05449
secret:
12fff91991e93b48de37e7daddb52981084bd8aa64289c3788471d9a9712f397
key: 4531685d41d65f03dc48f6b8302c05b0
base_nonce: 56d890e5accaaf011cff4b7d
exporter_secret:
45ff1c2e220db587171952c0592d5f5ebe103f1561a2614e38f2ffd47e99e3f8

A.1.1.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 56d890e5accaaf011cff4b7d
ct: f938558b5d72f1a23810b4be2ab4f84331acc02fc97babc53a52ae8218a355a9
6d8770ac83d07bea87e13c512a

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 56d890e5accaaf011cff4b7c
ct: af2d7e9ac9ae7e270f46ba1f975be53c09f8d875bdc8535458c2494e8a6eab25
1c03d0c22a56b8ca42c2063b84

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 56d890e5accaaf011cff4b7f
ct: 498dfcabd92e8acedc281e85af1cb4e3e31c7dc394a1ca20e173cb7251649158
8d96a19ad4a683518973dcc180

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 56d890e5accaaf011cff4b79
ct: 583bd32bc67a5994bb8ceaca813d369bca7b2a42408cddef5e22f880b631215a
09fc0012bc69fccaa251c0246d

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 56d890e5accaaf011cff4b82
ct: 7175db9717964058640a3a11fb9007941a5d1757fda1a6935c805c21af32505b
f106deefec4a49ac38d71c9e0a

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 56d890e5accaaf011cff4a7d
ct: 957f9800542b0b8891badb026d79cc54597cb2d225b54c00c5238c25d05c30e3
fbeda97d2e0e1aba483a2df9f2

A.1.1.2. Exported Values
exporter_context:
L: 32
exported_value:
3853fe2b4035195a573ffc53856e77058e15d9ea064de3e59f4961d0095250ee

exporter_context: 00
L: 32
exported_value:
2e8f0b54673c7029649d4eb9d5e33bf1872cf76d623ff164ac185da9e88c21a5

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
e9e43065102c3836401bed8c3c3c75ae46be1639869391d62c61f1ec7af54931

A.1.2. PSK Setup Information

mode: 1
kem_id: 32
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
78628c354e46f3e169bd231be7b2ff1c77aa302460a26dbfa15515684c00130b
pkEm:
0ad0950d9fb9588e59690b74f1237ecdf1d775cd60be2eca57af5a4b0471c91b
skEm:
463426a9ffb42bb17dbe6044b9abd1d4e4d95f9041cef0e99d7824eef2b6f588
ikmR:
d4a09d09f575fef425905d2ab396c1449141463f698f8efdb7accfaff8995098
pkRm:
9fed7e8c17387560e92cc6462a68049657246a09bfa8ade7aefe589672016366
skRm:
c5eb01eb457fe6c6f57577c5413b931550a162c71a03ac8d196babbd4e5ce0fd
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc:
0ad0950d9fb9588e59690b74f1237ecdf1d775cd60be2eca57af5a4b0471c91b
shared_secret:
727699f009ffe3c076315019c69648366b69171439bd7dd0807743bde76986cd
key_schedule_context: 01e78d5cf6190d275863411ff5edd0dece5d39fa48e04e
ec1ed9b71be34729d18ccb6cffde367bb0565ba28bb02c90744a20f5ef37f3052352
6106f637abb05449
secret:
3728ab0b024b383b0381e432b47cced1496d2516957a76e2a9f5c8cb947afca4
key: 15026dba546e3ae05836fc7de5a7bb26
base_nonce: 9518635eba129d5ce0914555
exporter_secret:
3d76025dbbedc49448ec3f9080a1abab6b06e91c0b11ad23c912f043a0ee7655

A.1.2.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 9518635eba129d5ce0914555
ct: e52c6fed7f758d0cf7145689f21bc1be6ec9ea097fef4e959440012f4feb73fb
611b946199e681f4cfc34db8ea

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 9518635eba129d5ce0914554
ct: 49f3b19b28a9ea9f43e8c71204c00d4a490ee7f61387b6719db765e948123b45
b61633ef059ba22cd62437c8ba

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 9518635eba129d5ce0914557
ct: 257ca6a08473dc851fde45afd598cc83e326ddd0abe1ef23baa3baa4dd8cde99
fce2c1e8ce687b0b47ead1adc9

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 9518635eba129d5ce0914551
ct: a71d73a2cd8128fcccbd328b9684d70096e073b59b40b55e6419c9c68ae21069
c847e2a70f5d8fb821ce3dfb1c

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 9518635eba129d5ce09145aa
ct: 55f84b030b7f7197f7d7d552365b6b932df5ec1abacd30241cb4bc4ccea27bd2
b518766adfa0fb1b71170e9392

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 9518635eba129d5ce0914455
ct: c5bf246d4a790a12dcc9eed5eae525081e6fb541d5849e9ce8abd92a3bc15517
76bea16b4a518f23e237c14b59

A.1.2.2. Exported Values
exporter_context:
L: 32
exported_value:
dff17af354c8b41673567db6259fd6029967b4e1aad13023c2ae5df8f4f43bf6

exporter_context: 00
L: 32
exported_value:
6a847261d8207fe596befb52928463881ab493da345b10e1dcc645e3b94e2d95

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
8aff52b45a1be3a734bc7a41e20b4e055ad4c4d22104b0c20285a7c4302401cd

A.1.3. Auth Setup Information

mode: 2
kem_id: 32
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
6e6d8f200ea2fb20c30b003a8b4f433d2f4ed4c2658d5bc8ce2fef718059c9f7
pkEm:
23fb952571a14a25e3d678140cd0e5eb47a0961bb18afcf85896e5453c312e76
skEm:
ff4442ef24fbc3c1ff86375b0be1e77e88a0de1e79b30896d73411c5ff4c3518
ikmR:
f1d4a30a4cef8d6d4e3b016e6fd3799ea057db4f345472ed302a67ce1c20cdec
pkRm:
1632d5c2f71c2b38d0a8fcc359355200caa8b1ffdf28618080466c909cb69b2e
skRm:
fdea67cf831f1ca98d8e27b1f6abeb5b7745e9d35348b80fa407ff6958f9137e
ikmS:
94b020ce91d73fca4649006c7e7329a67b40c55e9e93cc907d282bbbff386f58
pkSm:
8b0c70873dc5aecb7f9ee4e62406a397b350e57012be45cf53b7105ae731790b
skSm:
dc4a146313cce60a278a5323d321f051c5707e9c45ba21a3479fecdf76fc69dd
enc:
23fb952571a14a25e3d678140cd0e5eb47a0961bb18afcf85896e5453c312e76
shared_secret:
2d6db4cf719dc7293fcbf3fa64690708e44e2bebc81f84608677958c0d4448a7
key_schedule_context: 02725611c9d98c07c03f60095cd32d400d8347d45ed670
97bbad50fc56da742d07cb6cffde367bb0565ba28bb02c90744a20f5ef37f3052352
6106f637abb05449
secret:
56c62333d9d9f7767f5b083fdfce0aa7e57e301b74029bb0cffa7331385f1dda
key: b062cb2c4dd4bca0ad7c7a12bbc341e6
base_nonce: a1bc314c1942ade7051ffed0
exporter_secret:
ee1a093e6e1c393c162ea98fdf20560c75909653550540a2700511b65c88c6f1

A.1.3.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: a1bc314c1942ade7051ffed0
ct: 5fd92cc9d46dbf8943e72a07e42f363ed5f721212cd90bcfd072bfd9f44e06b8
0fd17824947496e21b680c141b

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: a1bc314c1942ade7051ffed1
ct: d3736bb256c19bfa93d79e8f80b7971262cb7c887e35c26370cfed62254369a1
b52e3d505b79dd699f002bc8ed

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: a1bc314c1942ade7051ffed2
ct: 122175cfd5678e04894e4ff8789e85dd381df48dcaf970d52057df2c9acc3b12
1313a2bfeaa986050f82d93645

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: a1bc314c1942ade7051ffed4
ct: dae12318660cf963c7bcbef0f39d64de3bf178cf9e585e756654043cc5059873
bc8af190b72afc43d1e0135ada

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: a1bc314c1942ade7051ffe2f
ct: 55d53d85fe4d9e1e97903101eab0b4865ef20cef28765a47f840ff99625b7d69
dee927df1defa66a036fc58ff2

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: a1bc314c1942ade7051fffd0
ct: 42fa248a0e67ccca688f2b1d13ba4ba84755acf764bd797c8f7ba3b9b1dc3330
326f8d172fef6003c79ec72319

A.1.3.2. Exported Values
exporter_context:
L: 32
exported_value:
28c70088017d70c896a8420f04702c5a321d9cbf0279fba899b59e51bac72c85

exporter_context: 00
L: 32
exported_value:
25dfc004b0892be1888c3914977aa9c9bbaf2c7471708a49e1195af48a6f29ce

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
5a0131813abc9a522cad678eb6bafaabc43389934adb8097d23c5ff68059eb64

A.1.4. AuthPSK Setup Information

mode: 3
kem_id: 32
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
4303619085a20ebcf18edd22782952b8a7161e1dbae6e46e143a52a96127cf84
pkEm:
820818d3c23993492cc5623ab437a48a0a7ca3e9639c140fe1e33811eb844b7c
skEm:
14de82a5897b613616a00c39b87429df35bc2b426bcfd73febcb45e903490768
ikmR:
4b16221f3b269a88e207270b5e1de28cb01f847841b344b8314d6a622fe5ee90
pkRm:
1d11a3cd247ae48e901939659bd4d79b6b959e1f3e7d66663fbc9412dd4e0976
skRm:
cb29a95649dc5656c2d054c1aa0d3df0493155e9d5da6d7e344ed8b6a64a9423
ikmS:
62f77dcf5df0dd7eac54eac9f654f426d4161ec850cc65c54f8b65d2e0b4e345
pkSm:
2bfb2eb18fcad1af0e4f99142a1c474ae74e21b9425fc5c589382c69b50cc57e
skSm:
fc1c87d2f3832adb178b431fce2ac77c7ca2fd680f3406c77b5ecdf818b119f4
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc:
820818d3c23993492cc5623ab437a48a0a7ca3e9639c140fe1e33811eb844b7c
shared_secret:
f9d0e870aba28d04709b2680cb8185466c6a6ff1d6e9d1091d5bf5e10ce3a577
key_schedule_context: 03e78d5cf6190d275863411ff5edd0dece5d39fa48e04e
ec1ed9b71be34729d18ccb6cffde367bb0565ba28bb02c90744a20f5ef37f3052352
6106f637abb05449
secret:
5f96c55e4108c6691829aaabaa7d539c0b41d7c72aae94ae289752f056b6cec4
key: 1364ead92c47aa7becfa95203037b19a
base_nonce: 99d8b5c54669807e9fc70df1
exporter_secret:
f048d55eacbf60f9c6154bd4021774d1075ebf963c6adc71fa846f183ab2dde6

A.1.4.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 99d8b5c54669807e9fc70df1
ct: a84c64df1e11d8fd11450039d4fe64ff0c8a99fca0bd72c2d4c3e0400bc14a40
f27e45e141a24001697737533e

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 99d8b5c54669807e9fc70df0
ct: 4d19303b848f424fc3c3beca249b2c6de0a34083b8e909b6aa4c3688505c05ff
e0c8f57a0a4c5ab9da127435d9

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 99d8b5c54669807e9fc70df3
ct: 0c085a365fbfa63409943b00a3127abce6e45991bc653f182a80120868fc507e
9e4d5e37bcc384fc8f14153b24

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 99d8b5c54669807e9fc70df5
ct: 000a3cd3a3523bf7d9796830b1cd987e841a8bae6561ebb6791a3f0e34e89a4f
b539faeee3428b8bbc082d2c1a

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 99d8b5c54669807e9fc70d0e
ct: 576d39dd2d4cc77d1a14a51d5c5f9d5e77586c3d8d2ab33bdec6379e28ce5c50
2f0b1cbd09047cf9eb9269bb52

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 99d8b5c54669807e9fc70cf1
ct: 13239bab72e25e9fd5bb09695d23c90a24595158b99127505c8a9ff9f127e0d6
57f71af59d67d4f4971da028f9

A.1.4.2. Exported Values
exporter_context:
L: 32
exported_value:
08f7e20644bb9b8af54ad66d2067457c5f9fcb2a23d9f6cb4445c0797b330067

exporter_context: 00
L: 32
exported_value:
52e51ff7d436557ced5265ff8b94ce69cf7583f49cdb374e6aad801fc063b010

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
a30c20370c026bbea4dca51cb63761695132d342bae33a6a11527d3e7679436d

A.2. DHKEM(X25519, HKDF-SHA256), HKDF-SHA256, ChaCha20Poly1305

A.2.1. Base Setup Information

mode: 0
kem_id: 32
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
909a9b35d3dc4713a5e72a4da274b55d3d3821a37e5d099e74a647db583a904b
pkEm:
1afa08d3dec047a643885163f1180476fa7ddb54c6a8029ea33f95796bf2ac4a
skEm:
f4ec9b33b792c372c1d2c2063507b684ef925b8c75a42dbcbf57d63ccd381600
ikmR:
1ac01f181fdf9f352797655161c58b75c656a6cc2716dcb66372da835542e1df
pkRm:
4310ee97d88cc1f088a5576c77ab0cf5c3ac797f3d95139c6c84b5429c59662a
skRm:
8057991eef8f1f1af18f4a9491d16a1ce333f695d4db8e38da75975c4478e0fb
enc:
1afa08d3dec047a643885163f1180476fa7ddb54c6a8029ea33f95796bf2ac4a
shared_secret:
0bbe78490412b4bbea4812666f7916932b828bba79942424abb65244930d69a7
key_schedule_context: 00431df6cd95e11ff49d7013563baf7f11588c75a6611e
e2a4404a49306ae4cfc5b69c5718a60cc5876c358d3f7fc31ddb598503f67be58ea1
e798c0bb19eb9796
secret:
5b9cd775e64b437a2335cf499361b2e0d5e444d5cb41a8a53336d8fe402282c6
key:
ad2744de8e17f4ebba575b3f5f5a8fa1f69c2a07f6e7500bc60ca6e3e3ec1c91
base_nonce: 5c4d98150661b848853b547f
exporter_secret:
a3b010d4994890e2c6968a36f64470d3c824c8f5029942feb11e7a74b2921922

A.2.1.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 5c4d98150661b848853b547f
ct: 1c5250d8034ec2b784ba2cfd69dbdb8af406cfe3ff938e131f0def8c8b60b4db
21993c62ce81883d2dd1b51a28

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 5c4d98150661b848853b547e
ct: 6b53c051e4199c518de79594e1c4ab18b96f081549d45ce015be002090bb119e
85285337cc95ba5f59992dc98c

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 5c4d98150661b848853b547d
ct: 71146bd6795ccc9c49ce25dda112a48f202ad220559502cef1f34271e0cb4b02
b4f10ecac6f48c32f878fae86b

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 5c4d98150661b848853b547b
ct: 63357a2aa291f5a4e5f27db6baa2af8cf77427c7c1a909e0b37214dd47db122b
b153495ff0b02e9e54a50dbe16

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 5c4d98150661b848853b5480
ct: 18ab939d63ddec9f6ac2b60d61d36a7375d2070c9b683861110757062c52b888
0a5f6b3936da9cd6c23ef2a95c

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 5c4d98150661b848853b557f
ct: 7a4a13e9ef23978e2c520fd4d2e757514ae160cd0cd05e556ef692370ca53076
214c0c40d4c728d6ed9e727a5b

A.2.1.2. Exported Values
exporter_context:
L: 32
exported_value:
4bbd6243b8bb54cec311fac9df81841b6fd61f56538a775e7c80a9f40160606e

exporter_context: 00
L: 32
exported_value:
8c1df14732580e5501b00f82b10a1647b40713191b7c1240ac80e2b68808ba69

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
5acb09211139c43b3090489a9da433e8a30ee7188ba8b0a9a1ccf0c229283e53

A.2.2. PSK Setup Information

mode: 1
kem_id: 32
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
35706a0b09fb26fb45c39c2f5079c709c7cf98e43afa973f14d88ece7e29c2e3
pkEm:
2261299c3f40a9afc133b969a97f05e95be2c514e54f3de26cbe5644ac735b04
skEm:
0c35fdf49df7aa01cd330049332c40411ebba36e0c718ebc3edf5845795f6321
ikmR:
26b923eade72941c8a85b09986cdfa3f1296852261adedc52d58d2930269812b
pkRm:
13640af826b722fc04feaa4de2f28fbd5ecc03623b317834e7ff4120dbe73062
skRm:
77d114e0212be51cb1d76fa99dd41cfd4d0166b08caa09074430a6c59ef17879
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc:
2261299c3f40a9afc133b969a97f05e95be2c514e54f3de26cbe5644ac735b04
shared_secret:
4be079c5e77779d0215b3f689595d59e3e9b0455d55662d1f3666ec606e50ea7
key_schedule_context: 016870c4c76ca38ae43efbec0f2377d109499d7ce73f4a
9e1ec37f21d3d063b97cb69c5718a60cc5876c358d3f7fc31ddb598503f67be58ea1
e798c0bb19eb9796
secret:
16974354c497c9bd24c000ceed693779b604f1944975b18c442d373663f4a8cc
key:
600d2fdb0313a7e5c86a9ce9221cd95bed069862421744cfb4ab9d7203a9c019
base_nonce: 112e0465562045b7368653e7
exporter_secret:
73b506dc8b6b4269027f80b0362def5cbb57ee50eed0c2873dac9181f453c5ac

A.2.2.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 112e0465562045b7368653e7
ct: 4a177f9c0d6f15cfdf533fb65bf84aecdc6ab16b8b85b4cf65a370e07fc1d78d
28fb073214525276f4a89608ff

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 112e0465562045b7368653e6
ct: 5c3cabae2f0b3e124d8d864c116fd8f20f3f56fda988c3573b40b09997fd6c76
9e77c8eda6cda4f947f5b704a8

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 112e0465562045b7368653e5
ct: 14958900b44bdae9cbe5a528bf933c5c990dbb8e282e6e495adf8205d19da9eb
270e3a6f1e0613ab7e757962a4

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 112e0465562045b7368653e3
ct: c2a7bc09ddb853cf2effb6e8d058e346f7fe0fb3476528c80db6b698415c5f8c
50b68a9a355609e96d2117f8d3

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 112e0465562045b736865318
ct: 2414d0788e4bc39a59a26d7bd5d78e111c317d44c37bd5a4c2a1235f2ddc2085
c487d406490e75210c958724a7

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 112e0465562045b7368652e7
ct: c567ae1c3f0f75abe1dd9e4532b422600ed4a6e5b9484dafb1e43ab9f5fd662b
28c00e2e81d3cde955dae7e218

A.2.2.2. Exported Values
exporter_context:
L: 32
exported_value:
813c1bfc516c99076ae0f466671f0ba5ff244a41699f7b2417e4c59d46d39f40

exporter_context: 00
L: 32
exported_value:
2745cf3d5bb65c333658732954ee7af49eb895ce77f8022873a62a13c94cb4e1

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
ad40e3ae14f21c99bfdebc20ae14ab86f4ca2dc9a4799d200f43a25f99fa78ae

A.2.3. Auth Setup Information

mode: 2
kem_id: 32
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
938d3daa5a8904540bc24f48ae90eed3f4f7f11839560597b55e7c9598c996c0
pkEm:
f7674cc8cd7baa5872d1f33dbaffe3314239f6197ddf5ded1746760bfc847e0e
skEm:
c94619e1af28971c8fa7957192b7e62a71ca2dcdde0a7cc4a8a9e741d600ab13
ikmR:
64835d5ee64aa7aad57c6f2e4f758f7696617f8829e70bc9ac7a5ef95d1c756c
pkRm:
1a478716d63cb2e16786ee93004486dc151e988b34b475043d3e0175bdb01c44
skRm:
3ca22a6d1cda1bb9480949ec5329d3bf0b080ca4c45879c95eddb55c70b80b82
ikmS:
9d8f94537d5a3ddef71234c0baedfad4ca6861634d0b94c3007fed557ad17df6
pkSm:
f0f4f9e96c54aeed3f323de8534fffd7e0577e4ce269896716bcb95643c8712b
skSm:
2def0cb58ffcf83d1062dd085c8aceca7f4c0c3fd05912d847b61f3e54121f05
enc:
f7674cc8cd7baa5872d1f33dbaffe3314239f6197ddf5ded1746760bfc847e0e
shared_secret:
d2d67828c8bc9fa661cf15a31b3ebf1febe0cafef7abfaaca580aaf6d471e3eb
key_schedule_context: 02431df6cd95e11ff49d7013563baf7f11588c75a6611e
e2a4404a49306ae4cfc5b69c5718a60cc5876c358d3f7fc31ddb598503f67be58ea1
e798c0bb19eb9796
secret:
3022dfc0a81d6e09a2e6daeeb605bb1ebb9ac49535540d9a4c6560064a6c6da8
key:
b071fd1136680600eb447a845a967d35e9db20749cdf9ce098bcc4deef4b1356
base_nonce: d20577dff16d7cea2c4bf780
exporter_secret:
be2d93b82071318cdb88510037cf504344151f2f9b9da8ab48974d40a2251dd7

A.2.3.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: d20577dff16d7cea2c4bf780
ct: ab1a13c9d4f01a87ec3440dbd756e2677bd2ecf9df0ce7ed73869b98e00c09be
111cb9fdf077347aeb88e61bdf

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: d20577dff16d7cea2c4bf781
ct: 3265c7807ffff7fdace21659a2c6ccffee52a26d270c76468ed74202a65478bf
aedfff9c2b7634e24f10b71016

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: d20577dff16d7cea2c4bf782
ct: 3aadee86ad2a05081ea860033a9d09dbccb4acac2ded0891da40f51d4df19925
f7a767b076a5cbc9355c8fd35e

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: d20577dff16d7cea2c4bf784
ct: 502ecccd5c2be3506a081809cc58b43b94f77cbe37b8b31712d9e21c9e61aa69
46a8e922f54eae630f88eb8033

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: d20577dff16d7cea2c4bf77f
ct: 652e597ba20f3d9241cda61f33937298b1169e6adf72974bbe454297502eb4be
132e1c5064702fc165c2ddbde8

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: d20577dff16d7cea2c4bf680
ct: 3be14e8b3bbd1028cf2b7d0a691dbbeff71321e7dec92d3c2cfb30a0994ab246
af76168480285a60037b4ba13a

A.2.3.2. Exported Values
exporter_context:
L: 32
exported_value:
070cffafd89b67b7f0eeb800235303a223e6ff9d1e774dce8eac585c8688c872

exporter_context: 00
L: 32
exported_value:
2852e728568d40ddb0edde284d36a4359c56558bb2fb8837cd3d92e46a3a14a8

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
1df39dc5dd60edcbf5f9ae804e15ada66e885b28ed7929116f768369a3f950ee

A.2.4. AuthPSK Setup Information

mode: 3
kem_id: 32
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
49d6eac8c6c558c953a0a252929a818745bb08cd3d29e15f9f5db5eb2e7d4b84
pkEm:
656a2e00dc9990fd189e6e473459392df556e9a2758754a09db3f51179a3fc02
skEm:
5e6dd73e82b856339572b7245d3cbb073a7561c0bee52873490e305cbb710410
ikmR:
f3304ddcf15848488271f12b75ecaf72301faabf6ad283654a14c398832eb184
pkRm:
a5099431c35c491ec62ca91df1525d6349cb8aa170c51f9581f8627be6334851
skRm:
7b36a42822e75bf3362dfabbe474b3016236408becb83b859a6909e22803cb0c
ikmS:
20ade1d5203de1aadfb261c4700b6432e260d0d317be6ebbb8d7fffb1f86ad9d
pkSm:
3ac5bd4dd66ff9f2740bef0d6ccb66daa77bff7849d7895182b07fb74d087c45
skSm:
90761c5b0a7ef0985ed66687ad708b921d9803d51637c8d1cb72d03ed0f64418
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc:
656a2e00dc9990fd189e6e473459392df556e9a2758754a09db3f51179a3fc02
shared_secret:
86a6c0ed17714f11d2951747e660857a5fd7616c933ef03207808b7a7123fe67
key_schedule_context: 036870c4c76ca38ae43efbec0f2377d109499d7ce73f4a
9e1ec37f21d3d063b97cb69c5718a60cc5876c358d3f7fc31ddb598503f67be58ea1
e798c0bb19eb9796
secret:
22670daee17530c9564001d0a7e740e80d0bcc7ae15349f472fcc9e057cbc259
key:
49c7e6d7d2d257aded2a746fe6a9bf12d4de8007c4862b1fdffe8c35fb65054c
base_nonce: abac79931e8c1bcb8a23960a
exporter_secret:
7c6cc1bb98993cd93e2599322247a58fd41fdecd3db895fb4c5fd8d6bbe606b5

A.2.4.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: abac79931e8c1bcb8a23960a
ct: 9aa52e29274fc6172e38a4461361d2342585d3aeec67fb3b721ecd63f059577c
7fe886be0ede01456ebc67d597

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: abac79931e8c1bcb8a23960b
ct: 59460bacdbe7a920ef2806a74937d5a691d6d5062d7daafcad7db7e4d8c649ad
ffe575c1889c5c2e3a49af8e3e

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: abac79931e8c1bcb8a239608
ct: 5688ff6a03ba26ae936044a5c800f286fb5d1eccdd2a0f268f6ff9773b511693
18d1a1466bb36263415071db00

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: abac79931e8c1bcb8a23960e
ct: d936b7a01f5c7dc4c3dc04e322cc694684ee18dd71719196874e5235aed3cfb0
6cadcd3bc7da0877488d7c551d

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: abac79931e8c1bcb8a2396f5
ct: 4d4c462f7b9b637eaf1f4e15e325b7bc629c0af6e3073422c86064cc3c98cff8
7300f054fd56dd57dc34358beb

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: abac79931e8c1bcb8a23970a
ct: 9b7f84224922d2a9edd7b2c2057f3bcf3a547f17570575e626202e593bfdd99e
9878a1af9e41ded58c7fb77d2f

A.2.4.2. Exported Values
exporter_context:
L: 32
exported_value:
c23ebd4e7a0ad06a5dddf779f65004ce9481069ce0f0e6dd51a04539ddcbd5cd

exporter_context: 00
L: 32
exported_value:
ed7ff5ca40a3d84561067ebc8e01702bc36cf1eb99d42a92004642b9dfaadd37

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
d3bae066aa8da27d527d85c040f7dd6ccb60221c902ee36a82f70bcd62a60ee4

A.3. DHKEM(P-256, HKDF-SHA256), HKDF-SHA256, AES-128-GCM

A.3.1. Base Setup Information

mode: 0
kem_id: 16
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
4270e54ffd08d79d5928020af4686d8f6b7d35dbe470265f1f5aa22816ce860e
pkEm: 04a92719c6195d5085104f469a8b9814d5838ff72b60501e2c4466e5e67b32
5ac98536d7b61a1af4b78e5b7f951c0900be863c403ce65c9bfcb9382657222d18c4
skEm:
4995788ef4b9d6132b249ce59a77281493eb39af373d236a1fe415cb0c2d7beb
ikmR:
668b37171f1072f3cf12ea8a236a45df23fc13b82af3609ad1e354f6ef817550
pkRm: 04fe8c19ce0905191ebc298a9245792531f26f0cece2460639e8bc39cb7f70
6a826a779b4cf969b8a0e539c7f62fb3d30ad6aa8f80e30f1d128aafd68a2ce72ea0
skRm:
f3ce7fdae57e1a310d87f1ebbde6f328be0a99cdbcadf4d6589cf29de4b8ffd2
enc: 04a92719c6195d5085104f469a8b9814d5838ff72b60501e2c4466e5e67b325
ac98536d7b61a1af4b78e5b7f951c0900be863c403ce65c9bfcb9382657222d18c4
shared_secret:
c0d26aeab536609a572b07695d933b589dcf363ff9d93c93adea537aeabb8cb8
key_schedule_context: 00b88d4e6d91759e65e87c470e8b9141113e9ad5f0c8ce
efc1e088c82e6980500798e486f9c9c09c9b5c753ac72d6005de254c607d1b534ed1
1d493ae1c1d9ac85
secret:
2eb7b6bf138f6b5aff857414a058a3f1750054a9ba1f72c2cf0684a6f20b10e1
key: 868c066ef58aae6dc589b6cfdd18f97e
base_nonce: 4e0bc5018beba4bf004cca59
exporter_secret:
14ad94af484a7ad3ef40e9f3be99ecc6fa9036df9d4920548424df127ee0d99f

A.3.1.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 4e0bc5018beba4bf004cca59
ct: 5ad590bb8baa577f8619db35a36311226a896e7342a6d836d8b7bcd2f20b6c7f
9076ac232e3ab2523f39513434

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 4e0bc5018beba4bf004cca58
ct: fa6f037b47fc21826b610172ca9637e82d6e5801eb31cbd3748271affd4ecb06
646e0329cbdf3c3cd655b28e82

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 4e0bc5018beba4bf004cca5b
ct: 895cabfac50ce6c6eb02ffe6c048bf53b7f7be9a91fc559402cbc5b8dcaeb52b
2ccc93e466c28fb55fed7a7fec

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 4e0bc5018beba4bf004cca5d
ct: 8787491ee8df99bc99a246c4b3216d3d57ab5076e18fa27133f520703bc70ec9
99dd36ce042e44f0c3169a6a8f

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 4e0bc5018beba4bf004ccaa6
ct: 2ad71c85bf3f45c6eca301426289854b31448bcf8a8ccb1deef3ebd87f60848a
a53c538c30a4dac71d619ee2cd

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 4e0bc5018beba4bf004ccb59
ct: 10f179686aa2caec1758c8e554513f16472bd0a11e2a907dde0b212cbe87d74f
367f8ffe5e41cd3e9962a6afb2

A.3.1.2. Exported Values
exporter_context:
L: 32
exported_value:
5e9bc3d236e1911d95e65b576a8a86d478fb827e8bdfe77b741b289890490d4d

exporter_context: 00
L: 32
exported_value:
6cff87658931bda83dc857e6353efe4987a201b849658d9b047aab4cf216e796

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
d8f1ea7942adbba7412c6d431c62d01371ea476b823eb697e1f6e6cae1dab85a

A.3.2. PSK Setup Information

mode: 1
kem_id: 16
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
2afa611d8b1a7b321c761b483b6a053579afa4f767450d3ad0f84a39fda587a6
pkEm: 04305d35563527bce037773d79a13deabed0e8e7cde61eecee403496959e89
e4d0ca701726696d1485137ccb5341b3c1c7aaee90a4a02449725e744b1193b53b5f
skEm:
57427244f6cc016cddf1c19c8973b4060aa13579b4c067fd5d93a5d74e32a90f
ikmR:
d42ef874c1913d9568c9405407c805baddaffd0898a00f1e84e154fa787b2429
pkRm: 040d97419ae99f13007a93996648b2674e5260a8ebd2b822e84899cd52d874
46ea394ca76223b76639eccdf00e1967db10ade37db4e7db476261fcc8df97c5ffd1
skRm:
438d8bcef33b89e0e9ae5eb0957c353c25a94584b0dd59c991372a75b43cb661
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 04305d35563527bce037773d79a13deabed0e8e7cde61eecee403496959e89e
4d0ca701726696d1485137ccb5341b3c1c7aaee90a4a02449725e744b1193b53b5f
shared_secret:
2e783ad86a1beae03b5749e0f3f5e9bb19cb7eb382f2fb2dd64c99f15ae0661b
key_schedule_context: 01b873cdf2dff4c1434988053b7a775e980dd2039ea24f
950b26b056ccedcb933198e486f9c9c09c9b5c753ac72d6005de254c607d1b534ed1
1d493ae1c1d9ac85
secret:
f2f534e55931c62eeb2188c1f53450354a725183937e68c85e68d6b267504d26
key: 55d9eb9d26911d4c514a990fa8d57048
base_nonce: b595dc6b2d7e2ed23af529b1
exporter_secret:
895a723a1eab809804973a53c0ee18ece29b25a7555a4808277ad2651d66d705

A.3.2.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: b595dc6b2d7e2ed23af529b1
ct: 90c4deb5b75318530194e4bb62f890b019b1397bbf9d0d6eb918890e1fb2be1a
c2603193b60a49c2126b75d0eb

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: b595dc6b2d7e2ed23af529b0
ct: 9e223384a3620f4a75b5a52f546b7262d8826dea18db5a365feb8b997180b22d
72dc1287f7089a1073a7102c27

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: b595dc6b2d7e2ed23af529b3
ct: adf9f6000773035023be7d415e13f84c1cb32a24339a32eb81df02be9ddc6abc
880dd81cceb7c1d0c7781465b2

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: b595dc6b2d7e2ed23af529b5
ct: 1f4cc9b7013d65511b1f69c050b7bd8bbd5a5c16ece82b238fec4f30ba2400e7
ca8ee482ac5253cffb5c3dc577

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: b595dc6b2d7e2ed23af5294e
ct: cdc541253111ed7a424eea5134dc14fc5e8293ab3b537668b8656789628e4589
4e5bb873c968e3b7cdcbb654a4

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: b595dc6b2d7e2ed23af528b1
ct: faf985208858b1253b97b60aecd28bc18737b58d1242370e7703ec33b73a4c31
a1afee300e349adef9015bbbfd

A.3.2.2. Exported Values
exporter_context:
L: 32
exported_value:
a115a59bf4dd8dc49332d6a0093af8efca1bcbfd3627d850173f5c4a55d0c185

exporter_context: 00
L: 32
exported_value:
4517eaede0669b16aac7c92d5762dd459c301fa10e02237cd5aeb9be969430c4

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
164e02144d44b607a7722e58b0f4156e67c0c2874d74cf71da6ca48a4cbdc5e0

A.3.3. Auth Setup Information

mode: 2
kem_id: 16
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
798d82a8d9ea19dbc7f2c6dfa54e8a6706f7cdc119db0813dacf8440ab37c857
pkEm: 042224f3ea800f7ec55c03f29fc9865f6ee27004f818fcbdc6dc68932c1e52
e15b79e264a98f2c535ef06745f3d308624414153b22c7332bc1e691cb4af4d53454
skEm:
6b8de0873aed0c1b2d09b8c7ed54cbf24fdf1dfc7a47fa501f918810642d7b91
ikmR:
7bc93bde8890d1fb55220e7f3b0c107ae7e6eda35ca4040bb6651284bf0747ee
pkRm: 04423e363e1cd54ce7b7573110ac121399acbc9ed815fae03b72ffbd4c18b0
1836835c5a09513f28fc971b7266cfde2e96afe84bb0f266920e82c4f53b36e1a78d
skRm:
d929ab4be2e59f6954d6bedd93e638f02d4046cef21115b00cdda2acb2a4440e
ikmS:
874baa0dcf93595a24a45a7f042e0d22d368747daaa7e19f80a802af19204ba8
pkSm: 04a817a0902bf28e036d66add5d544cc3a0457eab150f104285df1e293b5c1
0eef8651213e43d9cd9086c80b309df22cf37609f58c1127f7607e85f210b2804f73
skSm:
1120ac99fb1fccc1e8230502d245719d1b217fe20505c7648795139d177f0de9
enc: 042224f3ea800f7ec55c03f29fc9865f6ee27004f818fcbdc6dc68932c1e52e
15b79e264a98f2c535ef06745f3d308624414153b22c7332bc1e691cb4af4d53454
shared_secret:
d4aea336439aadf68f9348880aa358086f1480e7c167b6ef15453ba69b94b44f
key_schedule_context: 02b88d4e6d91759e65e87c470e8b9141113e9ad5f0c8ce
efc1e088c82e6980500798e486f9c9c09c9b5c753ac72d6005de254c607d1b534ed1
1d493ae1c1d9ac85
secret:
fd0a93c7c6f6b1b0dd6a822d7b16f6c61c83d98ad88426df4613c3581a2319f1
key: 19aa8472b3fdc530392b0e54ca17c0f5
base_nonce: b390052d26b67a5b8a8fcaa4
exporter_secret:
f152759972660eb0e1db880835abd5de1c39c8e9cd269f6f082ed80e28acb164

A.3.3.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: b390052d26b67a5b8a8fcaa4
ct: 82ffc8c44760db691a07c5627e5fc2c08e7a86979ee79b494a17cc3405446ac2
bdb8f265db4a099ed3289ffe19

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: b390052d26b67a5b8a8fcaa5
ct: b0a705a54532c7b4f5907de51c13dffe1e08d55ee9ba59686114b05945494d96
725b239468f1229e3966aa1250

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: b390052d26b67a5b8a8fcaa6
ct: 8dc805680e3271a801790833ed74473710157645584f06d1b53ad439078d880b
23e25256663178271c80ee8b7c

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: b390052d26b67a5b8a8fcaa0
ct: 04c8f7aae1584b61aa5816382cb0b834a5d744f420e6dffb5ddcec633a21b8b3
472820930c1ea9258b035937a2

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: b390052d26b67a5b8a8fca5b
ct: 4a319462eaedee37248b4d985f64f4f863d31913fe9e30b6e13136053b69fe5d
70853c84c60a84bb5495d5a678

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: b390052d26b67a5b8a8fcba4
ct: 28e874512f8940fafc7d06135e7589f6b4198bc0f3a1c64702e72c9e6abaf9f0
5cb0d2f11b03a517898815c934
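
The per-message nonces listed above follow directly from the key schedule output: the nonce for sequence number seq is the base_nonce XORed with seq encoded as a 12-byte big-endian integer (so seq 255 flips the low byte from a4 to 5b, and seq 256 carries into the next byte, giving ...cba4 above). A minimal Python sketch, using the third-party cryptography package (an assumption, not part of this document), that should reproduce the seq 0 ciphertext of A.3.3.1 from the key and base_nonce listed in A.3.3:

  from cryptography.hazmat.primitives.ciphers.aead import AESGCM

  def compute_nonce(base_nonce: bytes, seq: int) -> bytes:
      # HPKE ComputeNonce: XOR the big-endian sequence number into base_nonce.
      seq_bytes = seq.to_bytes(len(base_nonce), "big")
      return bytes(b ^ s for b, s in zip(base_nonce, seq_bytes))

  # Values copied from the A.3.3 / A.3.3.1 vectors above.
  key = bytes.fromhex("19aa8472b3fdc530392b0e54ca17c0f5")
  base_nonce = bytes.fromhex("b390052d26b67a5b8a8fcaa4")
  pt = bytes.fromhex("4265617574792069732074727574682c20747275746820626561757479")
  aad = bytes.fromhex("436f756e742d30")  # "Count-0"
  ct = AESGCM(key).encrypt(compute_nonce(base_nonce, 0), pt, aad)
  assert ct.hex() == ("82ffc8c44760db691a07c5627e5fc2c08e7a86979ee79b494a17cc3405446ac2"
                      "bdb8f265db4a099ed3289ffe19")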

A.3.3.2. Exported Values
exporter_context:
L: 32
exported_value:
837e49c3ff629250c8d80d3c3fb957725ed481e59e2feb57afd9fe9a8c7c4497

exporter_context: 00
L: 32
exported_value:
594213f9018d614b82007a7021c3135bda7b380da4acd9ab27165c508640dbda

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
14fe634f95ca0d86e15247cca7de7ba9b73c9b9deb6437e1c832daf7291b79d5
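
These exported values exercise the context's Export() interface: Export(exporter_context, L) = LabeledExpand(exporter_secret, "sec", exporter_context, L), with the suite_id ("HPKE" plus the two-byte kem_id, kdf_id, and aead_id) bound into the label. A minimal self-contained Python sketch (an illustrative reconstruction, not normative text) that should reproduce the "TestContext" value above from the A.3.3 exporter_secret:

  import hashlib
  import hmac

  def hkdf_expand_sha256(prk: bytes, info: bytes, length: int) -> bytes:
      # HKDF-Expand (RFC 5869); a single block suffices for length <= 32.
      okm, block, counter = b"", b"", 1
      while len(okm) < length:
          block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
          okm += block
          counter += 1
      return okm[:length]

  def labeled_expand(prk: bytes, label: bytes, info: bytes, length: int, suite_id: bytes) -> bytes:
      labeled_info = length.to_bytes(2, "big") + b"HPKE-v1" + suite_id + label + info
      return hkdf_expand_sha256(prk, labeled_info, length)

  # suite_id for A.3: kem_id 16, kdf_id 1, aead_id 1.
  suite_id = b"HPKE" + (16).to_bytes(2, "big") + (1).to_bytes(2, "big") + (1).to_bytes(2, "big")
  exporter_secret = bytes.fromhex(
      "f152759972660eb0e1db880835abd5de1c39c8e9cd269f6f082ed80e28acb164")
  exported = labeled_expand(exporter_secret, b"sec", b"TestContext", 32, suite_id)
  assert exported.hex() == "14fe634f95ca0d86e15247cca7de7ba9b73c9b9deb6437e1c832daf7291b79d5"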

A.3.4. AuthPSK Setup Information

mode: 3
kem_id: 16
kdf_id: 1
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
3c1fceb477ec954c8d58ef3249e4bb4c38241b5925b95f7486e4d9f1d0d35fbb
pkEm: 046a1de3fc26a3d43f4e4ba97dbe24f7e99181136129c48fbe872d4743e2b1
31357ed4f29a7b317dc22509c7b00991ae990bf65f8b236700c82ab7c11a84511401
skEm:
36f771e411cf9cf72f0701ef2b991ce9743645b472e835fe234fb4d6eb2ff5a0
ikmR:
abcc2da5b3fa81d8aabd91f7f800a8ccf60ec37b1b585a5d1d1ac77f258b6cca
pkRm: 04d824d7e897897c172ac8a9e862e4bd820133b8d090a9b188b8233a64dfbc
5f725aa0aa52c8462ab7c9188f1c4872f0c99087a867e8a773a13df48a627058e1b3
skRm:
bdf4e2e587afdf0930644a0c45053889ebcadeca662d7c755a353d5b4e2a8394
ikmS:
6262031f040a9db853edd6f91d2272596eabbc78a2ed2bd643f770ecd0f19b82
pkSm: 049f158c750e55d8d5ad13ede66cf6e79801634b7acadcad72044eac2ae1d0
480069133d6488bf73863fa988c4ba8bde1c2e948b761274802b4d8012af4f13af9e
skSm:
b0ed8721db6185435898650f7a677affce925aba7975a582653c4cb13c72d240
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 046a1de3fc26a3d43f4e4ba97dbe24f7e99181136129c48fbe872d4743e2b13
1357ed4f29a7b317dc22509c7b00991ae990bf65f8b236700c82ab7c11a84511401
shared_secret:
d4c27698391db126f1612d9e91a767f10b9b19aa17e1695549203f0df7d9aebe
key_schedule_context: 03b873cdf2dff4c1434988053b7a775e980dd2039ea24f
950b26b056ccedcb933198e486f9c9c09c9b5c753ac72d6005de254c607d1b534ed1
1d493ae1c1d9ac85
secret:
3bf9d4c7955da2740414e73081fa74d6f6f2b4b9645d0685219813ce99a2f270
key: 4d567121d67fae1227d90e11585988fb
base_nonce: 67c9d05330ca21e5116ecda6
exporter_secret:
3f479020ae186788e4dfd4a42a21d24f3faabb224dd4f91c2b2e5e9524ca27b2

A.3.4.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 67c9d05330ca21e5116ecda6
ct: b9f36d58d9eb101629a3e5a7b63d2ee4af42b3644209ab37e0a272d44365407d
b8e655c72e4fa46f4ff81b9246

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 67c9d05330ca21e5116ecda7
ct: 51788c4e5d56276771032749d015d3eea651af0c7bb8e3da669effffed299ea1
f641df621af65579c10fc09736

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 67c9d05330ca21e5116ecda4
ct: 3b5a2be002e7b29927f06442947e1cf709b9f8508b03823127387223d7127034
71c266efc355f1bc2036f3027c

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 67c9d05330ca21e5116ecda2
ct: 8ddbf1242fe5c7d61e1675496f3bfdb4d90205b3dfbc1b12aab41395d71a8211
8e095c484103107cf4face5123

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 67c9d05330ca21e5116ecd59
ct: 6de25ceadeaec572fbaa25eda2558b73c383fe55106abaec24d518ef6724a7ce
698f83ecdc53e640fe214d2f42

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 67c9d05330ca21e5116ecca6
ct: f380e19d291e12c5e378b51feb5cd50f6d00df6cb2af8393794c4df342126c2e
29633fe7e8ce49587531affd4d

A.3.4.2. Exported Values
exporter_context:
L: 32
exported_value:
595ce0eff405d4b3bb1d08308d70a4e77226ce11766e0a94c4fdb5d90025c978

exporter_context: 00
L: 32
exported_value:
110472ee0ae328f57ef7332a9886a1992d2c45b9b8d5abc9424ff68630f7d38d

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
18ee4d001a9d83a4c67e76f88dd747766576cac438723bad0700a910a4d717e6

A.4. DHKEM(P-256, HKDF-SHA256), HKDF-SHA512, AES-128-GCM

A.4.1. Base Setup Information

mode: 0
kem_id: 16
kdf_id: 3
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
4ab11a9dd78c39668f7038f921ffc0993b368171d3ddde8031501ee1e08c4c9a
pkEm: 0493ed86735bdfb978cc055c98b45695ad7ce61ce748f4dd63c525a3b8d53a
15565c6897888070070c1579db1f86aaa56deb8297e64db7e8924e72866f9a472580
skEm:
2292bf14bb6e15b8c81a0f45b7a6e93e32d830e48cca702e0affcfb4d07e1b5c
ikmR:
ea9ff7cc5b2705b188841c7ace169290ff312a9cb31467784ca92d7a2e6e1be8
pkRm: 04085aa5b665dc3826f9650ccbcc471be268c8ada866422f739e2d531d4a88
18a9466bc6b449357096232919ec4fe9070ccbac4aac30f4a1a53efcf7af90610edd
skRm:
3ac8530ad1b01885960fab38cf3cdc4f7aef121eaa239f222623614b4079fb38
enc: 0493ed86735bdfb978cc055c98b45695ad7ce61ce748f4dd63c525a3b8d53a1
5565c6897888070070c1579db1f86aaa56deb8297e64db7e8924e72866f9a472580
shared_secret:
02f584736390fc93f5b4ad039826a3fa08e9911bd1215a3db8e8791ba533cafd
key_schedule_context: 005b8a3617af7789ee716e7911c7e77f84cdc4cc46e60f
b7e19e4059f9aeadc00585e26874d1ddde76e551a7679cd47168c466f6e1f705cc93
74c192778a34fcd5ca221d77e229a9d11b654de7942d685069c633b2362ce3b3d8ea
4891c9a2a87a4eb7cdb289ba5e2ecbf8cd2c8498bb4a383dc021454d70d46fcbbad1
252ef4f9
secret: 0c7acdab61693f936c4c1256c78e7be30eebfe466812f9cc49f0b58dc970
328dfc03ea359be0250a471b1635a193d2dfa8cb23c90aa2e25025b892a725353eeb
key: 090ca96e5f8aa02b69fac360da50ddf9
base_nonce: 9c995e621bf9a20c5ca45546
exporter_secret: 4a7abb2ac43e6553f129b2c5750a7e82d149a76ed56dc342d7b
ca61e26d494f4855dff0d0165f27ce57756f7f16baca006539bb8e4518987ba61048
0ac03efa8

A.4.1.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 9c995e621bf9a20c5ca45546
ct: d3cf4984931484a080f74c1bb2a6782700dc1fef9abe8442e44a6f09044c8890
7200b332003543754eb51917ba

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 9c995e621bf9a20c5ca45547
ct: d14414555a47269dfead9fbf26abb303365e40709a4ed16eaefe1f2070f1ddeb
1bdd94d9e41186f124e0acc62d

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 9c995e621bf9a20c5ca45544
ct: 9bba136cade5c4069707ba91a61932e2cbedda2d9c7bdc33515aa01dd0e0f7e9
d3579bf4016dec37da4aafa800

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 9c995e621bf9a20c5ca45542
ct: a531c0655342be013bf32112951f8df1da643602f1866749519f5dcb09cc6843
2579de305a77e6864e862a7600

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 9c995e621bf9a20c5ca455b9
ct: be5da649469efbad0fb950366a82a73fefeda5f652ec7d3731fac6c4ffa21a70
04d2ab8a04e13621bd3629547d

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 9c995e621bf9a20c5ca45446
ct: 62092672f5328a0dde095e57435edf7457ace60b26ee44c9291110ec135cb0e1
4b85594e4fea11247d937deb62

A.4.1.2. Exported Values
exporter_context:
L: 32
exported_value:
a32186b8946f61aeead1c093fe614945f85833b165b28c46bf271abf16b57208

exporter_context: 00
L: 32
exported_value:
84998b304a0ea2f11809398755f0abd5f9d2c141d1822def79dd15c194803c2a

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
93fb9411430b2cfa2cf0bed448c46922a5be9beff20e2e621df7e4655852edbc

A.4.2. PSK Setup Information

mode: 1
kem_id: 16
kdf_id: 3
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
c11d883d6587f911d2ddbc2a0859d5b42fb13bf2c8e89ef408a25564893856f5
pkEm: 04a307934180ad5287f95525fe5bc6244285d7273c15e061f0f2efb211c350
57f3079f6e0abae200992610b25f48b63aacfcb669106ddee8aa023feed301901371
skEm:
a5901ff7d6931959c2755382ea40a4869b1dec3694ed3b009dda2d77dd488f18
ikmR:
75bfc2a3a3541170a54c0b06444e358d0ee2b4fb78a401fd399a47a33723b700
pkRm: 043f5266fba0742db649e1043102b8a5afd114465156719cea90373229aabd
d84d7f45dabfc1f55664b888a7e86d594853a6cccdc9b189b57839cbbe3b90b55873
skRm:
bc6f0b5e22429e5ff47d5969003f3cae0f4fec50e23602e880038364f33b8522
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 04a307934180ad5287f95525fe5bc6244285d7273c15e061f0f2efb211c3505
7f3079f6e0abae200992610b25f48b63aacfcb669106ddee8aa023feed301901371
shared_secret:
2912aacc6eaebd71ff715ea50f6ef3a6637856b2a4c58ea61e0c3fc159e3bc16
key_schedule_context: 01713f73042575cebfd132f0cc4338523f8eae95c80a74
9f7cf3eb9436ff1c612ca62c37df27ca46d2cc162445a92c5f5fdc57bcde129ca7b1
f284b0c12297c037ca221d77e229a9d11b654de7942d685069c633b2362ce3b3d8ea
4891c9a2a87a4eb7cdb289ba5e2ecbf8cd2c8498bb4a383dc021454d70d46fcbbad1
252ef4f9
secret: ff2051d2128d5f3078de867143e076262ce1d0aecafc3fff3d607f1eaff0
5345c7d5ffcb3202cdecb3d1a2f7da20592a237747b6e855390cbe2109d3e6ac70c2
key: 0b910ba8d9cfa17e5f50c211cb32839a
base_nonce: 0c29e714eb52de5b7415a1b7
exporter_secret: 50c0a182b6f94b4c0bd955c4aa20df01f282cc12c43065a0812
fe4d4352790171ed2b2c4756ad7f5a730ba336c8f1edd0089d8331192058c385bae3
9c7cc8b57

A.4.2.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 0c29e714eb52de5b7415a1b7
ct: 57624b6e320d4aba0afd11f548780772932f502e2ba2a8068676b2a0d3b5129a
45b9faa88de39e8306da41d4cc

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 0c29e714eb52de5b7415a1b6
ct: 159d6b4c24bacaf2f5049b7863536d8f3ffede76302dace42080820fa51925d4
e1c72a64f87b14291a3057e00a

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 0c29e714eb52de5b7415a1b5
ct: bd24140859c99bf0055075e9c460032581dd1726d52cf980d308e9b20083ca62
e700b17892bcf7fa82bac751d0

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 0c29e714eb52de5b7415a1b3
ct: 93ddd55f82e9aaaa3cfc06840575f09d80160b20538125c2549932977d1238dd
e8126a4a91118faf8632f62cb8

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 0c29e714eb52de5b7415a148
ct: 377a98a3c34bf716581b05a6b3fdc257f245856384d5f2241c8840571c52f5c8
5c21138a4a81655edab8fe227d

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 0c29e714eb52de5b7415a0b7
ct: cc161f5a179831d456d119d2f2c19a6817289c75d1c61cd37ac8a450acd9efba
02e0ac00d128c17855931ff69a

A.4.2.2. Exported Values
exporter_context:
L: 32
exported_value:
8158bea21a6700d37022bb7802866edca30ebf2078273757b656ef7fc2e428cf

exporter_context: 00
L: 32
exported_value:
6a348ba6e0e72bb3ef22479214a139ef8dac57be34509a61087a12565473da8d

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
2f6d4f7a18ec48de1ef4469f596aada4afdf6d79b037ed3c07e0118f8723bffc

A.4.3. Auth Setup Information

mode: 2
kem_id: 16
kdf_id: 3
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
6bb031aa9197562da0b44e737db2b9e61f6c3ea1138c37de28fc37ac29bc7350
pkEm: 04fec59fa9f76f5d0f6c1660bb179cb314ed97953c53a60ab38f8e6ace60fd
59178084d0dd66e0f79172992d4ddb2e91172ce24949bcebfff158dcc417f2c6e9c6
skEm:
93cddd5288e7ef4884c8fe321d075df01501b993ff49ffab8184116f39b3c655
ikmR:
649a3f92edbb7a2516a0ade0b7dccc58a37240c4ba06f9726a952227b4adf6ff
pkRm: 04378bad519aab406e04d0e5608bcca809c02d6afd2272d4dd03e9357bd0ee
e8adf84c8deba3155c9cf9506d1d4c8bfefe3cf033a75716cc3cc07295100ec96276
skRm:
1ea4484be482bf25fdb2ed39e6a02ed9156b3e57dfb18dff82e4a048de990236
ikmS:
4d79b8691aab55a7265e8490a04bb3860ed64dece90953ad0dc43a6ea59b4bf2
pkSm: 0404d3c1f9fca22eb4a6d326125f0814c35593b1da8ea0d11a640730b215a2
59b9b98a34ad17e21617d19fe1d4fa39a4828bfdb306b729ec51c543caca3b2d9529
skSm:
02b266d66919f7b08f42ae0e7d97af4ca98b2dae3043bb7e0740ccadc1957579
enc: 04fec59fa9f76f5d0f6c1660bb179cb314ed97953c53a60ab38f8e6ace60fd5
9178084d0dd66e0f79172992d4ddb2e91172ce24949bcebfff158dcc417f2c6e9c6
shared_secret:
1ed49f6d7ada333d171cd63861a1cb700a1ec4236755a9cd5f9f8f67a2f8e7b3
key_schedule_context: 025b8a3617af7789ee716e7911c7e77f84cdc4cc46e60f
b7e19e4059f9aeadc00585e26874d1ddde76e551a7679cd47168c466f6e1f705cc93
74c192778a34fcd5ca221d77e229a9d11b654de7942d685069c633b2362ce3b3d8ea
4891c9a2a87a4eb7cdb289ba5e2ecbf8cd2c8498bb4a383dc021454d70d46fcbbad1
252ef4f9
secret: 9c846ba81ddbbd57bc26d99da6cf7ab956bb735ecd47fe21ed14241c7079
1b7484c1d06663d21a5d97bf1be70d56ab727f650c4f859c5ed3f71f8928b3c082dd
key: 9d4b1c83129f3de6db95faf3d539dcf1
base_nonce: ea4fd7a485ee5f1f4b62c1b7
exporter_secret: ca2410672369aae1afd6c2639f4fe34ca36d35410c090608d29
24f60def17f910d7928575434d7f991b1f19d3e8358b8278ff59ced0d5eed4774cec
72e12766e

A.4.3.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: ea4fd7a485ee5f1f4b62c1b7
ct: 2480179d880b5f458154b8bfe3c7e8732332de84aabf06fc440f6b31f169e154
157fa9eb44f2fa4d7b38a9236e

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: ea4fd7a485ee5f1f4b62c1b6
ct: 10cd81e3a816d29942b602a92884348171a31cbd0f042c3057c65cd93c540943
a5b05115bd520c09281061935b

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: ea4fd7a485ee5f1f4b62c1b5
ct: 920743a88d8cf6a09e1a3098e8be8edd09db136e9d543f215924043af8c7410f
68ce6aa64fd2b1a176e7f6b3fd

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: ea4fd7a485ee5f1f4b62c1b3
ct: 6b11380fcc708fc8589effb5b5e0394cbd441fa5e240b5500522150ca8265d65
ff55479405af936e2349119dcd

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: ea4fd7a485ee5f1f4b62c148
ct: d084eca50e7554bb97ba34c4482dfe32c9a2b7f3ab009c2d1b68ecbf97bee2d2
8cd94b6c829b96361f2701772d

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: ea4fd7a485ee5f1f4b62c0b7
ct: 247da592cc4ce834a94de2c79f5730ee49342470a021e4a4bc2bb77c53b17413
e94d94f57b4fdaedcf97cfe7b1

A.4.3.2. Exported Values
exporter_context:
L: 32
exported_value:
f03fbc82f321a0ab4840e487cb75d07aafd8e6f68485e4f7ff72b2f55ff24ad6

exporter_context: 00
L: 32
exported_value:
1ce0cadec0a8f060f4b5070c8f8888dcdfefc2e35819df0cd559928a11ff0891

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
70c405c707102fd0041ea716090753be47d68d238b111d542846bd0d84ba907c

A.4.4. AuthPSK Setup Information

mode: 3
kem_id: 16
kdf_id: 3
aead_id: 1
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
37ae06a521cd555648c928d7af58ad2aa4a85e34b8cabd069e94ad55ab872cc8
pkEm: 04801740f4b1b35823f7fb2930eac2efc8c4893f34ba111c0bb976e3c7d5dc
0aef5a7ef0bf4057949a140285f774f1efc53b3860936b92279a11b68395d898d138
skEm:
778f2254ae5d661d5c7fca8c4a7495a25bd13f26258e459159f3899df0de76c1
ikmR:
7466024b7e2d2366c3914d7833718f13afb9e3e45bcfbb510594d614ddd9b4e7
pkRm: 04a4ca7af2fc2cce48edbf2f1700983e927743a4e85bb5035ad562043e25d9
a111cbf6f7385fac55edc5c9d2ca6ed351a5643de95c36748e11dbec98730f4d43e9
skRm:
00510a70fde67af487c093234fc4215c1cdec09579c4b30cc8e48cb530414d0e
ikmS:
ee27aaf99bf5cd8398e9de88ac09a82ac22cdb8d0905ab05c0f5fa12ba1709f3
pkSm: 04b59a4157a9720eb749c95f842a5e3e8acdccbe834426d405509ac3191e23
f2165b5bb1f07a6240dd567703ae75e13182ee0f69fc102145cdb5abf681ff126d60
skSm:
d743b20821e6326f7a26684a4beed7088b35e392114480ca9f6c325079dcf10b
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 04801740f4b1b35823f7fb2930eac2efc8c4893f34ba111c0bb976e3c7d5dc0
aef5a7ef0bf4057949a140285f774f1efc53b3860936b92279a11b68395d898d138
shared_secret:
02bee8be0dda755846115db45071c0cf59c25722e015bde1c124de849c0fea52
key_schedule_context: 03713f73042575cebfd132f0cc4338523f8eae95c80a74
9f7cf3eb9436ff1c612ca62c37df27ca46d2cc162445a92c5f5fdc57bcde129ca7b1
f284b0c12297c037ca221d77e229a9d11b654de7942d685069c633b2362ce3b3d8ea
4891c9a2a87a4eb7cdb289ba5e2ecbf8cd2c8498bb4a383dc021454d70d46fcbbad1
252ef4f9
secret: 0f9df08908a6a3d06c8e934cd3f5313f9ebccd0986e316c0198bb48bed30
dc3db2f3baab94fd40c2c285c7288c77e2255401ee2d5884306addf4296b93c238b3
key: b68bb0e2fbf7431cedb46cc3b6f1fe9e
base_nonce: 76af62719d33d39a1cb6be9f
exporter_secret: 7f72308ae68c9a2b3862e686cb547b16d33d00fe482c770c471
7d8b54e9b1e547244c3602bdd86d5a788a8443befea0a7658002b23f1c96a62a6498
6fffc511a

A.4.4.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 76af62719d33d39a1cb6be9f
ct: 840669634db51e28df54f189329c1b727fd303ae413f003020aff5e26276aaa9
10fc4296828cb9d862c2fd7d16

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 76af62719d33d39a1cb6be9e
ct: d4680a48158d9a75fd09355878d6e33997a36ee01d4a8f22032b22373b795a94
1b7b9c5205ff99e0ff284beef4

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 76af62719d33d39a1cb6be9d
ct: c45eb6597de2bac929a0f5d404ba9d2dc1ea031880930f1fd7a283f0a0cbebb3
5eac1a9ee0d1225f5e0f181571

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 76af62719d33d39a1cb6be9b
ct: 4ee2482ad8d7d1e9b7e651c78b6ca26d3c5314d0711710ca62c2fd8bb8996d7d
8727c157538d5493da696b61f8

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 76af62719d33d39a1cb6be60
ct: 65596b731df010c76a915c6271a438056ce65696459432eeafdae7b4cadb6290
dd61e68edd4e40b659d2a8cbcc

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 76af62719d33d39a1cb6bf9f
ct: 9f659482ebc52f8303f9eac75656d807ec38ce2e50c72e3078cd13d86b30e3f8
90690a873277620f8a6a42d836

A.4.4.2. Exported Values
exporter_context:
L: 32
exported_value:
c8c917e137a616d3d4e4c9fcd9c50202f366cb0d37862376bc79f9b72e8a8db9

exporter_context: 00
L: 32
exported_value:
33a5d4df232777008a06d0684f23bb891cfaef702f653c8601b6ad4d08dddddf

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
bed80f2e54f1285895c4a3f3b3625e6206f78f1ed329a0cfb5864f7c139b3c6a

A.5. DHKEM(P-256, HKDF-SHA256), HKDF-SHA256, ChaCha20Poly1305

A.5.1. Base Setup Information

mode: 0
kem_id: 16
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
f1f1a3bc95416871539ecb51c3a8f0cf608afb40fbbe305c0a72819d35c33f1f
pkEm: 04c07836a0206e04e31d8ae99bfd549380b072a1b1b82e563c935c09582782
4fc1559eac6fb9e3c70cd3193968994e7fe9781aa103f5b50e934b5b2f387e381291
skEm:
7550253e1147aae48839c1f8af80d2770fb7a4c763afe7d0afa7e0f42a5b3689
ikmR:
61092f3f56994dd424405899154a9918353e3e008171517ad576b900ddb275e7
pkRm: 04a697bffde9405c992883c5c439d6cc358170b51af72812333b015621dc0f
40bad9bb726f68a5c013806a790ec716ab8669f84f6b694596c2987cf35baba2a006
skRm:
a4d1c55836aa30f9b3fbb6ac98d338c877c2867dd3a77396d13f68d3ab150d3b
enc: 04c07836a0206e04e31d8ae99bfd549380b072a1b1b82e563c935c095827824
fc1559eac6fb9e3c70cd3193968994e7fe9781aa103f5b50e934b5b2f387e381291
shared_secret:
806520f82ef0b03c823b7fc524b6b55a088f566b9751b89551c170f4113bd850
key_schedule_context: 00b738cd703db7b4106e93b4621e9a19c89c838e559642
40e5d3f331aaf8b0d58b2e986ea1c671b61cf45eec134dac0bae58ec6f63e790b140
0b47c33038b0269c
secret:
fe891101629aa355aad68eff3cc5170d057eca0c7573f6575e91f9783e1d4506
key:
a8f45490a92a3b04d1dbf6cf2c3939ad8bfc9bfcb97c04bffe116730c9dfe3fc
base_nonce: 726b4390ed2209809f58c693
exporter_secret:
4f9bd9b3a8db7d7c3a5b9d44fdc1f6e37d5d77689ade5ec44a7242016e6aa205

A.5.1.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 726b4390ed2209809f58c693
ct: 6469c41c5c81d3aa85432531ecf6460ec945bde1eb428cb2fedf7a29f5a685b4
ccb0d057f03ea2952a27bb458b

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 726b4390ed2209809f58c692
ct: f1564199f7e0e110ec9c1bcdde332177fc35c1adf6e57f8d1df24022227ffa87
16862dbda2b1dc546c9d114374

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 726b4390ed2209809f58c691
ct: 39de89728bcb774269f882af8dc5369e4f3d6322d986e872b3a8d074c7c18e85
49ff3f85b6d6592ff87c3f310c

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 726b4390ed2209809f58c697
ct: bc104a14fbede0cc79eeb826ea0476ce87b9c928c36e5e34dc9b6905d91473ec
369a08b1a25d305dd45c6c5f80

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 726b4390ed2209809f58c66c
ct: 8f2814a2c548b3be50259713c6724009e092d37789f6856553d61df23ebc0792
35f710e6af3c3ca6eaba7c7c6c

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 726b4390ed2209809f58c793
ct: b45b69d419a9be7219d8c94365b89ad6951caf4576ea4774ea40e9b7047a09d6
537d1aa2f7c12d6ae4b729b4d0

A.5.1.2. Exported Values
exporter_context:
L: 32
exported_value:
9b13c510416ac977b553bf1741018809c246a695f45eff6d3b0356dbefe1e660

exporter_context: 00
L: 32
exported_value:
6c8b7be3a20a5684edecb4253619d9051ce8583baf850e0cb53c402bdcaf8ebb

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
477a50d804c7c51941f69b8e32fe8288386ee1a84905fe4938d58972f24ac938

A.5.2. PSK Setup Information

mode: 1
kem_id: 16
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
e1a4e1d50c4bfcf890f2b4c7d6b2d2aca61368eddc3c84162df2856843e1057a
pkEm: 04f336578b72ad7932fe867cc4d2d44a718a318037a0ec271163699cee653f
a805c1fec955e562663e0c2061bb96a87d78892bff0cc0bad7906c2d998ebe1a7246
skEm:
7d6e4e006cee68af9b3fdd583a0ee8962df9d59fab029997ee3f456cbc857904
ikmR:
ee51dec304abf993ef8fd52aacdd3b539108bbf6e491943266c1de89ec596a17
pkRm: 041eb8f4f20ab72661af369ff3231a733672fa26f385ffb959fd1bae46bfda
43ad55e2d573b880831381d9367417f554ce5b2134fbba5235b44db465feffc6189e
skRm:
12ecde2c8bc2d5d7ed2219c71f27e3943d92b344174436af833337c557c300b3
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 04f336578b72ad7932fe867cc4d2d44a718a318037a0ec271163699cee653fa
805c1fec955e562663e0c2061bb96a87d78892bff0cc0bad7906c2d998ebe1a7246
shared_secret:
ac4f260dce4db6bf45435d9c92c0e11cfdd93743bd3075949975974cc2b3d79e
key_schedule_context: 01622b72afcc3795841596c67ea74400ca3b029374d7d5
640bda367c5d67b3fbeb2e986ea1c671b61cf45eec134dac0bae58ec6f63e790b140
0b47c33038b0269c
secret:
858c8087a1c056db5811e85802f375bb0c19b9983204a1575de4803575d23239
key:
6d61cb330b7771168c8619498e753f16198aad9566d1f1c6c70e2bc1a1a8b142
base_nonce: 0de7655fb65e1cd51a38864e
exporter_secret:
754ca00235b245e72d1f722a7718e7145bd113050a2aa3d89586d4cb7514bfdb

A.5.2.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 0de7655fb65e1cd51a38864e
ct: 21433eaff24d7706f3ed5b9b2e709b07230e2b11df1f2b1fe07b3c70d5948a53
d6fa5c8bed194020bd9df0877b

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 0de7655fb65e1cd51a38864f
ct: c74a764b4892072ea8c2c56b9bcd46c7f1e9ca8cb0a263f8b40c2ba59ac9c857
033f176019562218769d3e0452

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 0de7655fb65e1cd51a38864c
ct: dc8cd68863474d6e9cbb6a659335a86a54e036249d41acf909e738c847ff2bd3
6fe3fcacda4ededa7032c0a220

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 0de7655fb65e1cd51a38864a
ct: cd54a8576353b1b9df366cb0cc042e46eef6f4cf01e205fe7d47e306b2fdd90f
7185f289a26c613ca094e3be10

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 0de7655fb65e1cd51a3886b1
ct: 6324570c9d542c70c7e70570c1d8f4c52a89484746bf0625441890ededcc80c2
4ef2301c38bfd34d689d19f67d

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 0de7655fb65e1cd51a38874e
ct: 1ea6326c8098ed0437a553c466550114fb2ca1412cca7de98709b9ccdf19206e
52c3d39180e2cf62b3e9f4baf4

A.5.2.2. Exported Values
exporter_context:
L: 32
exported_value:
530bbc2f68f078dccc89cc371b4f4ade372c9472bafe4601a8432cbb934f528d

exporter_context: 00
L: 32
exported_value:
6e25075ddcc528c90ef9218f800ca3dfe1b8ff4042de5033133adb8bd54c401d

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
6f6fbd0d1c7733f796461b3235a856cc34f676fe61ed509dfc18fa16efe6be78

A.5.3. Auth Setup Information

mode: 2
kem_id: 16
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
0ecd212019008138a31f9104d5dba76b9f8e34d5b996041fff9e3df221dd0d5d
pkEm: 040d5176aedba55bc41709261e9195c5146bb62d783031280775f32e507d79
b5cbc5748b6be6359760c73cfe10ca19521af704ca6d91ff32fc0739527b9385d415
skEm:
085fd5d5e6ce6497c79df960cac93710006b76217d8bcfafbd2bb2c20ea03c42
ikmR:
d32236d8378b9563840653789eb7bc33c3c720e537391727bf1c812d0eac110f
pkRm: 0444f6ee41818d9fe0f8265bffd016b7e2dd3964d610d0f7514244a60dbb7a
11ece876bb110a97a2ac6a9542d7344bf7d2bd59345e3e75e497f7416cf38d296233
skRm:
3cb2c125b8c5a81d165a333048f5dcae29a2ab2072625adad66dbb0f48689af9
ikmS:
0e6be0851283f9327295fd49858a8c8908ea9783212945eef6c598ee0a3cedbb
pkSm: 04265529a04d4f46ab6fa3af4943774a9f1127821656a75a35fade898a9a1b
014f64d874e88cddb24c1c3d79004d3a587db67670ca357ff4fba7e8b56ec013b98b
skSm:
39b19402e742d48d319d24d68e494daa4492817342e593285944830320912519
enc: 040d5176aedba55bc41709261e9195c5146bb62d783031280775f32e507d79b
5cbc5748b6be6359760c73cfe10ca19521af704ca6d91ff32fc0739527b9385d415
shared_secret:
1a45aa4792f4b166bfee7eeab0096c1a6e497480e2261b2a59aad12f2768d469
key_schedule_context: 02b738cd703db7b4106e93b4621e9a19c89c838e559642
40e5d3f331aaf8b0d58b2e986ea1c671b61cf45eec134dac0bae58ec6f63e790b140
0b47c33038b0269c
secret:
9193210815b87a4c5496c9d73e609a6c92665b5ea0d760866294906d089ebb57
key:
cf292f8a4313280a462ce55cde05b5aa5744fe4ca89a5d81b0146a5eaca8092d
base_nonce: 7e45c21e20e869ae00492123
exporter_secret:
dba6e307f71769ba11e2c687cc19592f9d436da0c81e772d7a8a9fd28e54355f

A.5.3.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 7e45c21e20e869ae00492123
ct: 25881f219935eec5ba70d7b421f13c35005734f3e4d959680270f55d71e2f5cb
3bd2daced2770bf3d9d4916872

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 7e45c21e20e869ae00492122
ct: 653f0036e52a376f5d2dd85b3204b55455b7835c231255ae098d09ed138719b9
7185129786338ab6543f753193

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 7e45c21e20e869ae00492121
ct: 60878706117f22180c788e62df6a595bc41906096a11a9513e84f0141e43239e
81a98d7a235abc64112fcb8ddd

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 7e45c21e20e869ae00492127
ct: 0f9094dd08240b5fa7a388b824d19d5b4b1e126cebfd67a062c32f9ba9f1f386
6cc38de7df2702626e2ab65c0f

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 7e45c21e20e869ae004921dc
ct: dd29319e08135c5f8401d6537a364e92172c0e3f095f3fd18923881d11c0a683
9345dd0b54acd0edd8f8344792

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 7e45c21e20e869ae00492023
ct: e2276ec5047bc4b6ed57d6da7da2fb47a77502f0a30f17d040247c73da336d72
2bc6c89adf68396a0912c6d152

A.5.3.2. Exported Values
exporter_context:
L: 32
exported_value:
56c4d6c1d3a46c70fd8f4ecda5d27c70886e348efb51bd5edeaa39ff6ce34389

exporter_context: 00
L: 32
exported_value:
d2d3e48ed76832b6b3f28fa84be5f11f09533c0e3c71825a34fb0f1320891b51

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
eb0d312b6263995b4c7761e64b688c215ffd6043ff3bad2368c862784cbe6eff

A.5.4. AuthPSK Setup Information

mode: 3
kem_id: 16
kdf_id: 1
aead_id: 3
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
f3a07f194703e321ef1f753a1b9fe27a498dfdfa309151d70bedd896c239c499
pkEm: 043539917ee26f8ae0aa5f784a387981b13de33124a3cde88b946720301831
10f331400115855808244ff0c5b6ca6104483ac95724481d41bdcd9f15b430ad16f6
skEm:
11b7e4de2d919240616a31ab14944cced79bc2372108bb98f6792e3b645fe546
ikmR:
1240e55a0a03548d7f963ef783b6a7362cb505e6b31dfd04c81d9b294543bfbd
pkRm: 04d383fd920c42d018b9d57fd73a01f1eee480008923f67d35169478e55d2e
8817068daf62a06b10e0aad4a9e429fa7f904481be96b79a9c231a33e956c20b81b6
skRm:
c29fc577b7e74d525c0043f1c27540a1248e4f2c8d297298e99010a92e94865c
ikmS:
ce2a0387a2eb8870a3a92c34a2975f0f3f271af4384d446c7dc1524a6c6c515a
pkSm: 0492cf8c9b144b742fe5a63d9a181a19d416f3ec8705f24308ad316564823c
344e018bd7c03a33c926bb271b28ef5bf28c0ca00abff249fee5ef7f33315ff34fdb
skSm:
53541bd995f874a67f8bfd8038afa67fd68876801f42ff47d0dc2a4deea067ae
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 043539917ee26f8ae0aa5f784a387981b13de33124a3cde88b9467203018311
0f331400115855808244ff0c5b6ca6104483ac95724481d41bdcd9f15b430ad16f6
shared_secret:
87584311791036a3019bc36803cdd42e9a8931a98b13c88835f2f8a9036a4fd6
key_schedule_context: 03622b72afcc3795841596c67ea74400ca3b029374d7d5
640bda367c5d67b3fbeb2e986ea1c671b61cf45eec134dac0bae58ec6f63e790b140
0b47c33038b0269c
secret:
fe52b4412590e825ea2603fa88e145b2ee014b942a774b55fab4f081301f16f4
key:
31e140c8856941315d4067239fdc4ebe077fbf45a6fc78a61e7a6c8b3bacb10a
base_nonce: 75838a8010d2e4760254dd56
exporter_secret:
600895965755db9c5027f25f039a6e3e506c35b3b7084ce33c4a48d59ee1f0e3

A.5.4.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 75838a8010d2e4760254dd56
ct: 9eadfa0f954835e7e920ffe56dec6b31a046271cf71fdda55db72926e1d8fae9
4cc6280fcfabd8db71eaa65c05

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 75838a8010d2e4760254dd57
ct: e357ad10d75240224d4095c9f6150a2ed2179c0f878e4f2db8ca95d365d174d0
59ff8c3eb38ea9a65cfc8eaeb8

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 75838a8010d2e4760254dd54
ct: 2fa56d00f8dd479d67a2ec3308325cf3bbccaf102a64ffccdb006bd7dcb93268
5b9a7b49cdc094a85fec1da5ef

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 75838a8010d2e4760254dd52
ct: 1fe9d6db14965003ed81a39abf240f9cd7c5a454bca0d69ef9a2de16d537364f
bbf110b9ef11fa4a7a0172f0ce

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 75838a8010d2e4760254dda9
ct: eaf4041a5c9122b22d1f8d698eeffe45d64b4ae33d0ddca3a4cdf4a5f595acc9
5a1a9334d06cc4d000df6aaad6

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 75838a8010d2e4760254dc56
ct: fb857f4185ce5286c1a52431867537204963ea66a3eee8d2a74419fd8751faee
066d08277ac7880473aa4143ba

A.5.4.2. Exported Values
exporter_context:
L: 32
exported_value:
c52b4592cd33dd38b2a3613108ddda28dcf7f03d30f2a09703f758bfa8029c9a

exporter_context: 00
L: 32
exported_value:
2f03bebc577e5729e148554991787222b5c2a02b77e9b1ac380541f710e5a318

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
e01dd49e8bfc3d9216abc1be832f0418adf8b47a7b5a330a7436c31e33d765d7

A.6. DHKEM(P-521, HKDF-SHA512), HKDF-SHA512, AES-256-GCM

A.6.1. Base Setup Information

mode: 0
kem_id: 18
kdf_id: 3
aead_id: 2
info: 4f6465206f6e2061204772656369616e2055726e
ikmE: 7f06ab8215105fc46aceeb2e3dc5028b44364f960426eb0d8e4026c2f8b5d7
e7a986688f1591abf5ab753c357a5d6f0440414b4ed4ede71317772ac98d9239f709
04
pkEm: 040138b385ca16bb0d5fa0c0665fbbd7e69e3ee29f63991d3e9b5fa740aab8
900aaeed46ed73a49055758425a0ce36507c54b29cc5b85a5cee6bae0cf1c21f2731
ece2013dc3fb7c8d21654bb161b463962ca19e8c654ff24c94dd2898de12051f1ed0
692237fb02b2f8d1dc1c73e9b366b529eb436e98a996ee522aef863dd5739d2f29b0
skEm: 014784c692da35df6ecde98ee43ac425dbdd0969c0c72b42f2e708ab9d5354
15a8569bdacfcc0a114c85b8e3f26acf4d68115f8c91a66178cdbd03b7bcc5291e37
4b
ikmR: 2ad954bbe39b7122529f7dde780bff626cd97f850d0784a432784e69d86ecc
aade43b6c10a8ffdb94bf943c6da479db137914ec835a7e715e36e45e29b587bab3b
f1
pkRm: 0401b45498c1714e2dce167d3caf162e45e0642afc7ed435df7902ccae0e84
ba0f7d373f646b7738bbbdca11ed91bdeae3cdcba3301f2457be452f271fa6837580
e661012af49583a62e48d44bed350c7118c0d8dc861c238c72a2bda17f64704f464b
57338e7f40b60959480c0e58e6559b190d81663ed816e523b6b6a418f66d2451ec64
skRm: 01462680369ae375e4b3791070a7458ed527842f6a98a79ff5e0d4cbde83c2
7196a3916956655523a6a2556a7af62c5cadabe2ef9da3760bb21e005202f7b24628
47
enc: 040138b385ca16bb0d5fa0c0665fbbd7e69e3ee29f63991d3e9b5fa740aab89
00aaeed46ed73a49055758425a0ce36507c54b29cc5b85a5cee6bae0cf1c21f2731e
ce2013dc3fb7c8d21654bb161b463962ca19e8c654ff24c94dd2898de12051f1ed06
92237fb02b2f8d1dc1c73e9b366b529eb436e98a996ee522aef863dd5739d2f29b0
shared_secret: 776ab421302f6eff7d7cb5cb1adaea0cd50872c71c2d63c30c4f1
d5e43653336fef33b103c67e7a98add2d3b66e2fda95b5b2a667aa9dac7e59cc1d46
d30e818
key_schedule_context: 0083a27c5b2358ab4dae1b2f5d8f57f10ccccc822a4733
26f543f239a70aee46347324e84e02d7651a10d08fb3dda739d22d50c53fbfa8122b
aacd0f9ae5913072ef45baa1f3a4b169e141feb957e48d03f28c837d8904c3d67753
08c3d3faa75dd64adfa44e1a1141edf9349959b8f8e5291cbdc56f62b0ed6527d692
e85b09a4
secret: 49fd9f53b0f93732555b2054edfdc0e3101000d75df714b98ce5aa295a37
f1b18dfa86a1c37286d805d3ea09a20b72f93c21e83955a1f01eb7c5eead563d21e7
key:
751e346ce8f0ddb2305c8a2a85c70d5cf559c53093656be636b9406d4d7d1b70
base_nonce: 55ff7a7d739c69f44b25447b
exporter_secret: e4ff9dfbc732a2b9c75823763c5ccc954a2c0648fc6de80a585
81252d0ee3215388a4455e69086b50b87eb28c169a52f42e71de4ca61c920e7bd24c
95cc3f992

A.6.1.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 55ff7a7d739c69f44b25447b
ct: 170f8beddfe949b75ef9c387e201baf4132fa7374593dfafa90768788b7b2b20
0aafcc6d80ea4c795a7c5b841a

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 55ff7a7d739c69f44b25447a
ct: d9ee248e220ca24ac00bbbe7e221a832e4f7fa64c4fbab3945b6f3af0c5ecd5e
16815b328be4954a05fd352256

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 55ff7a7d739c69f44b254479
ct: 142cf1e02d1f58d9285f2af7dcfa44f7c3f2d15c73d460c48c6e0e506a3144ba
e35284e7e221105b61d24e1c7a

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 55ff7a7d739c69f44b25447f
ct: 3bb3a5a07100e5a12805327bf3b152df728b1c1be75a9fd2cb2bf5eac0cca1fb
80addb37eb2a32938c7268e3e5

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 55ff7a7d739c69f44b254484
ct: 4f268d0930f8d50b8fd9d0f26657ba25b5cb08b308c92e33382f369c768b558e
113ac95a4c70dd60909ad1adc7

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 55ff7a7d739c69f44b25457b
ct: dbbfc44ae037864e75f136e8b4b4123351d480e6619ae0e0ae437f036f2f8f1e
f677686323977a1ccbb4b4f16a

A.6.1.2. Exported Values
exporter_context:
L: 32
exported_value:
05e2e5bd9f0c30832b80a279ff211cc65eceb0d97001524085d609ead60d0412

exporter_context: 00
L: 32
exported_value:
fca69744bb537f5b7a1596dbf34eaa8d84bf2e3ee7f1a155d41bd3624aa92b63

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
f389beaac6fcf6c0d9376e20f97e364f0609a88f1bc76d7328e9104df8477013

A.6.2. PSK Setup Information

mode: 1
kem_id: 18
kdf_id: 3
aead_id: 2
info: 4f6465206f6e2061204772656369616e2055726e
ikmE: f3ebfa9a69a924e672114fcd9e06fa9559e937f7eccce4181a2b506df53dbe
514be12f094bb28e01de19dd345b4f7ede5ad7eaa6b9c3019592ec68eaae9a14732c
e0
pkEm: 040085eff0835cc84351f32471d32aa453cdc1f6418eaaecf1c2824210eb1d
48d0768b368110fab21407c324b8bb4bec63f042cfa4d0868d19b760eb4beba1bff7
93b30036d2c614d55730bd2a40c718f9466faf4d5f8170d22b6df98dfe0c067d02b3
49ae4a142e0c03418f0a1479ff78a3db07ae2c2e89e5840f712c174ba2118e90fdcb
skEm: 012e5cfe0daf5fe2a1cd617f4c4bae7c86f1f527b3207f115e262a98cc6526
8ec88cb8645aec73b7aa0a472d0292502d1078e762646e0c093cf873243d12c39915
f6
ikmR: a2a2458705e278e574f835effecd18232f8a4c459e7550a09d44348ae5d3b1
ea9d95c51995e657ad6f7cae659f5e186126a471c017f8f5e41da9eba74d4e0473e1
79
pkRm: 04006917e049a2be7e1482759fb067ddb94e9c4f7f5976f655088dec452466
14ff924ed3b385fc2986c0ecc39d14f907bf837d7306aada59dd5889086125ecd038
ead400603394b5d81f89ebfd556a898cc1d6a027e143d199d3db845cb91c5289fb26
c5ff80832935b0e8dd08d37c6185a6f77683347e472d1edb6daa6bd7652fea628fae
skRm: 011bafd9c7a52e3e71afbdab0d2f31b03d998a0dc875dd7555c63560e142bd
e264428de03379863b4ec6138f813fa009927dc5d15f62314c56d4e7ff2b485753eb
72
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 040085eff0835cc84351f32471d32aa453cdc1f6418eaaecf1c2824210eb1d4
8d0768b368110fab21407c324b8bb4bec63f042cfa4d0868d19b760eb4beba1bff79
3b30036d2c614d55730bd2a40c718f9466faf4d5f8170d22b6df98dfe0c067d02b34
9ae4a142e0c03418f0a1479ff78a3db07ae2c2e89e5840f712c174ba2118e90fdcb
shared_secret: 0d52de997fdaa4797720e8b1bebd3df3d03c4cf38cc8c1398168d
36c3fc7626428c9c254dd3f9274450909c64a5b3acbe45e2d850a2fd69ac0605fe5c
8a057a5
key_schedule_context: 0124497637cf18d6fbcc16e9f652f00244c981726f293b
b7819861e85e50c94f0be30e022ab081e18e6f299fd3d3d976a4bc590f85bc7711bf
ce32ee1a7fb1c154ef45baa1f3a4b169e141feb957e48d03f28c837d8904c3d67753
08c3d3faa75dd64adfa44e1a1141edf9349959b8f8e5291cbdc56f62b0ed6527d692
e85b09a4
secret: 2cf425e26f65526afc0634a3dba4e28d980c1015130ce07c2ac7530d7a39
1a75e5a0db428b09f27ad4d975b4ad1e7f85800e03ffeea35e8cf3fe67b18d4a1345
key:
f764a5a4b17e5d1ffba6e699d65560497ebaea6eb0b0d9010a6d979e298a39ff
base_nonce: 479afdf3546ddba3a9841f38
exporter_secret: 5c3d4b65a13570502b93095ef196c42c8211a4a188c4590d358
63665c705bb140ecba6ce9256be3fad35b4378d41643867454612adfd0542a684b61
799bf293f

A.6.2.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 479afdf3546ddba3a9841f38
ct: de69e9d943a5d0b70be3359a19f317bd9aca4a2ebb4332a39bcdfc97d5fe62f3
a77702f4822c3be531aa7843a1

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 479afdf3546ddba3a9841f39
ct: 77a16162831f90de350fea9152cfc685ecfa10acb4f7994f41aed43fa5431f23
82d078ec88baec53943984553e

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 479afdf3546ddba3a9841f3a
ct: f1d48d09f126b9003b4c7d3fe6779c7c92173188a2bb7465ba43d899a6398a33
3914d2bb19fd769d53f3ec7336

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 479afdf3546ddba3a9841f3c
ct: 829b11c082b0178082cd595be6d73742a4721b9ac05f8d2ef8a7704a53022d82
bd0d8571f578c5c13b99eccff8

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 479afdf3546ddba3a9841fc7
ct: a3ee291e20f37021e82df14d41f3fbe98b27c43b318a36cacd8471a3b1051ab1
2ee055b62ded95b72a63199a3f

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 479afdf3546ddba3a9841e38
ct: eecc2173ce1ac14b27ee67041e90ed50b7809926e55861a579949c07f6d26137
bf9cf0d097f60b5fd2fbf348ec

A.6.2.2. Exported Values
exporter_context:
L: 32
exported_value:
62691f0f971e34de38370bff24deb5a7d40ab628093d304be60946afcdb3a936

exporter_context: 00
L: 32
exported_value:
76083c6d1b6809da088584674327b39488eaf665f0731151128452e04ce81bff

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
0c7cfc0976e25ae7680cf909ae2de1859cd9b679610a14bec40d69b91785b2f6

A.6.3. Auth Setup Information

mode: 2
kem_id: 18
kdf_id: 3
aead_id: 2
info: 4f6465206f6e2061204772656369616e2055726e
ikmE: fe1c589c2a05893895a537f38c7cb4300b5a7e8fef3d6ccb8f07a498029c61
e90262e009dc254c7f6235f9c6b2fd6aeff0a714db131b09258c16e217b7bd2aa619
b0
pkEm: 04017de12ede7f72cb101dab36a111265c97b3654816dcd6183f809d4b3d11
1fe759497f8aefdc5dbb40d3e6d21db15bdc60f15f2a420761bcaeef73b891c2b117
e9cf01e29320b799bbc86afdc5ea97d941ea1c5bd5ebeeac7a784b3bab524746f3e6
40ec26ee1bd91255f9330d974f845084637ee0e6fe9f505c5b87c86a4e1a6c3096dd
skEm: 0185f03560de87bb2c543ef03607f3c33ac09980000de25eabe3b224312946
330d2e65d192d3b4aa46ca92fc5ca50736b624402d95f6a80dc04d1f10ae95171372
61
ikmR: 8feea0438481fc0ecd470d6adfcda334a759c6b8650452c5a5dd9b2dd2cc9b
e33d2bb7ee64605fc07ab4664a58bb9a8de80defe510b6c97d2daf85b92cd4bb0a66
bf
pkRm: 04007d419b8834e7513d0e7cc66424a136ec5e11395ab353da324e3586673e
e73d53ab34f30a0b42a92d054d0db321b80f6217e655e304f72793767c4231785c4a
4a6e008f31b93b7a4f2b8cd12e5fe5a0523dc71353c66cbdad51c86b9e0bdfcd9a45
698f2dab1809ab1b0f88f54227232c858accc44d9a8d41775ac026341564a2d749f4
skRm: 013ef326940998544a899e15e1726548ff43bbdb23a8587aa3bef9d1b85733
8d87287df5667037b519d6a14661e9503cfc95a154d93566d8c84e95ce93ad05293a
0b
ikmS: 2f66a68b85ef04822b054ef521838c00c64f8b6226935593b69e13a1a2461a
4f1a74c10c836e87eed150c0db85d4e4f506cbb746149befac6f5c07dc48a615ef92
db
pkSm: 04015cc3636632ea9a3879e43240beae5d15a44fba819282fac26a19c989fa
fdd0f330b8521dff7dc393101b018c1e65b07be9f5fc9a28a1f450d6a541ee0d7622
1133001e8f0f6a05ab79f9b9bb9ccce142a453d59c5abebb5674839d935a3ca1a3fb
c328539a60b3bc3c05fed22838584a726b9c176796cad0169ba4093332cbd2dc3a9f
skSm: 001018584599625ff9953b9305849850d5e34bd789d4b81101139662fbea8b
6508ddb9d019b0d692e737f66beae3f1f783e744202aaf6fea01506c27287e359fe7
76
enc: 04017de12ede7f72cb101dab36a111265c97b3654816dcd6183f809d4b3d111
fe759497f8aefdc5dbb40d3e6d21db15bdc60f15f2a420761bcaeef73b891c2b117e
9cf01e29320b799bbc86afdc5ea97d941ea1c5bd5ebeeac7a784b3bab524746f3e64
0ec26ee1bd91255f9330d974f845084637ee0e6fe9f505c5b87c86a4e1a6c3096dd
shared_secret: 26648fa2a2deb0bfc56349a590fd4cb7108a51797b634694fc020
61e8d91b3576ac736a68bf848fe2a58dfb1956d266e68209a4d631e513badf8f4dcf
c00f30a
key_schedule_context: 0283a27c5b2358ab4dae1b2f5d8f57f10ccccc822a4733
26f543f239a70aee46347324e84e02d7651a10d08fb3dda739d22d50c53fbfa8122b
aacd0f9ae5913072ef45baa1f3a4b169e141feb957e48d03f28c837d8904c3d67753
08c3d3faa75dd64adfa44e1a1141edf9349959b8f8e5291cbdc56f62b0ed6527d692
e85b09a4
secret: 56b7acb7355d080922d2ddc227829c2276a0b456087654b3ac4b53828bd3
4af8cf54626f85af858a15a86eba73011665cc922bc59fd07d2975f356d2674db554
key:
01fced239845e53f0ec616e71777883a1f9fcab22a50f701bdeee17ad040e44d
base_nonce: 9752b85fe8c73eda183f9e80
exporter_secret: 80466a9d9cc5112ddad297e817e038801e15fa18152bc4dc010
a35d7f534089c87c98b4bacd7bbc6276c4002a74085adcd9019fca6139826b529256
9cfb7fe47

A.6.3.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: 9752b85fe8c73eda183f9e80
ct: 0116aeb3a1c405c61b1ce47600b7ecd11d89b9c08c408b7e2d1e00a4d64696d1
2e6881dc61688209a8207427f9

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: 9752b85fe8c73eda183f9e81
ct: 37ece0cf6741f443e9d73b9966dc0b228499bb21fbf313948327231e70a18380
e080529c0267f399ba7c539cc6

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: 9752b85fe8c73eda183f9e82
ct: d17b045cac963e45d55fd3692ec17f100df66ac06d91f3b6af8efa7ed3c88955
50eb753bc801fe4bd27005b4bd

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: 9752b85fe8c73eda183f9e84
ct: 50c523ae7c64cada96abea16ddf67a73d2914ec86a4cedb31a7e6257f7553ed2
44626ef79a57198192b2323384

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: 9752b85fe8c73eda183f9e7f
ct: 53d422295a6ce8fcc51e6f69e252e7195e64abf49252f347d8c25534f1865a6a
17d949c65ce618ddc7d816111f

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: 9752b85fe8c73eda183f9f80
ct: 0dfcfc22ea768880b4160fec27ab10c75fb27766c6bb97aed373a9b6eae35d31
afb08257401075cbb602ac5abb

A.6.3.2. Exported Values
exporter_context:
L: 32
exported_value:
8d78748d632f95b8ce0c67d70f4ad1757e61e872b5941e146986804b3990154b

exporter_context: 00
L: 32
exported_value:
80a4753230900ea785b6c80775092801fe91183746479f9b04c305e1db9d1f4d

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
620b176d737cf366bcc20d96adb54ec156978220879b67923689e6dca36210ed

A.6.4. AuthPSK Setup Information

mode: 3
kem_id: 18
kdf_id: 3
aead_id: 2
info: 4f6465206f6e2061204772656369616e2055726e
ikmE: 54272797b1fbc128a6967ff1fd606e0c67868f7762ce1421439cbc9e90ce1b
28d566e6c2acbce712e48eebf236696eb680849d6873e9959395b2931975d61d38bd
6c
pkEm: 04000a5096a6e6e002c83517b494bfc2e36bfb8632fae8068362852b70d0ff
71e560b15aff96741ecffb63d8ac3090c3769679009ac59a99a1feb4713c5f090fc0
dbed01ad73c45d29d369e36744e9ed37d12f80700c16d816485655169a5dd66e4ddf
27f2acffe0f56f7f77ea2b473b4bf0518b975d9527009a3d14e5a4957e3e8a9074f8
skEm: 003430af19716084efeced1241bb1a5625b6c826f11ef31649095eb2795261
9e36f62a79ea28001ac452fb20ddfbb66e62c6c0b1be03c0d28c97794a1fb638207a
83
ikmR: 3db434a8bc25b27eb0c590dc64997ab1378a99f52b2cb5a5a5b2fa540888f6
c0f09794c654f4468524e040e6b4eca2c9dcf229f908b9d318f960cc9e9baa92c5ee
e6
pkRm: 0401655b5d3b7cfafaba30851d25edc44c6dd17d99410efbed8591303b4dbe
ea8cb1045d5255f9a60384c3bbd4a3386ae6e6fab341dc1f8db0eed5f0ab1aaac6d7
838e00dadf8a1c2c64b48f89c633721e88369e54104b31368f26e35d04a442b0b428
510fb23caada686add16492f333b0f7ba74c391d779b788df2c38d7a7f4778009d91
skRm: 0053c0bc8c1db4e9e5c3e3158bfdd7fc716aef12db13c8515adf821dd692ba
3ca53041029128ee19c8556e345c4bcb840bb7fd789f97fe10f17f0e2c6c25280728
43
ikmS: 65d523d9b37e1273eb25ad0527d3a7bd33f67208dd1666d9904c6bc04969ae
5831a8b849e7ff642581f2c3e56be84609600d3c6bbdaded3f6989c37d2892b1e978
d5
pkSm: 040013761e97007293d57de70962876b4926f69a52680b4714bee1d4236aa9
6c19b840c57e80b14e91258f0a350e3f7ba59f3f091633aede4c7ec4fa8918323aa4
5d5901076dec8eeb22899fda9ab9e1960003ff0535f53c02c40f2ae4cdc6070a3870
b85b4bdd0bb77f1f889e7ee51f465a308f08c666ad3407f75dc046b2ff5a24dbe2ed
skSm: 003f64675fc8914ec9e2b3ecf13585b26dbaf3d5d805042ba487a5070b8c5a
c1d39b17e2161771cc1b4d0a3ba6e866f4ea4808684b56af2a49b5e5111146d45d93
26
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc: 04000a5096a6e6e002c83517b494bfc2e36bfb8632fae8068362852b70d0ff7
1e560b15aff96741ecffb63d8ac3090c3769679009ac59a99a1feb4713c5f090fc0d
bed01ad73c45d29d369e36744e9ed37d12f80700c16d816485655169a5dd66e4ddf2
7f2acffe0f56f7f77ea2b473b4bf0518b975d9527009a3d14e5a4957e3e8a9074f8
shared_secret: 9e1d5f62cb38229f57f68948a0fbc1264499910cce50ec62cb241
88c5b0a98868f3c1cfa8c5baa97b3f24db3cdd30df6e04eae83dc4347be8a981066c
3b5b945
key_schedule_context: 0324497637cf18d6fbcc16e9f652f00244c981726f293b
b7819861e85e50c94f0be30e022ab081e18e6f299fd3d3d976a4bc590f85bc7711bf
ce32ee1a7fb1c154ef45baa1f3a4b169e141feb957e48d03f28c837d8904c3d67753
08c3d3faa75dd64adfa44e1a1141edf9349959b8f8e5291cbdc56f62b0ed6527d692
e85b09a4
secret: 50a57775958037a04098e0054576cd3bc084d0d08d29548ba4befa5676b9
1eb4dcd0752813a052c9a930d0aba6ca10b89dd690b64032dc635dece35d1bf4645c
key:
1316ed34bd52374854ed0e5cb0394ca0a79b2d8ce7f15d5104f21acdfb594286
base_nonce: d9c64ec8deb8a0647fafe8ff
exporter_secret: 6cb00ff99aebb2e4a05042ce0d048326dd2c03acd61a601b103
8a65398406a96ab8b5da3187412b2324089ea16ba4ff7e6f4fe55d281fc8ae5f2049
032b69ebd

A.6.4.1. Encryptions
sequence number: 0
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d30
nonce: d9c64ec8deb8a0647fafe8ff
ct: 942a2a92e0817cf032ce61abccf4f3a7c5d21b794ed943227e07b7df2d6dd92c
9b8a9371949e65cca262448ab7

sequence number: 1
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d31
nonce: d9c64ec8deb8a0647fafe8fe
ct: c0a83b5ec3d7933a090f681717290337b4fede5bfaa0a40ec29f93acad742888
a1513c649104c391c78d1d7f29

sequence number: 2
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d32
nonce: d9c64ec8deb8a0647fafe8fd
ct: 2847b2e0ce0b9da8fca7b0e81ff389d1682ee1b388ed09579b145058b5af6a93
a85dd50d9f417dc88f2c785312

sequence number: 4
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d34
nonce: d9c64ec8deb8a0647fafe8fb
ct: fbd9948ab9ac4a9cb9e295c07273600e6a111a3a89241d3e2178f39d532a2ec5
c15b9b0c6937ac84c88e0ca76f

sequence number: 255
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323535
nonce: d9c64ec8deb8a0647fafe800
ct: 63113a870131b567db8f39a11b4541eafbd2d3cf3a9bf9e5c1cfcb41e52f9027
310b82a4868215959131694d15

sequence number: 256
pt: 4265617574792069732074727574682c20747275746820626561757479
aad: 436f756e742d323536
nonce: d9c64ec8deb8a0647fafe9ff
ct: 24f9d8dadd2107376ccd143f70f9bafcd2b21d8117d45ff327e9a78f603a3260
6e42a6a8bdb57a852591d20907

A.6.4.2. Exported Values
exporter_context:
L: 32
exported_value:
a39502ef5ca116aa1317bd9583dd52f15b0502b71d900fc8a622d19623d0cb5d

exporter_context: 00
L: 32
exported_value:
749eda112c4cfdd6671d84595f12cd13198fc3ef93ed72369178f344fe6e09c3

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
f8b4e72cefbff4ca6c4eabb8c0383287082cfcbb953d900aed4959afd0017095

A.7. DHKEM(X25519, HKDF-SHA256), HKDF-SHA256, Export-Only AEAD

A.7.1. Base Setup Information

mode: 0
kem_id: 32
kdf_id: 1
aead_id: 65535
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
55bc245ee4efda25d38f2d54d5bb6665291b99f8108a8c4b686c2b14893ea5d9
pkEm:
e5e8f9bfff6c2f29791fc351d2c25ce1299aa5eaca78a757c0b4fb4bcd830918
skEm:
095182b502f1f91f63ba584c7c3ec473d617b8b4c2cec3fad5af7fa6748165ed
ikmR:
683ae0da1d22181e74ed2e503ebf82840deb1d5e872cade20f4b458d99783e31
pkRm:
194141ca6c3c3beb4792cd97ba0ea1faff09d98435012345766ee33aae2d7664
skRm:
33d196c830a12f9ac65d6e565a590d80f04ee9b19c83c87f2c170d972a812848
enc:
e5e8f9bfff6c2f29791fc351d2c25ce1299aa5eaca78a757c0b4fb4bcd830918
shared_secret:
e81716ce8f73141d4f25ee9098efc968c91e5b8ce52ffff59d64039e82918b66
key_schedule_context: 009bd09219212a8cf27c6bb5d54998c5240793a70ca0a8
92234bd5e082bc619b6a3f4c22aa6d9a0424c2b4292fdf43b8257df93c2f6adbf6dd
c9c64fee26bdd292
secret:
04d64e0620aa047e9ab833b0ebcd4ff026cefbe44338fd7d1a93548102ee01af
key:
base_nonce:
exporter_secret:
79dc8e0509cf4a3364ca027e5a0138235281611ca910e435e8ed58167c72f79b

A.7.1.1. Exported Values
exporter_context:
L: 32
exported_value:
7a36221bd56d50fb51ee65edfd98d06a23c4dc87085aa5866cb7087244bd2a36

exporter_context: 00
L: 32
exported_value:
d5535b87099c6c3ce80dc112a2671c6ec8e811a2f284f948cec6dd1708ee33f0

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
ffaabc85a776136ca0c378e5d084c9140ab552b78f039d2e8775f26efff4c70e
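
In these export-only suites (aead_id 0xFFFF), the key schedule intentionally derives no AEAD key or base_nonce, which is why those fields are empty above; only the exporter interface is usable. The same LabeledExpand computation sketched after A.3.3.2 applies, just with the export-only suite_id. A compact, self-contained check (illustrative only) for the "TestContext" value above:

  import hashlib
  import hmac

  # suite_id for A.7: kem_id 32, kdf_id 1, aead_id 0xFFFF.
  suite_id = b"HPKE" + (32).to_bytes(2, "big") + (1).to_bytes(2, "big") + (0xFFFF).to_bytes(2, "big")
  exporter_secret = bytes.fromhex(
      "79dc8e0509cf4a3364ca027e5a0138235281611ca910e435e8ed58167c72f79b")
  labeled_info = (32).to_bytes(2, "big") + b"HPKE-v1" + suite_id + b"sec" + b"TestContext"
  # One HKDF-Expand(SHA-256) block is enough for a 32-byte output.
  exported = hmac.new(exporter_secret, labeled_info + b"\x01", hashlib.sha256).digest()
  assert exported.hex() == "ffaabc85a776136ca0c378e5d084c9140ab552b78f039d2e8775f26efff4c70e"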

A.7.2. PSK Setup Information

mode: 1
kem_id: 32
kdf_id: 1
aead_id: 65535
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
c51211a8799f6b8a0021fcba673d9c4067a98ebc6794232e5b06cb9febcbbdf5
pkEm:
d3805a97cbcd5f08babd21221d3e6b362a700572d14f9bbeb94ec078d051ae3d
skEm:
1d72396121a6a826549776ef1a9d2f3a2907fc6a38902fa4e401afdb0392e627
ikmR:
5e0516b1b29c0e13386529da16525210c796f7d647c37eac118023a6aa9eb89a
pkRm:
d53af36ea5f58f8868bb4a1333ed4cc47e7a63b0040eb54c77b9c8ec456da824
skRm:
98f304d4ecb312689690b113973c61ffe0aa7c13f2fbe365e48f3ed09e5a6a0c
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc:
d3805a97cbcd5f08babd21221d3e6b362a700572d14f9bbeb94ec078d051ae3d
shared_secret:
024573db58c887decb4c57b6ed39f2c9a09c85600a8a0ecb11cac24c6aaec195
key_schedule_context: 01446fb1fe2632a0a338f0a85ed1f3a0ac475bdea2cd72
f8c713b3a46ee737379a3f4c22aa6d9a0424c2b4292fdf43b8257df93c2f6adbf6dd
c9c64fee26bdd292
secret:
638b94532e0d0bf812cf294f36b97a5bdcb0299df36e22b7bb6858e3c113080b
key:
base_nonce:
exporter_secret:
04261818aeae99d6aba5101bd35ddf3271d909a756adcef0d41389d9ed9ab153

A.7.2.1. Exported Values
exporter_context:
L: 32
exported_value:
be6c76955334376aa23e936be013ba8bbae90ae74ed995c1c6157e6f08dd5316

exporter_context: 00
L: 32
exported_value:
1721ed2aa852f84d44ad020c2e2be4e2e6375098bf48775a533505fd56a3f416

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
7c9d79876a288507b81a5a52365a7d39cc0fa3f07e34172984f96fec07c44cba

A.7.3. Auth Setup Information

mode: 2
kem_id: 32
kdf_id: 1
aead_id: 65535
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
43b078912a54b591a7b09b16ce89a1955a9dd60b29fb611e044260046e8b061b
pkEm:
5ac1671a55c5c3875a8afe74664aa8bc68830be9ded0c5f633cd96400e8b5c05
skEm:
83d3f217071bbf600ba6f081f6e4005d27b97c8001f55cb5ff6ea3bbea1d9295
ikmR:
fc9407ae72ed614901ebf44257fb540f617284b5361cfecd620bafc4aba36f73
pkRm:
ffd7ac24694cb17939d95feb7c4c6539bb31621deb9b96d715a64abdd9d14b10
skRm:
ed88cda0e91ca5da64b6ad7fc34a10f096fa92f0b9ceff9d2c55124304ed8b4a
ikmS:
2ff4c37a17b2e54046a076bf5fea9c3d59250d54d0dc8572bc5f7c046307040c
pkSm:
89eb1feae431159a5250c5186f72a15962c8d0debd20a8389d8b6e4996e14306
skSm:
c85f136e06d72d28314f0e34b10aadc8d297e9d71d45a5662c2b7c3b9f9f9405
enc:
5ac1671a55c5c3875a8afe74664aa8bc68830be9ded0c5f633cd96400e8b5c05
shared_secret:
e204156fd17fd65b132d53a0558cd67b7c0d7095ee494b00f47d686eb78f8fb3
key_schedule_context: 029bd09219212a8cf27c6bb5d54998c5240793a70ca0a8
92234bd5e082bc619b6a3f4c22aa6d9a0424c2b4292fdf43b8257df93c2f6adbf6dd
c9c64fee26bdd292
secret:
355e7ef17f438db43152b7fb45a0e2f49a8bf8956d5dddfec1758c0f0eb1b5d5
key:
base_nonce:
exporter_secret:
276d87e5cb0655c7d3dad95e76e6fc02746739eb9d968955ccf8a6346c97509e

A.7.3.1. Exported Values
exporter_context:
L: 32
exported_value:
83c1bac00a45ed4cb6bd8a6007d2ce4ec501f55e485c5642bd01bf6b6d7d6f0a

exporter_context: 00
L: 32
exported_value:
08a1d1ad2af3ef5bc40232a64f920650eb9b1034fac3892f729f7949621bf06e

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
ff3b0e37a9954247fea53f251b799e2edd35aac7152c5795751a3da424feca73

A.7.4. AuthPSK Setup Information

mode: 3
kem_id: 32
kdf_id: 1
aead_id: 65535
info: 4f6465206f6e2061204772656369616e2055726e
ikmE:
94efae91e96811a3a49fd1b20eb0344d68ead6ac01922c2360779aa172487f40
pkEm:
81cbf4bd7eee97dd0b600252a1c964ea186846252abb340be47087cc78f3d87c
skEm:
a2b43f5c67d0d560ee04de0122c765ea5165e328410844db97f74595761bbb81
ikmR:
4dfde6fadfe5cb50fced4034e84e6d3a104aa4bf2971360032c1c0580e286663
pkRm:
f47cd9d6993d2e2234eb122b425accfb486ee80f89607b087094e9f413253c2d
skRm:
c4962a7f97d773a47bdf40db4b01dc6a56797c9e0deaab45f4ea3aa9b1d72904
ikmS:
26c12fef8d71d13bbbf08ce8157a283d5e67ecf0f345366b0e90341911110f1b
pkSm:
29a5bf3867a6128bbdf8e070abe7fe70ca5e07b629eba5819af73810ee20112f
skSm:
6175b2830c5743dff5b7568a7e20edb1fe477fb0487ca21d6433365be90234d0
psk:
0247fd33b913760fa1fa51e1892d9f307fbe65eb171e8132c2af18555a738b82
psk_id: 456e6e796e20447572696e206172616e204d6f726961
enc:
81cbf4bd7eee97dd0b600252a1c964ea186846252abb340be47087cc78f3d87c
shared_secret:
d69246bcd767e579b1eec80956d7e7dfbd2902dad920556f0de69bd54054a2d1
key_schedule_context: 03446fb1fe2632a0a338f0a85ed1f3a0ac475bdea2cd72
f8c713b3a46ee737379a3f4c22aa6d9a0424c2b4292fdf43b8257df93c2f6adbf6dd
c9c64fee26bdd292
secret:
c15c5bec374f2087c241d3533c6ec48e1c60a21dd00085619b2ffdd84a7918c3
key:
base_nonce:
exporter_secret:
695b1faa479c0e0518b6414c3b46e8ef5caea04c0a192246843765ae6a8a78e0

A.7.4.1. Exported Values
exporter_context:
L: 32
exported_value:
dafd8beb94c5802535c22ff4c1af8946c98df2c417e187c6ccafe45335810b58

exporter_context: 00
L: 32
exported_value:
7346bb0b56caf457bcc1aa63c1b97d9834644bdacac8f72dbbe3463e4e46b0dd

exporter_context: 54657374436f6e74657874
L: 32
exported_value:
84f3466bd5a03bde6444324e63d7560e7ac790da4e5bbab01e7c4d575728c34a

Acknowledgements

The authors would like to thank Joel Alwen, Jean-Philippe Aumasson, David Benjamin, Benjamin Beurdouche, Bruno Blanchet, Frank Denis, Stephen Farrell, Scott Fluhrer, Eduard Hauck, Scott Hollenbeck, Kevin Jacobs, Burt Kaliski, Eike Kiltz, Julia Len, John Mattsson, Christopher Patton, Doreen Riepel, Raphael Robert, Michael Rosenberg, Michael Scott, Martin Thomson, Steven Valdez, Riad Wahby, and other contributors in the CFRG for helpful feedback that greatly improved this document.

Authors' Addresses

Richard L. Barnes
Cisco

Karthik Bhargavan
Inria

Benjamin Lipp
Inria

Christopher A. Wood
Cloudflare

Taxing Growth

Hacker News
www.equitileconversations.com
2025-12-10 14:01:12
Comments...
Original Article

Gerald:

Hello and welcome to Equitile Conversations. I'm Gerald Ashley, and as usual I'm joined by my good friend and colleague, George Cooper. Today in this episode we're really pleased to welcome Doug McWilliams, who's one of the big names in macroeconomics and forecasting. I would hesitate to say it, but it's not just economics, because I think it ties into the politics of things as well. So, Doug, welcome.

Doug:

Thank you. Thank you so much for inviting me.

Gerald:

The show notes will show your full background and everything, and people can take a look at that. But maybe a good opening point is that you've recently co-authored a new book called Prosperity Through Growth, which is lovely timing, because it does seem a good time to be talking about growth, not just for us on this podcast but for anybody interested in the UK and Western economies. George, I know you've already gone into this book, Prosperity Through Growth. Any initial thoughts or comments?

George:

Yes, I'm I'm about halfway through the book, and I had the the great pleasure of seeing um of seeing Doug and uh Arthur Laffer, who's the other co-author of the of the book, uh, give a talk at the London School of Economics. It was it was really quite impressive. I think uh I think everybody would agree it was quite amazing uh the the energy that Arthur Laffer had. And I've I've never really been a flat tax person up to this point, but I have to say the first few chapters of the book are s um starting to convert me quite powerfully. Uh Doug, what do you want to weigh in on that side of the book?

Doug:

Yeah, surely. Shall I just say a little bit about the book and then come in and uh answer your question directly? Um the book's Prosperity Through Growth. Um, as you point out, the most famous of the co-authors is uh Arthur Laffer, the Laffer of the Laffer Curve. But it's also uh uh Michael Hintze, who's the Pope's economic advisor, amongst other things. And uh he um he is a fairly successful hedge fund owner, um, and Matthew Elliott, who made his biggest reputation, I think, by being the guru behind Vote Leave, which eventually won the vote for Brexit. And both Michael Hintze and Matt Elliott are now in the House of Lords, and I'm the sort of guy, the workhorse, as it were, who did quite a lot of the putting the stuff together. So that's the book. The subtitle is Raising Living Standards in an age of autocracy and AI. And what we're trying to say is what do you need to do to get growth in a very modern world where the elasticities have changed quite considerably? And um a bit like you, I was not originally a flat tax guy. I got introduced to it by Allister Heath, who's the editor of the Sunday Telegraph, when he chaired something called the Tax 2020 Commission, on which I sat about, I think, roughly 15 years ago. And um we went through the various options for tax and came to the conclusion that obviously it's politically complicated to get a flat tax, but if you could get a flat tax, there would be fairly considerable economic uh advantages. And in Prosperity Through Growth, we put forward a plan for getting us to a 20% flat tax rate with no national insurance contributions and get there over a 20-year period. And we argue that in total, that plus uh regulatory changes would raise GDP by just under 20%. That is worth having. It would make so many of our problems today so much uh less complicated. And that was our calculation.

Gerald:

Well, uh that is extremely interesting. Um obviously the the politics behind that sound um murderously difficult, to say the least. Um, I just sort of widen this out in terms of context, which is um we are just assuming that economic growth is a good thing. I know there are flat earthers in the world who think we don't need any more increases in economic activity, but I think that's very much a minority view. Um, the more important point maybe, um, why are we struggling so much in the UK and Western Europe to have anything other than very sclerotic growth?

Doug:

I'm also a member of something called the Growth Commission, which is an international commission set up to try and raise rates of economic growth, especially in Europe and in the UK. And we've looked quite hard at what is holding back growth. In many senses, trying to understand the impact of policy on growth has been my entire career. I started working for the Confederation of British Industry. I was apprenticed to a man called Sir Donald McDougall, who'd been Churchill's economic advisor. So there's quite a lot of history in there. And I was I eventually succeeded him as the CBI's chief economic advisor. And virtually every year we did an assessment of what tax changes would most boost the economy. Anyway, when the Growth Commission did a lot of work on this, um, what we discovered is two things really have been holding back countries like the UK. Um, the biggest is actually regulation. And um particularly um uh post the financial crisis, where the um financial regulation became a lot more aggressive, also environmental regulation became a lot more aggressive. And the final area, which is especially important at the moment, is planning. Now, a lot of people think that the problem with planning is sort of Tory nimbies who don't want building anywhere near them. But actually, on the whole, it is not. The problem is the whole raft of environmental uh legislation of various different kinds. They're just piled rule after rule after rule after rule. Recently we had a whole new town um banned from being started by the judges because of a spider. And the spider didn't even live anywhere near where the new town was, it was about a kilometer away. But natural England managed to show that the natural habitat of the spider included areas within a kilometer of where the spider actually was, and so this spider prevented a whole new town from being built. Now, when you've got things of that kind taking place, so you've got the planning, you've got the environmental regulation, of which, of course, one of the consequences is very high energy prices, you've got the financial regulation, you have a whole host of regulations that hold things back. On top of that is tax, and we have a whole series of taxes, and they bear much more heavily now than they ever did, because people who are talented, people who are wealthy, people who have capital, people who have entrepreneurial skills, they can move. The modern world makes them much more mobile. First of all, because in the modern world, they tend, even if they're based in one place, to have links with other places. You've got, you know, friends who are professors in London and in Sweden. You've got friends who um are involved in businesses that are in two or three different places. They're no longer just tied to one place. I mean, we perhaps, the three of us, maybe a bit more tied to individual places for historic reasons. But the vast majority of the bright young things of today operate in more than one regime. And what that means is when the going gets too hot in one place, they go somewhere else. So the consequences of all these things is that those places that have the wrong type of regulation and the wrong type of tax, they get really heavily penalized. And that's what we're suffering today.

Gerald:

So the uh the political challenge is to roll that back. Um, George, you had a comment.

George:

I was just going to say how how much I agree with what you said there, Doug. And I think there's another aspect that worries me as well, is as a businessman it worries me, and also as a sort of analyst of the economy. And that's it's not just the level of regulation, but the instability and the uncertainty around regulation. And and when I look at what's happened in the uh in the run-up to this latest budget, where we had effectively a whole raft of new taxes threatened, most of which weren't delivered in the budget, but they were threatened. There was the exit tax threatened, there was uh more draconian property taxes threatened, wealth taxes, and things like that. And now that all of these things have been put into uh the sort of public debate, we've got people responding naturally to the prospect that those taxes may come in. So we're we're starting to lose entrepreneurs for something that actually hasn't happened. It's just the threat and the concern of it. I think that's a big issue here.

Doug:

Part of what we did in the book, because we're conscious that having the right answer wasn't didn't solve the problem. And you need to have the right answer and find a way of making it work analytically, is we interviewed. We had, depends how you number them, because some of the interviews had two people turning up, but roughly 35 interviews with pretty high-powered movers and shakers. We managed to get there are eight living former prime ministers at the moment, which is a record, and we managed to interview five of the eight. We interviewed there were 11 living former chancellors, and we interviewed nine out of that eleven, and then we interviewed quite a lot of the cabinet secretaries and various other people like that to ask them about it and to give them credit. The most impressive interview by far was given by Tony Blair. And Tony Blair said, Just because things play badly in the opinion polls doesn't mean you can't do them. He said Mrs. Thatcher kept doing things that were unpopular, and she kept winning elections, and he said he'd learnt a lot from observing that. And he reckoned that most governments had at least a two-year period where when they came in, if they were doing the right thing, they had plenty of time to do the more crowd-pleasing things towards the end of their term of office. And he said it could be done. He also said that although the civil service is not hugely supported, and especially so at the moment, that may be partly as a result of some of the things that he himself did. But nevertheless, if you give them a good enough lead and you've prepared yourself properly, then you can get them to do it. David Cameron said much the same thing. He said, but you have to prepare yourself. Of course, this was just after Donald Trump had come to office and had his hundred days of large numbers of very early uh initiatives. Um, Cameron was quite impressed, and he said, if you don't plan it before you get into office, you get caught up in the hurly-burly of things and you can never get anything done. You have to come to power. You have to come to power with a sort of sense that you've got a mandate for change. And so you have to make it clear that some of the things that you will do will not be entirely popular. And he said you can get in on that basis. And he said you also have to make sure that you've got your plan prepared. And he said, if you do those things, um, you'll find you can get a lot further than you imagine.

Gerald:

Yeah, I think there was a comment on a TV show fairly recently, an interview with Lord King, the former governor of the Bank of England, who made this very point that he was somewhat astonished that the current administration uh under Keir Starmer does not seem to have an overarching plan. They do seem to be um very short-termist and very reactive. Maybe it's not for us to tell the government how to run government, but it doesn't seem to be the best way of going about things.

Doug:

Well, that was certainly the comment that we got from the politicians that we interviewed. I mean, Tony Blair was quite discreet about making comments specifically about the current government for obvious reasons. But reading between the lines, he seemed to be really critical about the extent to which they A didn't have a plan, and B, were so short-term in their behavior.

Gerald:

Yes. I mean, I I I think this is a good opportunity here to widen out a little bit the landscape of which uh the UK and indeed a large number of other economies are uh uh occupying now. And this is a favorite topic of George and I. Um, we don't stop people in the street and talk about it, but we're not far from that. And that is, of course, demographics, and the um you could say quite alarming fall um in demographics in terms of uh new births and uh aging population and everything. So against this that background, this is gonna make things worse. It's certainly not gonna help, is it?

Doug:

Well, I was involved in a book that was written about 25 years ago called The Demographic Investor by a man called Richard Cragg. I don't know if either of you read it. It was published by FT Publications, and we looked at the consequences of the demographics in the West, which are a fall in the number of children, slightly increasing longevity, although it's not in fact increased quite as quickly as it looked as if it was going to at that point, and then trying to fill the hole essentially with immigration. And um, we did argue that um the demographic trends were not going to be helpful for, first of all, economic growth, secondly, investment returns, because presumably the older people are going to be saving, and they're going to be flooding the markets with money driving down yields and various other consequences of that kind. More recently, with the Growth Commission, we got our Japanese commissioner, uh Naira Yahiro, to have a look at how Japan have been coping. Because of course, Japan has been facing the sort of demographics that we're talking about a little bit earlier than us. And also they have rather greater longevity than we have because they they eat better and so on. And uh the conclusion was fascinating. The first thing was that Japanese GDP per capita hasn't done quite as badly as you might have imagined, although I think that uh they may be starting to be found out by the markets and uh their policy may uh uh uh may as yet unravel. The second thing was that uh if you did the right things to increase labor force participation amongst those people who were there, including raising the retirement age and various things like that, you could um to a certain extent assuage some of the problems caused by demographics. But what we've been doing in the UK is the opposite. What we've been doing instead in the UK has been encouraging people of working age not to work. And the combination of adverse demographics and an increasing failure to participate from the working age population is not a policy that's going to be terribly good for our long-term prosperity.

George:

Yes, and I think Doug, there's also just recently another angle to the demographic story that's come out, which is quite worrying, and that is the government's trumpeting a success in reducing inward immigration, but it turns out that underneath, that reduction has largely been driven by an increase in emigration from the UK. And I don't think we know the details so much, but there's a decent chance that a lot of that emigration is going to be the high-skilled, high-wage earners. So we may be uh running into yet another aspect of this demographic problem in this country.

Doug:

George, that's a really important point, but it's actually even worse than that, because there is an age skew amongst the people who are leaving the UK, and we are very much losing the youngest, and I would assume the brightest, although there isn't actually an IQ test when you leave the country to discover whether you are the brightest or not. But it does seem to be that the ones with the most get up and go have got up and gone.

Gerald:

And this is shades of exactly what happened um winding back in history. One thinks of the brain drain and the tax regime in the mid-70s. And um, just to keep this a nice pessimistic little podcast, um, we've got a huge problem with government debt, or maybe more precisely government spending. I suppose if there was growth, um it would be much easier to fund uh that spending. Um, we've had this discussion on previous podcasts, and George and I think there's no get out of jail card from all of this, other than possibly inflation. Does that make sense?

Doug:

Well, cutting government spending would definitely do a lot to improve the situation. Cutting government spending would um uh reduce the deficit, it would reduce the pressure on debt. The markets tend to be quite positive when you've got a government that's cutting spending. You can see this in Argentina. Okay, they've had a uh the the markets have gone up and down a little bit on this, but on the whole, the markets have treated fairly positively the aggressive line to cut spending there. Um, so I think cutting spending, if you could achieve it, you'd get a double dividend. First of all, there would be less pressure on the debt market because you wouldn't have to issue so much debt, but secondly, the markets would see you in a more positive light. Because projecting forward, if you think that the the economy is going to grow, you get much less worried about debt than if you think the economy is stagnant. So I think that you you do get a double dividend if you do that. Um is inflation a solution? I'm not convinced. I mean, like both of you, I lived through the 70s, and it didn't seem to me that that was a great time to be uh around. I think inflation is one of these things, it's a sort of trick you can only really pull off once in a sense, because the only inflation that actually makes your life easier if you're a government is unanticipated inflation. And once it gets built into expectations, you no longer get any benefits from it. So I don't see inflation as a way out. I'm sorry, I think inflation would actually make things worse. Oh, yeah. And by the way, of course, we already have inflation. I mean, our inflation is roughly twice the average for the uh for the G7. So uh we're already inflating our way out.

Gerald:

I I think I was slightly tongue-in-cheek there, but there's one other little element which I think is uh a primrose path that we wandered down in the 1970s and are doing again, which is indexation. So, you know, we're seeing a lot of indexation of benefits, a lot of um a lot of wage demands coming through and being acceded to. And there's a large part of the price pressure now that is is sort of built in as a ratchet, isn't it?

Doug:

Um that certainly makes things worse. Although it's hard to avoid indexation once you have a lot of inflation, because otherwise you get really quite dramatic changes in relative prices. It does lock the system in in such a way that it makes it more difficult to get out. I was actually involved in negotiating, I mean, this tells you how old I am. I was actually involved in negotiating pay policy with the trade unions in the 1970s. And one of the things I brought to the party was I persuaded them once that the only way out was to index on the basis of future inflation, not past inflation. Because if you kept on indexing on the basis of past inflation, that you know, the the trend just went upwards and upwards and upwards. And we managed to break the cycle. And I do claim some credit for this. Uh I managed to claim the powers that be in the CBI, who then persuaded the TUC, and we then both persuaded the government, that when we were setting pay policy, pay policy should be based on expectations of future inflation, not just um uh inflation in the past. And that did to some extent help decelerate inflation in the late 70s. But it was always trying to slam a top on a boiling kettle, and it wasn't going to stop the steam coming out eventually. But it did have a little bit of an effect there. Um I just don't see inflation as really the way out on this. Um I think it would just give us a further twist, um, but we would be moving towards banana republic territory if we go in that direction.

George:

Oh, to to be clear, I don't think either Gerald or I are saying inflation is a solution. We're we're saying it's a likely consequence. It's a it's a sort of um it's a symptom of the disease that we've got rather than a solution to it. Definitely not a solution.

Doug:

Well, we've got some already, and we've got more than anywhere else. And we haven't really I mean, what is fascinating is twice within five years the MPC has let inflation uh uh go rip. The first time it happened elsewhere, although it was worse here, the second time it has not happened elsewhere yet, although I think it may. Um, but um the fact that we have inflation pretty well 4% at the moment, and depends which measure you use, of course, is something that, you know, the MPC ought to be resigning over. And they've been cutting interest rates.

George:

I think this sort of touches on a very interesting thing that's happened in macroeconomic debate, and I think in the academic circle as well as the policy-making circle, and that is if we go back to, you know, let's say when Gordon Brown was chancellor, he used to talk about balancing the budget over the economic cycle. Nobody talks about balancing the budget at all, anywhere, anymore. That's just completely gone from the discussion. And I think a lot of it is tied to this idea that's crept into economics or galloped into economics in the last few years, modern monetary theory, this idea that deficits just don't matter. Uh Doug, any thoughts on why this change has come about?

Doug:

It got worse. First of all, after the financial crisis, when um no one's economy was doing quite as well as it had previously, and of course there's been some slowdown in productivity, partly due to the regulations. You know, they shut the stable door after the horse had bolted, but shutting the stable door has meant it's a lot more difficult to get the uh to get economies growing. But then, of course, COVID um created a whole new set of unbalanced budgets and almost created a set of situations where governments gave up, really. They borrowed so much at the time of COVID, they couldn't quite see their way back into managing their um their debt ratios down, and in most cases, managing their deficits down. Um, the United States has been able to get away with it because it's pretty huge. The European Union is sort of trying to make itself bulletproof in that way with monetary union and setting up the euro. And that is just, I think, given itself a lot more rope, but it will hang itself just as surely. And in fact, the EU is probably in a worse situation because they are also largely ex-growth. They still have growth in uh parts of Eastern Europe, but for most of the rest of Europe, there isn't a hell of a lot of growth taking place over there. And that means that uh their ability to pay off their debt becomes increasingly less impressive. The money got lent to us, of course, from the surplus countries, mainly the Chinese and various other people in the Far East and in the Middle East. Um, and um they've been quite happy so far, but at some point they're going to become extremely unhappy that they're not getting a decent return on their investment. And I think that's when the bond markets will start to bite. Um, I'd be amazed if we get to the end of this decade without a major bond crisis, which is the market reaction to the trend that you mentioned, George, which is of governments sort of more or less giving up on balancing budgets. And I think the continued increase in debt issuance is going to reach its natural limits.

Gerald:

Yes, I think one other little sort of factor here, which is probably just a uh a temporary one for political reasons, but uh on the political scene now there's a lot of talk of whether we need to get back being closer to the EU. Some people want to undo Brexit. There seems to be a lot of confusion, as usual, in political circles about the difference between the single market and a customs union. Um, the idea that rejoining the customs union would be a good idea seems rather eccentric to me. Um, is this just a little bit of short-term uh politics, or do you think uh it it's a serious swing back the other way?

Doug:

I think it is a serious swing back the other way. I've been quite disappointed. Some of the more intelligent people I know have climbed onto this bandwagon. I liken it to kamikaze pilots. But the thing about kamikaze pilots is they didn't have passengers on their planes. They were risking their own lives, and you can have a degree of respect for people who are prepared to put their own life at risk. What seems to be happening now is that people despairing of politics are um playing kamikaze pilots with passenger planes, and we are the passengers, and the British economy is the passenger, and some of the risks they're threatening to take are frightening. Now, let me just quote some statistics. I'm sorry to be boring on a Sunday morning, but let me do so anyway. We have the most successful tech sector in Europe. The reason why is we are outside the Digital Markets Act, the Digital Services Act, and the AI Act, which are three European acts which heavily constrain tech growth in the EU. As a result, the thing that I call the flat-white economy, which is essentially the tech sector on a fairly broad definition, about 15% of GDP, the flat-white economy grew by 4.1% in real terms in the year to Q3. Britain's service sector exports, which um are largely driven by this, grew by 6% in real terms in the year to Q3 this year, and Britain's plant and machinery investment grew by 9% in the year to Q3. This is the booming sector of the economy. I mean, what we forget is against a fairly flat overall picture, the action is a 15% part of the economy, which is booming. If we were to go back and join - well, even the single market, I think, although it's a bit more flexible - it would probably mean that the uh acts that make it possible for the UK to grow because we're outside EU rules, these acts would apply to us. And if they applied to us, we would kill growth in these sectors stone dead. I mean, things may be bad, but they can get a hell of a lot worse. And I would argue that if we go back in, that's what the consequence would be.

Gerald:

Well, I have to say that's um uh not desperately optimistic, obviously. Um maybe to uh George, a few points on the optimistic side.

George:

Well, I'll pick you up on that, Gerald, because actually what Doug's just said there is that there is a big section of the economy that is booming. And that's something that we should be celebrating, and I think we should be trying to learn from. And as Doug says, if the diagnosis is true, that it's booming because we can do those industries better with lower regulation, we should be asking ourselves, what else are we holding back with higher regulation? You know, that I think is, to me, the big problem. I see it in the financial services industry, you know, where our company works, and the slew of regulation and the constant changes and the constant additions to regulation are a genuine drag on business. So, yeah, the deregulation I think is a key lesson here.

Doug:

George, can I just very quickly quote a statistic, again, to be boring. I was quoted a figure by a man from Goldman's at the Mansion House: in Hong Kong it costs them $70 to onboard a client. In the UK, it now costs them $10,000 to onboard a client, just because of the regulation. And that I think makes your point really well.

George:

Yeah, I could completely believe that. And in the financial services, it goes through the whole economy. I mean, one of the things that you hear a lot about is small businesses complaining about just the difficulty in setting up something very simple like a bank account for a new business.

Doug:

Yeah.

George:

It's incredibly hard, it's an incredible barrier to entry. And you know, these things can and should be addressed quite quickly.

Gerald:

I'm gonna try and say that we should be positive. I know in past podcasts, uh, I always seem to be uh the glass-half-empty merchant, and George is always a little bit more uh optimistic. Um Doug, let's go to a much more optimistic topic. I know the enthusiasm of both of you for vintage cars and cars of all shapes and sizes, and um, I'll try and restrict you guys from waxing lyrical too long. Um so no book recommendations in this podcast. We're gonna go for favourite cars. As the sort of non-petrol head, I'll go first and duck out of the way very quickly. Um, my brother for a period of time owned a Jensen Interceptor. Um, many people will remember it mainly for the really beautiful lines of the car and that fantastic rear curved window. And um looking it up, it had an engine size that is fairly mind-boggling. Um I think it was six or even seven litres, depending on which version there was. The thing that you don't see in the sort of glossy films about the Jensen was he spent most of his time having it repaired. So it was nicer to look at than to get much driving out of. George, we'll let you go first on your car, and then um we'll give free rein to Doug.

George:

Okay, well, uh yeah, I thought it would be a bit fun, given, as I say, that we're all petrol heads, to recommend a favourite classic car. And given that the topic of the podcast is economics, I'm gonna go with the Mini. I think the Mini is uh a fantastic classic car. I mean, a derivation of it is still in production now, but in its day, it was a family car, it was a racing car, it won lots of rallies, and it was also a very important economic car in that it was very cheap, very economical, um, and it helped get a lot of society into the car-owning space and gave them a lot of freedom. So um I'm gonna say the Mini for my favourite classic car.

Doug:

I'll hand it over to you, Doug. Well, my mother had a Mini, um, but she traded it in for a Triumph Spitfire. I mean, the Mini was great and it was a classic. I really love the car I did the Peking to Paris rally in, which is a Bentley. And my particular one has still got most of the rally changes to it. Um, it's got the rally suspension, it's got the rally transmission, it's got the rally engine modifications. So it is quite a fun car, and it can move a bit more quickly than most of those sort of rather hefty Bentleys. But can I just give a small mention to my convertible Jaguar XJS? I think I've had more fun in that car than in any other car I've owned. It's really great for sort of prancing around, wind in your hair, and it's a very refined and comfortable car. And uh I've driven it all round France, all round Spain, all around Germany. Um, it really is a wonderful tourer.

Gerald:

Well, gents, you've been very restrained there. I thought we were going to have to rein you in on time, but um very, very interesting sort of trio of cars there. Uh Doug, thank you very much for coming along. Both of us, and I know a lot of our listeners are avid readers uh of the material that you put out from time to time on the current state of play. Uh George has got your book, and in fact, he's got my copy as well. So I shall be diving into that soon. So from us both again, thank you very much for today.

Doug:

Thank you so much for having me. It's a pleasure being on and uh listening to the two of you.

George:

Thank you, Douglas. It's a pleasure and an honor.

Doug:

Thank you.

‘Already had a profound effect’: parents react to Australia’s social media ban

Guardian
www.theguardian.com
2025-12-10 14:00:22
We asked you to share your views on your children’s use of social media and how the ban is affecting your family. Here is what you told us For some parents, social media sucks up their children’s time and steals them away from family life, instilling mental health issues along the way. For others, i...
Original Article
Social media ban illustration
Composite: Victoria Hart/Guardian design

For some parents, social media sucks up their children’s time and steals them away from family life, instilling mental health issues along the way. For others, it provides their children with an essential line to friends, family, connection and support.

When Australia’s social media ban came into effect on Wednesday, millions of under-16s lost access to their accounts and were prevented from creating new ones.

Guardian Australia has spent the past year reporting on the ban, how it will be implemented and potential unintended consequences. But we wanted to know how it is affecting parents, children and families now it has come into effect. Is it a force for good or a terrible policy mistake?

So we asked Guardian readers to tell us, and more than 100 of you replied. Here, 20 people share how the ban has affected them, their children and their families.

Podcast: Zines Are Back

404 Media
www.404media.co
2025-12-10 14:00:20
Our new zine; a very strange change at Instagram; and the creator of ICEBlock is suing the U.S. government....
Original Article

We start this week with news of our zine! We’re printing it very soon, and walk you through the process. Independent media is turning back to physical zines as a way to subvert algorithms. After the break, Emanuel tells us about some very weird Instagram changes. In the subscribers-only section, Joseph explains ICEBlock’s lawsuit against the U.S. government.

Listen to the weekly podcast on Apple Podcasts , Spotify , or YouTube . Become a paid subscriber for access to this episode's bonus content and to power our journalism. If you become a paid subscriber, check your inbox for an email from our podcast host Transistor for a link to the subscribers-only version! You can also add that subscribers feed to your podcast app of choice and never miss an episode that way. The email should also contain the subscribers-only unlisted YouTube link for the extended video version too. It will also be in the show notes in your podcast player.

Timestamps:
1:37 - 1st Story - 404 Media Is Making a Zine ; buy the zine here .
23:35 - 2nd Story - Instagram Is Generating Inaccurate SEO Bait for Your Posts
36:09 - 3rd Story - ICEBlock Creator Sues U.S. Government Over App’s Removal

About the author

Joseph is an award-winning investigative journalist focused on generating impact. His work has triggered hundreds of millions of dollars worth of fines, shut down tech companies, and much more.

Joseph Cox

Gin is a very bad software library

Lobsters
eblog.fly.dev
2025-12-10 13:57:11
Comments...
Original Article

Gin is a very bad software library

A software article by Efron Licht.

December 2025




In my experience, Go is the best general-purpose programming language for backend development, and a huge part of this comes from the thoughtful design of its standard libraries. If you are willing to be a little bit patient, read the documentation, and spend some time getting familiar with their idioms, you have everything you need without needing to go far afield.

Most programmers are not willing to be a little bit patient. They google ‘go web framework’ and then they pick the first web result. More than likely, this is Gin, a kind of insidious fungus masquerading as a software library.

Like many fungi,

  • It is easy to pick up Gin and almost impossible to remove it
  • Unless you’re extremely careful you’ll pass it on to your friends
  • The features that make it inimical to human life are directly related to its success, and it will likely outlive you and me despite everything we do to eradicate it.
  • You can learn to live with it, but you shouldn’t have to.

Gin is not the only bad library - in fact, it’s not nearly the worst library in common usage - but it is the library that pisses me off the most day to day, and I think it’s emblematic of many of the biggest flaws of software library design writ large.


1.1. Tablesetting & Caveats

Before we begin:

  • This article is mostly rant, for which I apologize in advance.
  • Gin is a very old library as far as Go goes and is based off an even older one, go-martini. Many of its worst mistakes are artifacts of that time, and some things that appear to be mistakes are an artifact of predating that functionality in the standard library.
  • I shouldn’t have to say this, but you are not a bad person if you use, have used, or work on Gin: please do not use this article as a way to brigade people for using “unfashionable” code. People making software decisions based off ‘clout’ is part of how we got into this mess.
  • Update: I am not trying to imply that you should never use libraries for backend development or that using something besides net/http is somehow “wrong”. I have happily used libraries like gorilla/mux and its associated friends and I see nothing wrong with, for example, chi. I am using the standard library as a comparison because it’s
    • What I use
    • A very good library (not perfect, but very good ).
    • A dependency every Go programmer has by definition
  • I am saying that Gin is a bad library, and that you should be suspicious of libraries that share its flaws.

2. Comparison of Basic Servers in net/http and gin

OK, let’s get to it. On a surface level, basic HTTP work doesn’t look too different in net/http and Gin:

2.0.1. basic server: net/http

func main() {
    // route METHOD / PATH to handlers: here, GET /ping.
    mux := http.NewServeMux()
    mux.HandleFunc("GET /ping", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(200)
        json.NewEncoder(w).Encode(map[string]string{
            "message": "pong",
        })
    })
    // Create a HTTP server...
    srv := http.Server{
        Handler: mux,     // that uses our router...
        Addr:    ":8080", // on port 8080
    }
    srv.ListenAndServe()
}

2.1. Basic Server: Gin

func main() {
  // create a default gin router / server / engine
  r := gin.Default()
  // route METHOD / PATH to handlers: here, GET /ping.
  r.GET("/ping", func(c *gin.Context) {
    c.JSON(http.StatusOK, gin.H{
      "message": "pong",
    })
  })
  r.Run()
}

On a surface impression, gin might seem easier - it’s slightly fewer lines of code, and there seems to be less configuration.

But this is all surface.

The proper way to judge a map is by the terrain it covers. In other words, before you choose any software, first you should know the problem you’re trying to solve with it. So before we pick on Gin, let’s review that terrain - HTTP.

3. HTTP is not that complicated: a brief review

Happily, HTTP is not that complicated and we can go over the basics in about ninety seconds and a handful of chalk drawings.

The HyperText Transfer Protocol has a client send HTTP Requests, and a server responds with HTTP Responses.

A client sends a HTTP Request to a server . The server parses the request, figures out what the client wants, and sends back a HTTP Response .

This is very quick and dirty. If you want more details on the structure of HTTP, my article series ‘Backend from the Beginning’ builds an entire HTTP library from scratch and goes over all these parts in detail.

3.1. Diagram: Basic HTTP Flow

diagram source
mermaid diagram showing basic HTTP request flow.

3.2. HTTP Requests

HTTP Requests have four main parts, separated by newlines:

  1. A Request Line that specifies the HTTP method (GET, POST, etc), the path being requested, and the HTTP version.
  2. One or more Headers that provide metadata about the request.
  3. A blank line.
  4. An optional Body that contains data being sent to the server (usually JSON).

I.E, they look like this:

3.2.1. Chalkboard Diagram: HTTP Request Example

literal chalkboard diagram showing an example HTTP request and its parts.
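
In plain text, a request along those lines (the path, host, and body below are purely illustrative) looks something like this:

POST /greet HTTP/1.1
Host: example.com
Content-Type: application/json
Content-Length: 17

{"name": "efron"}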

3.2.2. Chalkboard Diagram: HTTP Request Structure

literal chalkboard diagram showing the structure of an HTTP request.

3.3. HTTP Responses

HTTP Responses have a similar structure, with four main parts, separated by newlines

3.3.1. Chalkboard Diagram: HTTP Response Structure

literal chalkboard 
diagram showing the structure of an HTTP response.
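
Again in plain text, a response to the example request above might look like:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 28

{"greeting": "hello, efron"}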

3.3.2. Chalkboard Diagram: HTTP Response Status Line

3.4. A couple of structural notes about HTTP

  • These parts are ordered - you can’t change your mind about the request or status line once you’ve sent them.

  • Once you’ve sent the body, you (usually) can’t send any more headers.

  • You don’t have to send the whole body at once on either side - you can stream it (see the sketch after this list).


  • You now know more about HTTP than many senior web developers. I wish this were not true.
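
A minimal sketch of that streaming point, using the standard library's http.Flusher (the /ticks endpoint is made up for illustration):

// stream.go: write the response body a chunk at a time instead of all at once.
package main

import (
    "fmt"
    "net/http"
    "time"
)

func main() {
    http.HandleFunc("GET /ticks", func(w http.ResponseWriter, r *http.Request) {
        flusher, ok := w.(http.Flusher) // most ResponseWriters also implement http.Flusher
        if !ok {
            http.Error(w, "streaming unsupported", http.StatusInternalServerError)
            return
        }
        for i := 0; i < 3; i++ {
            fmt.Fprintf(w, "tick %d\n", i) // each write is part of the same response body...
            flusher.Flush()                // ...but is pushed to the client immediately
            time.Sleep(time.Second)
        }
    })
    http.ListenAndServe(":8080", nil)
}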

Fundamentally, the structure of our solution - the HTTP library - should mirror the structure of the problem. If the solution is significantly larger than the problem, one or more of the following is true:

3.5. Gin is significantly larger than the problem domain

The go stdlib’s net/http covers all of HTTP in 35 files of pure go and 25,597 lines of code, including the server, client, TLS, proxies, etc.

Gin and its dependency chain cover only server-side handling and require 2,148 files and 1,032,635 lines of code, including 80,084 lines of platform-specific GNU-style assembly.

This is nuts . You can crack an egg with a 155mm artillery shell. This does not make it a good idea, even if you add rocket boosters and laser guidance.

4. “minimal” interfaces

Some people would argue that the code weight doesn’t matter - what we should be worried about is the API : i.e, the interface we have to keep in our heads. They’d probably sneak in a quote about premature optimization or something. No problem.

The following diagrams illustrate the ‘minimal’ APIs to understand net/http and gin.

4.0.1. Diagram: minimal interface for net/http server

diagram source
mermaid diagram showing minimal interface for net/http server.

4.0.2. Diagram: “minimal” interface for gin server and kafkaesque nightmare:

diagram source
mermaid diagram showing minimal interface for gin server and kafkaesque nightmare.

As hard as it is to believe, this graph omits a ton of details.


5. Choosing Gin: Human beings are bad at making judgements (especially about things they consider boring)

If you’re reading this, you’re probably a progammer. Take a moment to think about how the dependencies were chosen for your current project(s). Ask yourself - or better yet, a team-mate - the following questions re: your major dependencies:

  • What are your major dependencies?
  • Who chose to add them to your project? Why? When?
    • Did they write those decisions down anywhere?
    • If so, did they ever go back to re-evaluate those decisions?
  • How many options did they evaluate?
    • What was the evaluation process?
  • Was one of those options ‘write it ourselves’?
    • If so, why didn’t you do it?
    • If not, why didn’t you consider it?
  • What are the perceived strengths and weaknesses in the following categories?
    • Familiarity
    • Performance (what kind?)
    • API surface
    • Documentation
    • Test coverage
    • Code bloat
      • Within the package
      • Within its dependency tree
    • Security (did someone vet it? Did you read the code? Can you read the code? Does it rely on, say, opaque binaries or platform-specific assembly)?
  • Are more features better or worse? Why? Is this always the same?
  • How hard will it be to switch if this decision is wrong?
    > This final point is Gin’s curse - it is incredibly difficult to remove - and, I think, the root of its success. We’ll come back to it in our final section.

For the vast majority of projects, there is no answer to these questions, because no one ever thought about it. They went into google search or chatgpt, typed “best golang web framework reddit” and called it a day. I know this because I have seen it happen at least twenty times at half a dozen software houses. While understandable - software is a busy, stressful job - this is not acceptable. This is the kind of reasoning you apply to choosing lunch, not critical software dependencies for million or billion-dollar projects.

6. Gin is too big: Code & Binary Bloat

In anything at all, perfection is finally attained not when there is no longer anything to add, but when there is no longer anything to take away. ~ Antoine de Saint-Exupéry.

Gin is too big. Gin is enormously, staggeringly big. Its dependency tree is over 55 MiB. If we just take the lines of code in Gin and its dependencies - ignoring comments and documentation - we have 877,615 lines. This huge, enormous, elephantine cost must be paid by every single project on every single git clone or go build , and some of that cost leaks into the compiled binary too.

Gin contains, I kid you not, four five at least six different JSON libraries, not counting the one built in to the standard library. (more about this later.)

These include

  • goccy/go-json (1204K)
  • bytedance/sonic (13 MiB!!!!)
  • quic-go/quic-go/qlogwriter/jsontext (12 KiB - you pass)
  • ugorji/go/codec (3MiB!!!)
  • ./github.com/quic-go/quic-go/qlogwriter/jsontext
  • gabriel-vasile/mimetype/internal/json
  • json-iterator/go (348K)

I thought there were only four, but I kept finding more.

6.1. Table: Comparison of Gin and other Frameworks, Programs, Libraries etc.

The following table compares the code bloat of Gin to other popular and/or historically important programs or written material.

| Program or Library | Description | Files | Code Lines | %target | Size | %size |
|---|---|---|---|---|---|---|
| github.com/gin-gonic/gin | A popular go web framework and OSHA violation | 2189 | 877615 | 100.000% | 55.461 MiB | 100.00% |
| lua | General-purpose scripting language a-la Python or Javascript | 105 | 36685 | 4.180% | 14.926 MiB | 26.91% |
| chi | Minimalistic go HTTP framework | 85 | 7781 | 0.887% | 4.746 MiB | 8.56% |
| Command and Conquer: Red Alert | A best-selling real-time strategy game (1996) with its own GUI, networking code, custom game engine, etc etc etc. | 1893 | 368288 | 41.965% | 39.957 MiB | 72.05% |
| DOOM | id Software's revolutionary first-person shooter, including networked play | 152 | 39250 | 4.472% | 2.375 MiB | 4.28% |
| gorilla/mux | Popular go HTTP router | 19 | 6214 | 0.708% | 1.059 MiB | 1.91% |
| labstack/echo | Popular go web framework | 600 | 326000 | 37.146% | 23.855 MiB | 43.01% |
| golang/go/src | The go programming language, its runtime, tooling, and compiler | 9591 | 2244096 | 255.704% | 143.129 MiB | 258.07% |
| MechCommander2-Source/ | 2001 real-time strategy game | 1875 | 858811 | 97.857% | 1.771 GiB | 3269.17% |
| MS-DOS/v1.25/ | Complete operating system, predecessor of Microsoft Windows | 20 | 12001 | 1.367% | 504.000 KiB | 0.89% |
| MS-DOS/v2.0/ | | 116 | 41417 | 4.719% | 2.527 MiB | 4.56% |
| MS-DOS/v4.0 | Final release of MS-DOS with true multitasking support | 1065 | 332117 | 37.843% | 23.203 MiB | 41.84% |
| original-bsd/ | The original Berkeley Systems Distribution operating system and hundreds of programs, libraries, and games | 9562 | 1526953 | 173.989% | 185.387 MiB | 334.27% |
| Quake | id Software's first-person shooter, including 3D graphical engine, GUI, networking code, etc etc | 516 | 170211 | 19.395% | 15.266 MiB | 27.53% |
| Research-Unix-v10/v10 | Original 'research' Unix before the split into BSD and other distributions, including networking, productivity software, and games | 8755 | 1671269 | 190.433% | 137.430 MiB | 247.80% |
| zig/src/ | Systems programming language and tooling, including an entire C compiler for dozens of targets | 175 | 473612 | 53.966% | 24.094 MiB | 43.44% |
| musl | Implementation of core C library used by Linux and other operating systems | 1922 | 64837 | 7.388% | 9.199 MiB | 16.59% |
| Bible (King James Version) | Popular translation of the Jewish & Christian core religious text | | 31104 | | 4.436 MiB | |
| War and Peace | Tolstoy's extremely long novel about the Napoleonic wars | | 23637 | | 3.212 MiB | |

This is, to be blunt, completely unacceptable. If you picked a sane framework like chi (you don’t need a framework, but for the sake of argument), you could bundle in DOOM, a C compiler to build it with (let’s pick Zig), and an operating system to run it on like MS-DOS 4.0, and throw in War and Peace and the entire King James Bible for good measure and you’d still have less bloat than Gin and its source tree.

This bloat carries over to the compiled binary, too.

6.2. Gin includes too much code even if you don’t use the features

While Go’s compiler is pretty good about eliminating unused code, Gin does its best to touch as many different libraries as it can at import time so the compiler can’t do that.

To demonstrate, let’s strip down our examples even further and build the simplest possible Gin program and an equivalent net/http server, and see how big the resulting binaries are.

// simplegin.go
func main() {
  e := gin.Default()
  e.Any("/", func(c *gin.Context) {
    c.Writer.WriteHeader(200)
  })
  e.Run()
}
// simplehttp/main.go
func main() {
	http.ListenAndServe(":8080", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(200)
	}))
}

Let’s examine the compiled output:

#!/usr/bin/env bash
du -H simplehttp simplegin

19640K  simplegin
7864K   simplehttp

Maybe it’s just debug symbols? Let’s strip the binaries and try again:

#!/usr/bin/env bash
strip simplehttp
strip simplegin
du -H simplehttp simplegin
13572K  simplegin
5444K   simplehttp

Where’s all this bloat coming from? After all, we don’t use most of Gin’s features… Let’s use GODEBUG=inittrace=1 to see what packages are being initialized and figure out where it all comes from.

GODEBUG=inittrace=1 ./simplegin
init internal/bytealg @0.005 ms, 0 ms clock, 0 bytes, 0 allocs
init runtime @0.078 ms, 0.10 ms clock, 0 bytes, 0 allocs
init crypto/internal/fips140deps/cpu @0.63 ms, 0.003 ms clock, 0 bytes, 0 allocs
init math @0.67 ms, 0 ms clock, 0 bytes, 0 allocs
... many, many lines omitted

There’s a lot of noise here, so I’ll summarize a couple highlights:

  • You pay for toml , gob , yaml , protobuf , xml , and at least two JSON libraries, regardless of whether you use them:

    • init encoding/gob @1.6 ms, 0.087 ms clock, 26496 bytes, 395 allocs
    • init github.com/gin-gonic/gin/codec/json @1.8 ms, 0 ms clock, 0 bytes, 0 allocs
    • init github.com/goccy/go-yaml/token @1.8 ms, 0.006 ms clock, 3784 bytes, 18 allocs
    • init github.com/goccy/go-yaml/printer @1.8 ms, 0 ms clock, 0 bytes, 0 allocs
    • init github.com/goccy/go-yaml/parser @1.8 ms, 0 ms clock, 336 bytes, 2 allocs
    • init github.com/pelletier/go-toml/v2 @1.8 ms, 0 ms clock, 0 bytes, 0 allocs
    • init google.golang.org/protobuf/reflect/protoreflect @2.0 ms, 0 ms clock, 0 bytes, 0 allocs
    • init google.golang.org/protobuf/reflect/protoregistry @2.0 ms, 0 ms clock, 88 bytes, 3 allocs
    • init google.golang.org/protobuf/proto @2.0 ms, 0 ms clock, 80 bytes, 2 allocs
    • init encoding/json @1.7 ms, 0.005 ms clock, 32 bytes, 2 allocs
    • init encoding/xml @1.7 ms, 0.016 ms clock, 18776 bytes, 6 allocs
    • init github.com/pelletier/go-toml/v2 @1.8 ms, 0 ms clock, 0 bytes, 0 allocs
  • you pay for http/3 (QUIC) even if you aren’t using it

    • init github.com/gabriel-vasile/mimetype @2.1 ms, 0.022 ms clock, 20024 bytes, 243 allocs
    • init github.com/quic-go/quic-go/internal/protocol @2.6 ms, 0.005 ms clock, 144 bytes, 3 allocs
    • init github.com/quic-go/quic-go/internal/utils @2.6 ms, 0 ms clock, 48 bytes, 1 allocs
    • init github.com/quic-go/quic-go/internal/wire @2.6 ms, 0 ms clock, 0 bytes, 0 allocs
    • init github.com/quic-go/quic-go/internal/handshake @2.6 ms, 0 ms clock, 32 bytes, 1 allocs

This cost is a direct result of Gin’s horrific ‘everything and the kitchen sink’ API - more about that in a bit.

6.3. Sidenote: go build -tags nomsgpack

As it turns out, the gin team has been trying to deal with this enormous binary bloat.
You can eliminate the dependency on msgpack by adding the build tag nomsgpack, which shaves ten megabytes off the binary. This should be the default, but still, good job.

7. Gin’s API has the surface area of an industrial heat sink and sucks nearly as much

Increasingly, people seem to misinterpret complexity as sophistication, which is baffling – the incomprehensible should cause suspicion rather than admiration .

Niklaus Wirth, inventor of PASCAL

A quick note on UNIX before we dive into Gin’s API.

UNIX is one of the oldest traditions in software still standing. In this tradition, good APIs have a small surface that exposes deep functionality. The classic example is UNIX’s filesystem API, which made it a long way with only six verbs: OPEN, CLOSE, READ, WRITE, SEEK, and FCNTL - this is enough to handle disk drives, shared network filesystems, terminals, printers, etc.

There’s a good argument to be made that this is not the correct filesystem API anymore - FCNTL is clearly cheating, and it doesn’t handle nonblocking or concurrent IO that well. See the excellent talk What Unix Cost Us by Benno Rice for a discussion of this topic.

For more on UNIX programming, see my Starting Systems Programming series of articles.

Go is firmly part of this tradition, and as such, its standard library tries to minimize API surface where possible. The vast, vast majority of interfaces in Go’s standard library have three or fewer methods, usually just one. Even the largest interface in Go, net.Conn, tops out at 8 methods. Gin… does not do this.

reflect.Type doesn’t count: it’s never meant to be implemented by external libraries: all its implementors are internal codegen, and reflection is always kind of an exception to every rule. Please don’t @ me.

Let’s take a look at how net/http is designed to see this philosophy in action.

7.1. net/http is a beautiful API

Server-side HTTP in go can be summarized in four types and one sentence: The http.Server parses packets into http.Request structs, hands them to a http.Handler , which writes a response via http.ResponseWriter .

Usually, that handler is some kind of router like http.ServeMux that dispatches to different sub-handlers - but it doesn’t have to be.

To give a quick example, here’s a minimal HTTP server using only the Go standard library that responds to POST /greet .
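
A minimal sketch of such a server (the greetRequest and greetResponse types here are illustrative, not from any particular codebase):

package main

import (
    "encoding/json"
    "net/http"
)

type greetRequest struct {
    Name string `json:"name"`
}

type greetResponse struct {
    Greeting string `json:"greeting"`
}

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("POST /greet", func(w http.ResponseWriter, r *http.Request) {
        var req greetRequest
        // r.Body is an io.Reader; json.NewDecoder reads the request body from it.
        if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
            http.Error(w, "bad request", http.StatusBadRequest)
            return
        }
        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(http.StatusOK)
        // w is an io.Writer; json.NewEncoder writes the response body to it.
        json.NewEncoder(w).Encode(greetResponse{Greeting: "hello, " + req.Name})
    })
    http.ListenAndServe(":8080", mux)
}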

While we use a number of types here, there’s only a handful of interfaces we need to understand this code - http.Handler , http.ResponseWriter , and the omnipresent io.Reader and io.Writer interfaces used by the JSON encoder and decoder.

7.1.1. net/http: interface summary and graph

// 43 words of interface surface area, not counting comments
type Handler interface {
  ServeHTTP(w ResponseWriter, r *Request)
}
type ResponseWriter interface {
  WriteHeader(statusCode int)
  Header() Header
  Write([]byte) (int, error)
}
type Reader interface {
  Read(p []byte) (n int, err error)
}
type Writer interface {
  Write(p []byte) (n int, err error)
}

diagram source

7.2. Investigating Gin’s Core API

To summarize Gin’s API in a similar way, the gin.Engine gets http requests, routes them using its embedded gin.RouterGroup, and turns them into a *gin.Context, which contains a *http.Request and a gin.ResponseWriter, and hands them to one or more gin.HandlerFuncs, which modify the *gin.Context.

This doesn’t sound too bad - in fact, it sounds almost the same. Let’s take a look at the method summaries of these types to see what we’re dealing with here, starting with gin.Engine

7.2.1. *gin.Engine

    func (engine *Engine) Delims(left, right string) *Engine
    func (engine *Engine) HandleContext(c *Context)
    func (engine *Engine) Handler() http.Handler
    func (engine *Engine) LoadHTMLFS(fs http.FileSystem, patterns ...string)
    func (engine *Engine) LoadHTMLFiles(files ...string)
    func (engine *Engine) LoadHTMLGlob(pattern string)
    func (engine *Engine) NoMethod(handlers ...HandlerFunc)
    func (engine *Engine) NoRoute(handlers ...HandlerFunc)
    func (engine *Engine) Routes() (routes RoutesInfo)
    func (engine *Engine) Run(addr ...string) (err error)
    func (engine *Engine) RunFd(fd int) (err error)
    func (engine *Engine) RunListener(listener net.Listener) (err error)
    func (engine *Engine) RunQUIC(addr, certFile, keyFile string) (err error)
    func (engine *Engine) RunTLS(addr, certFile, keyFile string) (err error)
    func (engine *Engine) RunUnix(file string) (err error)
    func (engine *Engine) SecureJsonPrefix(prefix string) *Engine
    func (engine *Engine) ServeHTTP(w http.ResponseWriter, req *http.Request)
    func (engine *Engine) SetFuncMap(funcMap template.FuncMap)
    func (engine *Engine) SetHTMLTemplate(templ *template.Template)
    func (engine *Engine) SetTrustedProxies(trustedProxies []string) error
    func (engine *Engine) Use(middleware ...HandlerFunc) IRoutes
    func (engine *Engine) With(opts ...OptionFunc) *Engine

This is a mess. This seems to cover:

  • routing and middleware, like we’d expect (Delims, NoMethod, NoRoute, Use, Routes)
  • server configuration (SetTrustedProxies, RunTLS, RunQUIC, With)
  • and HTML templating (SetHTMLTemplate, SetFuncMap, LoadHTMLGlob, LoadHTMLFS, LoadHTMLFiles). That is, it combines the concerns of http.Server, http.ServeMux, and html/template (https://pkg.go.dev/html/template), not to mention entirely separate HTTP protocols like QUIC.

BTW, of the ten thousand configuration options here, none of them lets you select the http.Server to use, so good luck if you want to do things like set timeouts or do connection or packet-level configuration. Gin is hardcoded to use the default HTTP server. I think you can work around that by calling .Handler() and passing that to a *http.Server, but I’m not sure and it’s not covered by the documentation. Maybe it’s in With?
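
If .Handler() does behave the way its name suggests, a sketch of that workaround might look like this (the timeout values are just illustrative):

package main

import (
    "net/http"
    "time"

    "github.com/gin-gonic/gin"
)

func main() {
    engine := gin.Default()
    engine.GET("/ping", func(c *gin.Context) { c.String(http.StatusOK, "pong") })

    // Skip engine.Run() so we control the http.Server and its timeouts ourselves.
    srv := &http.Server{
        Addr:              ":8080",
        Handler:           engine.Handler(), // the engine itself also satisfies http.Handler via ServeHTTP
        ReadHeaderTimeout: 5 * time.Second,
        ReadTimeout:       10 * time.Second,
        WriteTimeout:      30 * time.Second,
    }
    srv.ListenAndServe()
}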

But that’s not all - like I mentioned earlier, the gin.Engine embeds a RouterGroup . That means that in addition to all of the above, it also exposes the following methods:

7.2.2. *gin.RouterGroup

 1    func (group *RouterGroup) Any(relativePath string, handlers ...HandlerFunc) IRoutes
 2    func (group *RouterGroup) BasePath() string
 3    func (group *RouterGroup) DELETE(relativePath string, handlers ...HandlerFunc) IRoutes
 4    func (group *RouterGroup) GET(relativePath string, handlers ...HandlerFunc) IRoutes
 5    func (group *RouterGroup) Group(relativePath string, handlers ...HandlerFunc) *RouterGroup
 6    func (group *RouterGroup) HEAD(relativePath string, handlers ...HandlerFunc) IRoutes
 7    func (group *RouterGroup) Handle(httpMethod, relativePath string, handlers ...HandlerFunc) IRoutes
 8    func (group *RouterGroup) Match(methods []string, relativePath string, handlers ...HandlerFunc) IRoutes
 9    func (group *RouterGroup) OPTIONS(relativePath string, handlers ...HandlerFunc) IRoutes
10    func (group *RouterGroup) PATCH(relativePath string, handlers ...HandlerFunc) IRoutes
11    func (group *RouterGroup) POST(relativePath string, handlers ...HandlerFunc) IRoutes
12    func (group *RouterGroup) PUT(relativePath string, handlers ...HandlerFunc) IRoutes
13    func (group *RouterGroup) Static(relativePath, root string) IRoutes
14    func (group *RouterGroup) StaticFS(relativePath string, fs http.FileSystem) IRoutes
15    func (group *RouterGroup) StaticFile(relativePath, filepath string) IRoutes
16    func (group *RouterGroup) StaticFileFS(relativePath, filepath string, fs http.FileSystem) IRoutes
17    func (group *RouterGroup) Use(middleware ...HandlerFunc) IRoutes

These methods cover routing - for some reason every HTTP verb gets its own method - and static file serving in four different ways. All of these - except Group - return an IRoutes interface, and nearly all take one or more HandlerFuncs .
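As a rough side-by-side sketch of what this buys you - paths and handlers are made up, and the stdlib patterns assume Go 1.22+ - the same routing fits comfortably in either API:

package routes

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

// Typical RouterGroup usage.
func registerGin(r *gin.Engine) {
	api := r.Group("/api/v1")
	api.GET("/users", func(c *gin.Context) { c.JSON(http.StatusOK, []string{"alice", "bob"}) })
	api.Static("/assets", "./public") // one of the four static-file helpers
}

// The stdlib equivalent since Go 1.22: the HTTP method lives in the pattern
// string, so there is no need for one registration method per verb.
func registerStd(mux *http.ServeMux) {
	mux.HandleFunc("GET /api/v1/users", func(w http.ResponseWriter, req *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`["alice","bob"]`))
	})
	mux.Handle("GET /assets/", http.StripPrefix("/assets/", http.FileServer(http.Dir("./public"))))
}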

7.2.3. gin.HandlerFunc and gin.Context

Ok, what’s a HandlerFunc ?

1type HandlerFunc func(c *gin.Context)

Finally, a small interface. Maybe this is the equivalent of a http.ResponseWriter ? Let’s look at the exported interface (fields and methods) of *gin.Context .

1type Context struct {
2    Request *http.Request
3    Writer  ResponseWriter // a gin.ResponseWriter, not an http.ResponseWriter
4    Params Params
5    Keys map[any]any
6    Errors errorMsgs
7    Accepted []string
8    // contains filtered or unexported fields
9}

So it contains the http.Request and a gin.ResponseWriter . What’s a gin.ResponseWriter ?

 1type ResponseWriter interface {
 2    http.ResponseWriter
 3    http.Hijacker
 4    http.Flusher
 5    http.CloseNotifier
 6    Status() int
 7    Size() int
 8    WriteString(string) (int, error)
 9    Written() bool
10    WriteHeaderNow()
11    Pusher() http.Pusher
12}

7.2.4. *gin.Context has more methods than Jerry Seinfeld has cars

The bigger the interface, the weaker the abstraction .

Rob Pike, “Go Proverbs” .

Ok, so stopping here, this already contains the entire API surface area of the net/http interfaces - Gin only adds complexity - as well as an extra five public fields and ten methods on those fields.

That’s not good, but the real horrors are yet to come: gin.Context ’s list of methods.

You might want to take a deep breath. You’re not ready for this.

7.2.5. A list of methods that must be seen to be believed

  1// 133 functions. Do you have them all memorized? I sure hope so. https://pkg.go.dev/github.com/gin-gonic/gin#Context
  2    func (c *Context) Abort()
  3    func (c *Context) AbortWithError(code int, err error) *Error
  4    func (c *Context) AbortWithStatus(code int)
  5    func (c *Context) AbortWithStatusJSON(code int, jsonObj any)
  6    func (c *Context) AbortWithStatusPureJSON(code int, jsonObj any)
  7    func (c *Context) AddParam(key, value string)
  8    func (c *Context) AsciiJSON(code int, obj any)
  9    func (c *Context) Bind(obj any) error
 10    func (c *Context) BindHeader(obj any) error
 11    func (c *Context) BindJSON(obj any) error
 12    func (c *Context) BindPlain(obj any) error
 13    func (c *Context) BindQuery(obj any) error
 14    func (c *Context) BindTOML(obj any) error
 15    func (c *Context) BindUri(obj any) error
 16    func (c *Context) BindWith(obj any, b binding.Binding) error // Deprecated
 17    func (c *Context) BindXML(obj any) error
 18    func (c *Context) BindYAML(obj any) error
 19    func (c *Context) ClientIP() string
 20    func (c *Context) ContentType() string
 21    func (c *Context) Cookie(name string) (string, error)
 22    func (c *Context) Copy() *Context
 23    func (c *Context) Data(code int, contentType string, data []byte)
 24    func (c *Context) DataFromReader(code int, contentLength int64, contentType string, reader io.Reader, ...)
 25    func (c *Context) Deadline() (deadline time.Time, ok bool)
 26    func (c *Context) DefaultPostForm(key, defaultValue string) string
 27    func (c *Context) DefaultQuery(key, defaultValue string) string
 28    func (c *Context) Done() <-chan struct{}
 29    func (c *Context) Err() error
 30    func (c *Context) Error(err error) *Error
 31    func (c *Context) File(filepath string)
 32    func (c *Context) FileAttachment(filepath, filename string)
 33    func (c *Context) FileFromFS(filepath string, fs http.FileSystem)
 34    func (c *Context) FormFile(name string) (*multipart.FileHeader, error)
 35    func (c *Context) FullPath() string
 36    func (c *Context) Get(key any) (value any, exists bool)
 37    func (c *Context) GetBool(key any) (b bool)
 38    func (c *Context) GetDuration(key any) (d time.Duration)
 39    func (c *Context) GetFloat32(key any) (f32 float32)
 40    func (c *Context) GetFloat32Slice(key any) (f32s []float32)
 41    func (c *Context) GetFloat64(key any) (f64 float64)
 42    func (c *Context) GetFloat64Slice(key any) (f64s []float64)
 43    func (c *Context) GetHeader(key string) string
 44    func (c *Context) GetInt(key any) (i int)
 45    func (c *Context) GetInt16(key any) (i16 int16)
 46    func (c *Context) GetInt16Slice(key any) (i16s []int16)
 47    func (c *Context) GetInt32(key any) (i32 int32)
 48    func (c *Context) GetInt32Slice(key any) (i32s []int32)
 49    func (c *Context) GetInt64(key any) (i64 int64)
 50    func (c *Context) GetInt64Slice(key any) (i64s []int64)
 51    func (c *Context) GetInt8(key any) (i8 int8)
 52    func (c *Context) GetInt8Slice(key any) (i8s []int8)
 53    func (c *Context) GetIntSlice(key any) (is []int)
 54    func (c *Context) GetPostForm(key string) (string, bool)
 55    func (c *Context) GetPostFormArray(key string) (values []string, ok bool)
 56    func (c *Context) GetPostFormMap(key string) (map[string]string, bool)
 57    func (c *Context) GetQuery(key string) (string, bool)
 58    func (c *Context) GetQueryArray(key string) (values []string, ok bool)
 59    func (c *Context) GetQueryMap(key string) (map[string]string, bool)
 60    func (c *Context) GetRawData() ([]byte, error)
 61    func (c *Context) GetString(key any) (s string)
 62    func (c *Context) GetStringMap(key any) (sm map[string]any)
 63    func (c *Context) GetStringMapString(key any) (sms map[string]string)
 64    func (c *Context) GetStringMapStringSlice(key any) (smss map[string][]string)
 65    func (c *Context) GetStringSlice(key any) (ss []string)
 66    func (c *Context) GetTime(key any) (t time.Time)
 67    func (c *Context) GetUint(key any) (ui uint)
 68    func (c *Context) GetUint16(key any) (ui16 uint16)
 69    func (c *Context) GetUint16Slice(key any) (ui16s []uint16)
 70    func (c *Context) GetUint32(key any) (ui32 uint32)
 71    func (c *Context) GetUint32Slice(key any) (ui32s []uint32)
 72    func (c *Context) GetUint64(key any) (ui64 uint64)
 73    func (c *Context) GetUint64Slice(key any) (ui64s []uint64)
 74    func (c *Context) GetUint8(key any) (ui8 uint8)
 75    func (c *Context) GetUint8Slice(key any) (ui8s []uint8)
 76    func (c *Context) GetUintSlice(key any) (uis []uint)
 77    func (c *Context) HTML(code int, name string, obj any)
 78    func (c *Context) Handler() HandlerFunc
 79    func (c *Context) HandlerName() string
 80    func (c *Context) HandlerNames() []string
 81    func (c *Context) Header(key, value string)
 82    func (c *Context) IndentedJSON(code int, obj any)
 83    func (c *Context) IsAborted() bool
 84    func (c *Context) IsWebsocket() bool
 85    func (c *Context) JSON(code int, obj any)
 86    func (c *Context) JSONP(code int, obj any)
 87    func (c *Context) MultipartForm() (*multipart.Form, error)
 88    func (c *Context) MustBindWith(obj any, b binding.Binding) error
 89    func (c *Context) MustGet(key any) any
 90    func (c *Context) Negotiate(code int, config Negotiate)
 91    func (c *Context) NegotiateFormat(offered ...string) string
 92    func (c *Context) Next()
 93    func (c *Context) Param(key string) string
 94    func (c *Context) PostForm(key string) (value string)
 95    func (c *Context) PostFormArray(key string) (values []string)
 96    func (c *Context) PostFormMap(key string) (dicts map[string]string)
 97    func (c *Context) ProtoBuf(code int, obj any)
 98    func (c *Context) PureJSON(code int, obj any)
 99    func (c *Context) Query(key string) (value string)
100    func (c *Context) QueryArray(key string) (values []string)
101    func (c *Context) QueryMap(key string) (dicts map[string]string)
102    func (c *Context) Redirect(code int, location string)
103    func (c *Context) RemoteIP() string
104    func (c *Context) Render(code int, r render.Render)
105    func (c *Context) SSEvent(name string, message any)
106    func (c *Context) SaveUploadedFile(file *multipart.FileHeader, dst string, perm ...fs.FileMode) error
107    func (c *Context) SecureJSON(code int, obj any)
108    func (c *Context) Set(key any, value any)
109    func (c *Context) SetAccepted(formats ...string)
110    func (c *Context) SetCookie(name, value string, maxAge int, path, domain string, secure, httpOnly bool)
111    func (c *Context) SetCookieData(cookie *http.Cookie)
112    func (c *Context) SetSameSite(samesite http.SameSite)
113    func (c *Context) ShouldBind(obj any) error
114    func (c *Context) ShouldBindBodyWith(obj any, bb binding.BindingBody) (err error)
115    func (c *Context) ShouldBindBodyWithJSON(obj any) error
116    func (c *Context) ShouldBindBodyWithPlain(obj any) error
117    func (c *Context) ShouldBindBodyWithTOML(obj any) error
118    func (c *Context) ShouldBindBodyWithXML(obj any) error
119    func (c *Context) ShouldBindBodyWithYAML(obj any) error
120    func (c *Context) ShouldBindHeader(obj any) error
121    func (c *Context) ShouldBindJSON(obj any) error
122    func (c *Context) ShouldBindPlain(obj any) error
123    func (c *Context) ShouldBindQuery(obj any) error
124    func (c *Context) ShouldBindTOML(obj any) error
125    func (c *Context) ShouldBindUri(obj any) error
126    func (c *Context) ShouldBindWith(obj any, b binding.Binding) error
127    func (c *Context) ShouldBindXML(obj any) error
128    func (c *Context) ShouldBindYAML(obj any) error
129    func (c *Context) Status(code int)
130    func (c *Context) Stream(step func(w io.Writer) bool) bool
131    func (c *Context) String(code int, format string, values ...any)
132    func (c *Context) TOML(code int, obj any)
133    func (c *Context) Value(key any) any
134    func (c *Context) XML(code int, obj any)
135    func (c *Context) YAML(code int, obj any)

This is a nightmare. Even a ‘simple’ Gin server that only receives and sends HTTP with JSON bodies over net/http is unavoidably linked to this enormous complexity.

7.2.6. What if we just want to write JSON?

Even if you “just” want to send and receive JSON, there are eleven different ways to do this as methods on gin.Context , all of which behave differently depending on build tags and magically invoke multiple layers of struct validation, and some of which depend on the configuration of your gin.Engine - not to mention .Writer.WriteString() and .Writer.Write() .

 1    func (c *Context) AbortWithStatusJSON(code int, jsonObj any)
 2    func (c *Context) AbortWithStatusPureJSON(code int, jsonObj any)
 3    func (c *Context) AsciiJSON(code int, obj any)
 4    func (c *Context) BindJSON(obj any) error
 5    func (c *Context) IndentedJSON(code int, obj any)
 6    func (c *Context) JSON(code int, obj any)
 7    func (c *Context) JSONP(code int, obj any)
 8    func (c *Context) PureJSON(code int, obj any)
 9    func (c *Context) SecureJSON(code int, obj any)
10    func (c *Context) ShouldBindBodyWithJSON(obj any) error
11    func (c *Context) ShouldBindJSON(obj any) error

To pick a single example, to know the behavior of SecureJSON at runtime, I need to know, among other things:

  • Which of the six JSON libraries was this built with? Am I sure my test environment has the same build tags as the deploy?
  • Did the gin.Engine that’s running this function - one that is not visible in the function signature of a HandlerFunc - set a SecureJsonPrefix ?
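To make that engine-level coupling concrete, here is a sketch of the two places you have to look to predict a single SecureJSON response; the route and payload are made up.

package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

func main() {
	r := gin.Default()

	// The behavior of every SecureJSON call below is configured here, on the
	// engine, far away from the handler that actually writes the response.
	r.SecureJsonPrefix("while(1);")

	r.GET("/ids", func(c *gin.Context) {
		// Per Gin's docs, the configured prefix is prepended to array payloads to
		// guard against JSON hijacking; nothing at this call site tells you that.
		c.SecureJSON(http.StatusOK, []int{1, 2, 3})
	})

	r.Run(":8080")
}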

Status headers are even more complex: there are 24 different ways to set a response header via methods on .Context or its fields, including:

  • Context.Status() (writes a status header)
  • Context.Writer.Status() (READS a previously written status header - sometimes)
  • Context.Writer.WriteHeader() (WRITES a status header, but not in a way where you can always retrieve the status header with .Writer.Status() - yes, I have run into this and I am salty)
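For contrast, net/http has exactly one way to write a status code and no way to read it back; when you need the value later (for logging, metrics, and so on), the usual idiom is to wrap the writer yourself. A generic sketch, not Gin code:

package sketch

import "net/http"

// statusRecorder remembers the status code written through it. This is the
// standard wrapping idiom, shown only for comparison with Gin's three APIs.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}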

As intimidating as these giant lists of methods are, it turns out that the vast majority of them are wrappers around the same core functionality. In fact, they’re wrappers around the exact same functionality as net/http.ResponseWriter . Let’s follow the ordinary .JSON() method down the chain and figure out what’s happening.

The .JSON() method calls c.Render() with a render.JSON value. Render() writes the status - by calling .Status() , which just wraps http.ResponseWriter.WriteHeader - then calls the renderer’s magic method WriteContentType , and finally render.Render() , which calls the exported function WriteJSON . That marshals the object using the magic exported global variable codec/json.API of type json.Core - which happens to be the conditionally-compiled empty struct codec/json.jsonapi - and writes the marshaled bytes to the http.ResponseWriter .

The magic exported global variable depends on your build tags. Usually, it’s the stdlib’s encoding/json .

That is, it’s

1b, _ := json.Marshal(obj)
2w.Write(b)

With a lot of extra steps in between.

Writing the content type header is similarly convoluted.

JSON() calls render.Render.WriteContentType(), which does a vtable lookup to find render.JSON.WriteContentType() , which calls the ordinary function writeContentType() , which does a vtable lookup to find .Header() on the response writer, then sets the header in an ordinary way.
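Put together, everything in this chain boils down to a helper you could write yourself in a few lines. This is a sketch of the equivalent plain net/http code, not Gin’s actual implementation:

package jsonutil

import (
	"encoding/json"
	"log"
	"net/http"
)

// writeJSON: set the content type, write the status, marshal the body.
func writeJSON(w http.ResponseWriter, status int, obj any) {
	w.Header().Set("Content-Type", "application/json; charset=utf-8")
	w.WriteHeader(status)
	if err := json.NewEncoder(w).Encode(obj); err != nil {
		log.Printf("encode response: %v", err) // status is already sent; just log
	}
}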

In case that all sounds a bit abstract - and it is - I’ve provided a handy chart for you.

7.2.7. A chart of just indirections in Gin’s JSON handling

diagram source

Nothing inside the box labeled ‘gin’ does anything at all useful.

And again, this is just ONE of the ELEVEN different ways to send JSON responses in Gin. Most of them go through similar contortions. All of them have their own structs for some ungodly reason. We haven’t even covered requests! (I meant to, but this article has taken me multiple full workdays already).

7.2.8. The worst of both worlds

This approach is godawful, somehow combining the worst of both runtime lookup (extra indirection and function calls) and conditional compilation. Both you and the compiler have to jump through multiple layers of indirection to figure out what is actually happening at runtime, for no benefit whatsoever. These extra layers serve merely to bloat the binary and confuse the programmer.

In the default case - the case for 99.5% of Gin’s consumers - you are doing the exact same thing as the standard library, but splitting the responsibility over a half dozen extra interfaces and types and hundreds of lines of code!

If you wanted to use a different JSON library, you could just… use that library!
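Under plain net/http, switching JSON libraries is an import change rather than a build tag. A sketch, using jsoniter purely as one example of a drop-in encoding/json replacement:

package jsonswap

import (
	"net/http"

	jsoniter "github.com/json-iterator/go"
)

// One line decides which JSON implementation the whole package uses.
var json = jsoniter.ConfigCompatibleWithStandardLibrary

func write(w http.ResponseWriter, obj any) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(obj)
}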

All gin does is obscure the control flow, inculcating a sense of helplessness in the programmer and causing cache misses at runtime for no benefit whatsoever.

7.2.9. Small nitpicks about render

  • Why the hell does render take a http.ResponseWriter and not just an io.Writer ? Is it supposed to do something different from writing the body (e.g., modifying the headers)?
  • On a similar note, why does WriteContentType take a whole http.ResponseWriter ? Is it supposed to modify the body? It should take an http.Header ! Or maybe be the slightly more sane interface { ContentType() string } - or better yet, not exist at all!

8. Gin’s documentation is very bad

Let’s keep this section short. Gin’s documentation is sparse at best. An illustrative example is gin.RouterGroup : despite its enormous API, its documentation is limited to a handful of sentences split between gin.RouterGroup itself and gin.RouterGroup.Handle .

8.0.1. gin.RouterGroup: example (bad) documentation

RouterGroup is used internally to configure router, a RouterGroup is associated with a prefix and an array of handlers (middleware).

Handle registers a new request handle and middleware with the given path and method. The last handler should be the real handler, the other ones should be middleware that can and should be shared among different routes. See the example code in GitHub. (Note: no link is provided!)

For GET, POST, PUT, PATCH and DELETE requests the respective shortcut functions can be used.
This function is intended for bulk loading and to allow the usage of less frequently used, non-standardized or custom methods (e.g. for internal communication with a proxy).


8.0.2. net/http.ServeMux : example (good) documentation

On the other hand, http.ServeMux ’s documentation is nearly a thousand words, not counting in-documentation examples , split into five sections: Patterns, Precedence, Trailing-slash redirection, Request sanitizing, and Compatibility. I encourage you to click on the two above links and take a look for yourself.

9. The Spider’s Web

None of this is the worst part of Gin. The worst part is this: going from a http.Handler to a gin handler is trivial. You can write an adapter to go FROM the standard library TO gin in a single line.

1func adaptHandler(h http.Handler) gin.HandlerFunc { return func(c *gin.Context) { h.ServeHTTP(c.Writer, c.Request) } }

Going from a Gin handler to an ordinary http.Handler is functionally impossible - the only practical way to do it is to dig into the code, figure out what it’s actually trying to do, and rip out all of the indirection.

If you’re still early enough in your software project, this is practical - if you’re months or years deep into a legacy codebase, you don’t have a chance in hell.

If a single person on your team gets the bright idea to use Gin, you’re more or less stuck. You can work around it, but it will be lurking at the bottom of your server, a giant chain of dependencies that you can never really get rid of.

This, I think, is the secret to Gin’s success. It’s attractive enough and popular enough to attract the trendhoppers and the naive, and tolerable enough for them to stick with it long enough to get stuck, and, like restaurants, most people use software because other people are already using it . Worse yet, because it’s so difficult and painful to move away, users of Gin draw the wrong conclusion that this is because other libraries are hard , and they sing the praises of their jailers. Maybe flies on the web do the same thing.

10. Conclusion and Advice on Software Dependencies

Gin is a bad software library and we as developers should stop using things like it. The purpose of this essay is not really to talk about Gin - it’s to use it as an illustrative example of what is bad in software libraries, rather than what is good.

The choice of what library, if any, to use is an engineering decision, not just a matter of opinion. It has concrete effects on the process of writing code and the resulting programs. While taste is part of the decision, it should not be the primary or only one. Gin and libraries like it will make your software worse. Stop using them.

I’ll finish off with some advice on picking dependencies:

  • Figure out what the problem is before you reach for a solution.
  • The size of a solution should be proportional to the size of the problem.
  • READ THE CODE AND DOCUMENTATION FOR YOURSELF .
  • The cost of a library is the cost of the library and its dependencies , not just the parts you can see.
  • All things being equal, choose the library with fewer features.

10.1. What if I already use Gin?

If you’re not in too deep, try to rip it out. If it’s already spread deep into your codebase, the best you can do is probably containment.

  • Make a policy to allow no new Gin .
  • Use ordinary net/http handlers instead where possible for any future work, even if there’s still Gin there (see the sketch after this list).
  • If and when you split off services, force the point of split to leave Gin behind.
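One concrete way to follow the second point is Gin’s own adapter helpers, gin.WrapF and gin.WrapH , which let new handlers be written against plain net/http even while Gin still owns the router. A minimal sketch; the route name is made up.

package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

// A plain net/http handler: new work written against the standard library only.
func healthz(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("ok"))
}

func main() {
	r := gin.Default() // the legacy engine you can't remove yet

	// gin.WrapF adapts an http.HandlerFunc into a gin.HandlerFunc, so new
	// handlers stay framework-free even while Gin still owns the router.
	r.GET("/healthz", gin.WrapF(healthz))

	r.Run(":8080")
}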


What Activists Can Learn from Rosa Parks on the 70th Anniversary of Montgomery Bus Boycott

Democracy Now!
www.democracynow.org
2025-12-10 13:51:56
What are the lessons from the Montgomery bus boycott launched 70 years ago this month? The boycott, which sparked the civil rights movement, began after the arrest of Rosa Parks for refusing to give up her seat on a segregated city bus to a white man. Historian and biographer Jeanne Theoharis, autho...
Original Article


What are the lessons from the Montgomery bus boycott launched 70 years ago this month? The boycott, which sparked the civil rights movement, began after the arrest of Rosa Parks for refusing to give up her seat on a segregated city bus to a white man. Historian and biographer Jeanne Theoharis, author of The Rebellious Life of Mrs. Rosa Parks , argues, “Part of what her courage is, is the ability to step forward again and again, without any sense that this is going to change anything, and say, 'This is the line. And I refuse.'” Theoharis’s new piece for The Guardian is “What we get wrong about the Montgomery bus boycott — and what we can learn from it.”


Please check back later for full transcript.

The original content of this program is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License . Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.


Despite Judge's Order, ICE Deports Shackled Babson College Freshman, Harasses Her Family in Texas

Democracy Now!
www.democracynow.org
2025-12-10 13:45:12
Nineteen-year-old Any Lucía López Belloza was detained and deported, despite a lack of removal order, when attempting to head home from Babson College in Boston to surprise her family in Texas for Thanksgiving. “This is the first arrest of its kind I’ve seen,” says her attorney, To...
Original Article


Nineteen-year-old Any Lucía López Belloza was detained and deported, despite a lack of removal order, when attempting to head home from Babson College in Boston to surprise her family in Texas for Thanksgiving. “This is the first arrest of its kind I’ve seen,” says her attorney, Todd C. Pomerleau, who says the student has been the victim of “character assassination.” After López Belloza “was taken down near the border on a bus, had shackles around her ankles, chain around her waist, shackles around her wrist,” her family attempted to speak out to the press about the rights violations she suffered. They are now being harassed by law enforcement, as well.



Guests
  • Todd C. Pomerleau

    immigration attorney representing Any Lucía López Belloza and Bruna Caroline Ferreira.

Please check back later for full transcript.

The original content of this program is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License . Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.


New benchmark shows top LLMs struggle in real mental health care

Hacker News
swordhealth.com
2025-12-10 13:39:16
Comments...
Original Article

The global demand for mental health support has never been higher, with over one billion people currently living with mental health conditions. As healthcare providers look for solutions to bridge the gap between demand and access, Large Language Models (LLMs) offer a promising avenue for scalable support.

At Sword Health, we have been working to realize this promise by developing our own LLMs specifically aligned for mental health care. However, from the beginning of our development journey, we encountered a critical obstacle: we could not improve what we could not accurately measure.

While we could train models to be helpful, answering the fundamental question – can we trust this model to provide safe, effective therapeutic care? – remained elusive. We realized that relying on existing evaluations wasn't enough to guide the development of truly clinical-grade AI. To solve our own development needs, we had to build a new yardstick.

Today, we are introducing MindEval , a novel framework designed in collaboration with licensed Clinical Psychologists to evaluate LLMs in realistic, multi-turn mental health conversations. By automating the assessment of clinical skills, MindEval allows us to move beyond basic checks and measure actual therapeutic competence.

We believe that safety in healthcare AI should not be a proprietary secret, but a shared foundation. To accelerate the industry’s progress toward clinically safe AI, we are open-sourcing the entire MindEval framework including our expert-designed prompts, code, and evaluation datasets. Our goal is for MindEval to serve as a community-driven standard, giving developers and researchers a reliable yardstick to measure and improve the mental health capabilities of future models.

The problem: moving beyond "book knowledge"

The deployment of AI in mental health is currently outpacing our ability to evaluate it. As the industry faces rising concerns about the safety of therapeutic chatbots, a core obstacle to creating safer systems is the scarcity of benchmarks that capture the complexity of real therapy.

Current AI systems present significant limitations in therapeutic settings, often defaulting to sycophancy (excessive eagerness to please) or over-reassurance, which can inadvertently reinforce maladaptive beliefs. Yet, most existing benchmarks fail to catch these nuances because they assess models through multiple-choice questions that test clinical knowledge, or by evaluating single responses in isolation.

We found that current evaluation methods fall short in three key areas:

  • Knowledge vs. competence: While an AI might know the textbook definition of depression, that does not guarantee it has the clinical aptitude across domains, such as clinical accuracy, ethical and professional decision making, rapport building, among others.
  • Static vs. dynamic: Therapy is longitudinal. Existing benchmarks typically look at static snapshots, missing the critical dynamics that happen over a multi-turn session.
  • Vibes vs. validation: Without rigorous, expert-derived rubrics, safety checks often rely on subjective "vibe checks." We believe that to build safe AI for healthcare, we must move beyond "vibes" and into rigorous, clinically grounded evaluation.

The MindEval framework

MindEval is a fully automated, model-agnostic framework that evaluates therapy sessions dynamically. As illustrated below, the framework relies on the interaction between specific components to simulate a full therapeutic session.

The framework, illustrated in Figure 1, consists of three primary agents:

  1. The Patient LLM (PLM): This model is prompted with a highly detailed profile and backstory to simulate a patient. It mimics a real person engaging in a multi-turn conversation, maintaining consistency in personality and symptoms throughout the interaction.
  2. The Clinician LLM (CLM): This is the model being evaluated (e.g., GPT-5, Claude 4.5). It interacts with the patient, attempting to provide therapeutic support.
  3. The Judge LLM (JLM): Once the interaction is complete, a separate "judge" model evaluates the interaction.

Crucially, the Judge LLM does not simply give a binary thumbs up or down. It scores the entire interaction on 5 core criteria grounded in clinical supervision guidelines from the APA:

  • Clinical Accuracy & Competence (CAC)
  • Ethical & Professional Conduct (EPC)
  • Assessment & Response (AR)
  • Therapeutic Relationship & Alliance (TRA)
  • AI-Specific Communication Quality (ASQC)

In Table 1 we show the score range for Clinical Accuracy & Competence. Each criterion follows a similar scale, with scores between 3 and 4 representing average but acceptable performance.

Validating the framework: realism and accuracy

Before benchmarking other models, we first had to show that MindEval itself yielded reliable interactions and judgments. We focused on two key areas to validate MindEval: Patient Realism and Judge Quality .

To validate Patient Realism , we quantitatively measured the similarity between the text produced by our simulated patients (PLM) and text generated by humans performing the same role-play task. Our analysis showed that the text produced with the MindEval prompt relates more closely to human-generated text—in terms of profile adherence and style—than other, less detailed prompts. Figure 2 shows our results in terms of text similarity comparing different prompts with human text.

To validate Judge Quality , we compared the outputs of our automated judge (JLM) to those of a panel of human experts. Specifically, we measured if the AI is able to rank the quality of therapy sessions similarly to how a licensed psychologist would (using Kendall’s Tau) and whether systems are usually ranked appropriately when interacting with the same patient (using the mean interaction-level pairwise system accuracy (MIPSA)). Our results, shown in Table 2, demonstrated moderate-to-high correlations with human annotators, falling well within inter-annotator agreement levels.

Benchmark results: how do state-of-the-art models perform?

Having established the validity of our methodology, we benchmarked 12 state-of-the-art LLMs, including but not limited to GPT-5, Claude 4.5 Sonnet, and Gemini 2.5 Pro. In our article we show detailed results per system but overall, across all categories, models struggled. Figure 3 shows the average results with min and max score per category and in different scenarios ranging from severe symptoms to longer conversations with 40 turns.

Our findings revealed significant gaps in current AI capabilities:

  • Room for improvement: On a clinical quality scale of 1 to 6, the average score across all models was below 4.
  • Bigger is not always better: Counter-intuitively, we found that reasoning capabilities and massive model scale do not guarantee better performance in a therapeutic context. For example, some smaller models outperformed larger reasoning models in specific communication qualities. Being good at math or coding does not translate directly to being good at mental health support.
  • Critical weaknesses in difficult scenarios: Reliability is paramount in healthcare, yet we found that model performance deteriorated when supporting patients with severe symptoms. Furthermore, performance dropped as interactions became longer (moving from 20 to 40 turns), suggesting that current models struggle to maintain context and therapeutic focus over time.

Conclusion

We believe that to build safe AI for healthcare , we must measure what matters. MindEval moves the industry beyond "vibes" and into rigorous, clinically grounded evaluation. While current models show promise, our results indicate there is much room for improvement to make these systems reliable for patients across the entire spectrum of mental health needs.

Despite their impressive capabilities in code and reasoning, every frontier model we tested failed to meet the threshold for clinical reliability, scoring below 4 out of 6 on average. Our data shows that models trained for general helpfulness often struggle with the specific, high-stakes nuance of therapeutic care, particularly when patients present with severe symptoms. This is not a problem that can be solved simply by making models larger; it requires a fundamental shift in how we align and evaluate AI for care.

To encourage transparency and help the industry close this gap, we are releasing all code, prompts, and human evaluation data to the public.

Trump Spokesperson Karoline Leavitt's Nephew's Mother Released from ICE Jail, Faces Deportation

Democracy Now!
www.democracynow.org
2025-12-10 13:37:17
Bruna Ferreira, a DACA recipient and mother of White House Press Secretary Karoline Leavitt’s nephew, has lived in the United States since she was 6 years old, but was recently arrested by ICE in her own driveway in what her attorney, Todd Pomerleau, calls a “brazen, unconstitutional arr...
Original Article

Image Credit: TMZ; GoFundMe

Bruna Ferreira, a DACA recipient and mother of White House Press Secretary Karoline Leavitt’s nephew, has lived in the United States since she was 6 years old, but was recently arrested by ICE in her own driveway in what her attorney, Todd Pomerleau, calls a “brazen, unconstitutional arrest, a clear violation of her rights.” Ferreira was transported to a remote detention center in Louisiana following her arrest in Massachusetts, and just released Tuesday. “All of a sudden, now the Leavitts have a problem with 'criminal illegal aliens.' Yet one of them was about to marry one of their loved ones, and there was no problem,” says Pomerleau.



Guests
  • Todd C. Pomerleau

    immigration attorney representing Any Lucía López Belloza and Bruna Caroline Ferreira.

Please check back later for full transcript.

The original content of this program is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License . Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.

Apple Faces Scrutiny as Sanctioned Entities Slip Through App Store Controls

Hacker News
www.washingtonpost.com
2025-12-10 13:36:52
Comments...

"Torture & Enforced Disappearances" at Florida's ICE Jails "Alligator Alcatraz" & Krome

Democracy Now!
www.democracynow.org
2025-12-10 13:30:30
Lights on 24/7. Overflowing toilets and lack of access to showers. Solitary confinement in a 2×2-foot box. These are some of the torturous conditions documented in a new report from Amnesty International investigating human rights violations at two ICE detention centers in Florida: the Krome No...
Original Article

This is a rush transcript. Copy may not be in its final form.

AMY GOODMAN : This is Democracy Now! , democracynow.org. I’m Amy Goodman.

We look now at how Amnesty International says immigrants held at the ICE jail in Florida dubbed “Alligator Alcatraz” were shackled inside a two-foot-high metal cage and left outside without water for up to a day at a time. In a new report , they also detail unsanitary conditions, lights on 24 hours a day, poor-quality food and water and lack of privacy. The report is titled “Torture and Enforced Disappearances in the Sunshine State: Human Rights Violations at 'Alligator Alcatraz' and Krome in Florida.”

We’re joined now by the lead researcher, Amy Fischer, director of refugee and migrant rights at Amnesty International USA .

Amy, thanks for being with us. Thanks for joining us from Bentonville. Can you describe what you found?

AMY FISCHER : Sure. So, myself and colleagues from Amnesty International went to Florida in September, and we were able to have a tour of the Krome detention facility, and we were also able to speak to a number of individuals detained inside, who had also been detained inside of “Alligator Alcatraz.” And really, what we heard about both facilities were harrowing stories of human rights violations, cruel conditions, abuse and, in some cases, treatment that amounts to torture under international law.

AMY GOODMAN : So, explain — can you describe situations you found at what the Republicans have dubbed Alligator Alcatraz?

AMY FISCHER : Of course. So, what we found and heard about Alligator Alcatraz is that people are housed in cages that hold about 30 people, and there’s about three toilets per cage. The weather, the environment is very severe.

And one of the things that we heard that was so concerning was the use of something called “the box,” which was described to us as the type of cage that you put lions in at the zoo. And people are placed in this box as a form of punishment. It is a two-by-two-foot box where people are shackled at their wrists and at their feet and chained to the ground in the hot Florida sun for hours upon hours at a time, without food, without water, as a form of punishment. We heard that there was an incident in which somebody in one of the cages was having a medical emergency, and other people inside were calling for help for this individual, and those that were seeking help were placed in the box as a form of punishment. And after hearing about this, Amnesty made the determination that putting people in the box and the use of this box amounts to torture under international law.

AMY GOODMAN : So, what can be done? What are you demanding, Amy Fischer?

AMY FISCHER : We are calling for the shutdown of Alligator Alcatraz, as well as Krome, as well as any other cruel detention center across the United States. What we are really seeing is an intentional development within immigration detention that is aiming to make it increasingly cruel, increasingly abusive, so that people are forced to give up their immigration claims, give up their asylum claims, because the conditions are so cruel that they can’t handle it anymore. And what we know is that there are alternatives to immigration detention that are cheaper, that respect human rights and actually lift up all communities. And so, what we need to do is shut down these facilities and instead invest in an immigration system that works for all of us.

AMY GOODMAN : I wanted to go to a vigil outside of Alligator Alcatraz. Sonia Bichara spoke about the conditions that her fiancé, Rafael Collado, faced inside the jail.

SONIA BICHARA : [translated] He has been there for a month and three days, and he has told me that the conditions inside are deplorable. The food is terrible. They keep the lights on all the time to keep them awake. They say they are tired of seeing each other’s faces. They are given five minutes to eat and only a small cup of water with their meal. If they stand up, they are beaten. … They turn the identifications around so they can’t see the names. Some tell them the time. Others don’t, because they don’t know what time it is. They only know when they call home. My fiancé asks me what time it is, what day it is today. And it breaks my heart when he asks me that question.

AMY GOODMAN : In response to the Amnesty report , Republican Governor Ron DeSantis of Florida issued a statement that read, quote, “This 'report' is nothing more than a politically motivated attack. None of these fabrications are true. In fact, running these allegations without any evidence whatsoever could jeopardize the safety and security of our staff and those being housed at Alligator Alcatraz.” Amy Fischer, your response?

AMY FISCHER : First of all, if the governor would like to see some evidence, I encourage him and his office to read the report, where that evidence is presented. But more than anything, if Governor DeSantis is really concerned about the care and safety of those in custody, then he should shut down Alligator Alcatraz and allow these people to return to their communities, where they belong.

AMY GOODMAN : Amy Fischer, I want to thank you so much for being with us and ask one last question about Krome. I remember years ago when people were marching on Krome because of Haitians who were fleeing violence in Haiti were being placed there and the conditions that they faced. If you can briefly summarize what you found there?

AMY FISCHER : What we found at Krome was very similar to what other people have been reporting for years, horrific conditions. You know, one of the things that most impacted me at the time in Krome is that we were speaking to somebody in solitary confinement who was showing us an injured hand through the slot on the door, and an ICE agent slammed the metal flap of the solitary confinement door on this man’s injured hand and then punched it repeatedly. And that was such a show of force, a show of violence, and it happened in front of human rights monitors. And so, we know the conditions are horrible, and when we see that type of activity in front of human rights monitors, we can only imagine the type of cruelty that is going on on a day-to-day basis behind closed doors.

AMY GOODMAN : I want to thank you so much for being with us, Amy. Amy Fischer, lead researcher on the new Amnesty International report , “Torture and Enforced Disappearances in the Sunshine State.” We’ll link to it at democracynow.org. Thanks for joining us from Bentonville, Arkansas.

The original content of this program is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License . Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.

ICE is tracking pregnant women all the way to the delivery room: ‘She was so afraid they would take her baby’

Guardian
www.theguardian.com
2025-12-10 13:15:55
Pregnant immigrants in ICE monitoring programs are avoiding care, fearing detention during labour and delivery In early September, a woman, nine months pregnant, walked into the emergency obstetrics unit of a Colorado hospital. Though the labor and delivery staff caring for her expected her to have ...
Original Article

In early September, a woman, nine months pregnant, walked into the emergency obstetrics unit of a Colorado hospital. Though the labor and delivery staff caring for her expected her to have a smooth delivery, her case presented complications almost immediately.

The woman, who was born in central Asia, checked into the hospital with a smart watch on her wrist, said two hospital workers who cared for her during her labor, and whom the Guardian is not identifying to avoid exposing their hospital or patients to retaliation.

The device was not an ordinary smart watch made by Apple or Samsung, but a special type that US Immigration and Custom Enforcement (ICE) had mandated the woman wear at all times, allowing the agency to track her. The device was beeping when she entered the hospital, indicating she needed to charge it, and she worried that if the battery died, ICE agents would think she was trying to disappear, the hospital workers recalled. She told them that, just days earlier, she had been put on a deportation flight to Mexico, but the pilot refused to let her fly because she was so close to giving birth.

The woman’s fear only grew from there, according to the hospital workers. Her delivery wasn’t progressing the way the care team hoped, and she needed a C-section, a procedure that requires doctors to use a cauterizing tool to minimize bleeding. To prevent possible burning and electrocution, patients are instructed to take off all jewelry or metals before the surgery. The mandatory watch had no way to be easily removed, nor was information about whether it would be safe to wear during the procedure readily available. Hospital staff didn’t know how to contact ICE to ask what to do. When hospital staff told the woman they might have to cut the smart watch off, she panicked, the workers said.

Staff eventually did remove the device, and ICE agents did not show up at the hospital during the delivery. The nurses said they do not know what happened to the woman after she left the hospital with her baby.

The woman was one of three pregnant patients wearing a location-tracking smart watch whom these two workers encountered in their ER in the last few months, they said.

BI Inc and alternative to detention

The watches are built and operated by BI Inc, a company specializing in monitoring tech that runs the US government’s largest immigrant surveillance operation. The program, Alternative to Detention (ATD), allows select immigrants to await their day in court at home rather than in detention, provided they subscribe to intense monitoring.

When immigrants are enrolled in ATD, they are assigned one or more types of supervision. Some have to wear an ankle monitor, some a smart watch. Some are required to complete regularly scheduled facial recognition scans at their home using a BI Inc app, others are mandated to go into a BI Inc or ICE office for regular in-person check-ins.

The smart watch, officially called the VeriWatch, was introduced two years ago by BI Inc. It was first piloted under the Biden administration and framed as a more discrete alternative to the less digitally equipped ankle monitor, which BI also manufactures and supplies to ICE. As the Guardian previously reported , immigrants wearing the ankle monitors have complained about the stigma that comes with wearing the conspicuous device as well as physical pain caused by the monitors, including electric shocks and cuts from devices that are strapped on too tightly.

Nearly 200,000 people are currently enrolled in the program, and many of them have become increasingly fearful of being considered out of compliance as the Trump administration works to deport immigrants en masse. There have been several cases of people in the program showing up to a mandated, regular in-person check-in with immigration officials, believing they will continue in the ATD program, only to be detained.

All three women encountered by the Colorado hospital staff were reluctant to take their monitors off, fearing that doing so would trigger an alert to ICE or BI Inc, the staff said, even if removing the device was deemed medically necessary.

One of the women went into the ER for a C-section and was diagnosed with preeclampsia, a complication that can cause significant swelling. Staff were worried her smartwatch would cut off her circulation.

“She was in tears about it. She had this deep fear that ICE was going to come to the hospital and take her baby,” one of the staff said. The hospital worker’s shift ended before the patient underwent the C-section. They said they do not know whether the staff who took over the patient’s case convinced her to cut off the watch.

The confusion and fear surrounding the wrist monitor caused delays in the hospital’s ability to provide adequate and necessary care for these women, the workers said, though the patients delivered their babies safely.

“Waiting and trying to figure these things out even when things are not super emergent can cause something emergent to happen,” one of the workers said. “Sometimes in birth, doing a C-section 20 minutes before something bad happens can prevent it.”

The workers pointed out that when they treat patients wearing a monitor issued by the state Department of Corrections, there is a protocol in place to remove it ahead of medical procedures.

Trump’s chaotic crackdown

Hospital staff from across the US who spoke to the Guardian say the confusion brought on by monitoring devices is just one of several ways Donald Trump’s immigration crackdown is affecting medical care, and comes as immigrant patients are increasingly fearful of seeking out treatment.

One of the staff at the Colorado hospital said she’s had at least three pregnant patients show up for their first-ever prenatal appointment at any time between 34 and 38 weeks – well into their third trimester and long after pregnant women are recommended to begin going to consistent doctor appointments.

In California , hospital workers have also noticed a drop this year in immigrants seeking not just emergency care but also regular doctor visits and vaccinations, according to the California Nurses Association president, Sandy Reding.

“Obviously it has a cascading effect,” Reding said. “If you don’t see your doctor regularly then the outcomes are worse and you wait until you have a crisis to go to the ER.”

In Chicago, CommunityHealth, one of the largest volunteer-based health centers in the country, documented an overall drop in visits per patient and patient retention between 2024 and 2025 due to immigration enforcement activity in the city. In June, the organization observed a 30% dip in patients showing up for their appointments and around a 40% drop in patients picking up their medication since Trump took office.

Neither ICE nor BI Inc responded to requests for comment. ICE previously told the Guardian that there is no evidence the ankle monitors have caused physical harm and that the ATD program was effective at increasing court appearance rates among immigrants facing removal.

skip past newsletter promotion

Vague procedures, concrete problems

The lack of procedure to have ankle or wrist monitors removed in medical emergencies has affected more than just pregnant women. In one July 2025 case, ICE responded to a man’s request to remove his ankle monitor because of a medical issue by detaining him, according to a court petition filed on his behalf by immigrant rights group Amica, which the Guardian reviewed.

The man came to the US from Bangladesh to seek political asylum, and was told he had to wear an ankle monitor while his claim was pending. Suffering from nerve damage in one leg, he obtained a note from a medical clinic requesting the monitor be removed. His lawyer sent the note to the ICE officer on the case but never heard back. During his first check-in at the BI offices, the man brought the medical note to the BI Inc employee assigned to the case, who suggested the man might be able to move the ankle monitor to his other leg. But after the man’s lawyer called ICE to inquire about moving the ankle monitor, the BI case manager informed the man that ICE officers were coming to the BI office to speak with him. They arrested and detained the man, according to the petition.

“He explained that he was just asking for the ankle monitor to be put on the other leg, and the officer told him it was ‘too late’,” the petition reads.

In 2009, ICE discontinued the use of ankle monitors for pregnant women and people whose medical conditions made it “inappropriate” to wear them. But former BI Inc staff as well as immigrants rights groups Amica and American Friends Services Committee said they are concerned that these exceptions are not always enforced. That exception also doesn’t apply to smart watches, a June 2025 ICE memo shows.

The ICE memo instructs agency staffers to put ankle monitors on anyone enrolled in ATD. Dawnisha M Helland, an ICE acting assistant director in the management of non-detained immigrants, wrote that the only group who would not be given ankle monitors were pregnant women. Instead, pregnant women in ATD would wear the smart watch .

Though it resembles a typical consumer smart watch, the VeriWatch is not less restrictive than the ankle monitor. Like the ankle monitor, the wrist watch can’t be removed by the person wearing it. ICE had the option of using a removable version of the watch, according to a 2023 request for information DHS published. The agency chose a different direction; it currently only uses a watch that cannot be removed except by an ICE or authorized BI agent, according to two former DHS officials and two former BI employees.

Immigrants in the program are not told what to do with their ankle or wrist monitors in case of medical emergencies, and BI staff were not authorized to approve the removal of the monitors without first speaking to ICE, the two former BI Inc. staff recalled.

There’s not always time in emergency cases to wait for approval from ICE to cut off the monitors, the Colorado hospital workers said. One of the Colorado staff said they’re deeply concerned about how this unremovable watch will continue to impact vulnerable pregnant women.

“They’re looking at people who literally can’t speak up, who have no legal resources, who are not American citizens, and are pregnant. They’re asking themselves what they can get away with in terms of violating civil liberties for these patients,” the employee said. “That’s the true pilot program: How far can they overreach?”

Internal alarm

Healthcare workers are not the only ones sounding the alarm over surveillance’s interference with medical care. Two former Department of Homeland Security officials told the Guardian that the lack of protocols for immigrants surveilled under ATD with exigent medical issues is a symptom of a larger issue with the way BI Inc and ICE run the program. As the Guardian previously reported , immigrants surveilled under ATD and BI Inc employees alike have long complained that the program is highly discretionary. They said that many of the decisions about how, why or how long a given person was mandated to wear an ankle monitor or a smart watch were left to individual case workers.

BI Inc, which started off as a cattle monitoring company, and its parent company the Geo Group, which develops detention centers, private prisons, and rehabilitation facilities, have been given the exclusive DHS contract to operate all aspects of the ATD program since its inception in 2004. That’s despite previous attempts by ICE leadership under Joe Biden’s administration to break the contract up into three parts rather than awarding the entirety of the contract to Geo Group, a company that has served as a landing spot for former ICE and DHS officials.

At its peak, BI Inc monitored approximately 370,000 immigrants under the Biden administration as part of a policy that put every head of household crossing the border on ATD. The tally decreased in 2025 to about 180,000 people, due in part to high costs of putting so many people on ATD, former DHS officials said. As Trump’s second administration supercharged immigration enforcement and greenlit a $150bn surge in funding for ICE, though, Geo Group executives expressed confidence they could reach the same height by the second half of 2025. The goal, the executives have said, is to monitor all 7.5 million people listed on the federal government’s non-detained docket, the list of non-citizens who have not been detained but are subject to removal.

However, the Trump administration has focused on deportation and detention rather than monitoring, and the number of immigrants enrolled in ATD and wearing ankle monitors or other GPS tracking devices has hovered around 180,000, much to the dismay of Geo Group executives.

“Now the count has been fairly stable, which is a little disappointing, obviously,” George Zoley, the GEO Group founder and executive chairman of the board, said during the company’s November earnings call .

ICE awarded another two-year contract to BI Inc to manage ATD in September. Executives have said they’re pleased that the agency is prioritizing using the company’s more expensive ankle monitors on those immigrants already in ATD rather than the more cost-effective tools like the company’s facial recognition app, Smart Link.

Under the Biden administration, several departments within DHS attempted to address the lack of consistent policy around how ICE should run ATD. In December 2022, DHS hosted 100 non-governmental organizations as well as members of academia and private industry to discuss how to bring more “uniform standards to govern” ATD. That two-year effort to draft guidelines in a document, initially titled Non-Detained Management Standards, was ultimately scuttled by ICE and BI, said Scott Shuchart, a former assistant director for regulatory affairs and policy at ICE under the Biden administration. Another former DHS official confirmed his account. The draft standards were never made public.

“The program is really structured for the benefit of BI and not for the benefit of the non-citizens who were going to be managed through it,” said Shuchart. “Therefore ERO [ICE’s enforcement and removal arm] was extremely resistant to bring rationalization and consistent policy into it.”

Will the International Community Act? Preschool Massacre & "Large Piles of Bodies" in Sudan

Democracy Now!
www.democracynow.org
2025-12-10 13:13:06
The world’s largest conflict by scale is in Sudan, where tens of thousands have been killed and millions displaced since fighting broke out between the UAE-backed paramilitary Rapid Support Forces (RSF) and the Sudanese military (SAF) in April 2023. Last week, the RSF attacked a kindergarten, ...
Original Article

This is a rush transcript. Copy may not be in its final form.

AMY GOODMAN : We begin today’s show in Sudan, where fighting continues after the UAE -backed Rapid Support Forces attacked a preschool last year [ sic ], a hospital and other sites in the state of South Kordofan, killing at least 116 people, including 46 children. This happened last week. Reports from the WHO say parents and caregivers rushed the wounded to a nearby hospital, even as the attack was ongoing. Paramedics and responders were also reportedly attacked.

Since fighting between the RSF and the Sudanese military broke out in April 2023, an estimated 150,000 people have been killed and at least 12 million displaced. Aid groups say the true death toll is likely far higher. Hundreds of thousands also face famine.

On Monday, the RSF seized the Heglig oil field, the country’s largest. This follows other RSF advances, including the seizure of El Fasher, Darfur’s largest city, in October. In this clip from Amnesty International, a survivor describes what happened when she fled El Fasher with her five children and was stopped by three armed men.

SURVIVOR : [translated] One of them forced me to go with them, cut my robe and raped me. When they left, my 14-year-old daughter came to me. I found that her clothes had blood on them and were cut into pieces. Her hair at the back of her head was full of dust. She came to me and said, “Mum, they raped me, too, but do not tell anyone.” After the rape, my daughter became really sick. When we reached Tawila, her health deteriorated, and she died at the clinic.

AMY GOODMAN : Also this week, on Tuesday, the United States Treasury announced sanctions against four people and four entities accused of recruiting Colombian mercenaries to fight alongside the RSF in Sudan.

To discuss all this and more, we’re joined by two guests. Nathaniel Raymond is executive director of the Humanitarian Research Lab at the Yale School of Public Health. The lab has been monitoring El Fasher. He’s joining us from New Haven. And here in New York, we’re joined by Kholood Khair, a Sudanese political analyst, head of the Confluence Advisory, a think tank founded in Khartoum.

Welcome to Democracy Now! Kholood, let’s begin with you. We just heard this horrific story, and this follows last week’s attack on the kindergarten and a hospital. At least 46 children were killed. For people who are not following what is happening in Sudan, can you explain why these warring parties are still fighting after two years?

KHOLOOD KHAIR : Sure. Well, this war started because the Sudanese Armed Forces, the national army, and the Rapid Support Forces, a really powerful paramilitary group, fell out of favor with each other. They were once very much allied. They committed the genocide together in Darfur 20 years ago. They led a coup against a civilian cabinet two-and-a-half years — more than two-and-a-half years ago, in 2021. And then they fell out, because there wasn’t any kind of security arrangements that both were happy with.

Now, this war is the world’s largest at the moment. It’s the world’s largest hunger crisis, world’s largest humanitarian crisis, world’s largest displacement crisis, and, as we heard in your report, charged with the world’s largest protection crisis, because of the number of women and girls, in particular, who are being exposed to gender — sexual, gender-based violence. And this war, really, to a lot of people, seems like a nonsensical conflict, because the level of fighting cannot possibly justify any political machinations of either of the two sides. But this war has now mushroomed into something much, much larger. Almost every part of Sudan is somehow impacted by this war.

People who pushed against military rule in the revolution of 2018 and 2019 are, by and large, the parts of the society that are facing the most repression from both the Sudanese Armed Forces and the Rapid Support Forces. And so we’re seeing really a war against civilians. While the SAF and the RSF are fighting each other, they’re really fighting the people of Sudan. And that’s why you get the nursery killings that we saw last week. You see barrel bombs being used by the Sudanese Armed Forces against largely civilian sites. You see the mass atrocity and genocide that’s taking place in Darfur. And all of that really can be described, I think, best, for people unfamiliar with the story, as a means for the security services in Sudan, both the Sudanese Armed Forces and the Rapid Support Forces, to really try and kill any kind of revolutionary zeal in Sudan and to make sure that they pave the way for their vision of military rule.

AMY GOODMAN : Can you talk about the role of the United Arab Emirates? What’s their interest in backing the RSF ? And then talk about how and why former Colombian military personnel came to fight alongside the RSF . Talk about the international dimensions of this.

KHOLOOD KHAIR : Sure. Well, increasingly, as the war continues, we get to see more and more of these proxy elements, and the most obvious one has been the United Arab Emirates. It, of course, denies supporting the RSF , but the United States’ own intelligence community, the United Nations’ panel of experts on Darfur have all shown that the UAE has been supporting the RSF pretty much from the outset of the war, and probably before that. The UAE has been familiar with the RSF for some time. The RSF and the Sudanese Armed Forces were part of the Saudi-Emirati Coalition on Yemen, and that’s where their relationship really started.

But the UAE now is interested in land in Sudan, arable land, fertile land for agriculture. It’s interested in supply lines that go through the western part of Sudan and the southern part of Sudan, that the RSF largely controls. It has some interest in Red Sea access. It is also interested in being — having some kind of influence over the Red Sea, which, of course, is a very large and very important commercial zone. And because of that, it has given the RSF huge amounts, huge volumes of weapons, and very sophisticated weapons, from as far away as China. But there are also allegations that German, Swedish, British, American and Canadian weaponry that has been sold to the United Arab Emirates has found its way to the — in the RSF’s hands in places like Darfur.

Now, what we’ve seen recently is an uptick of mercenary action that is reported to have come through the Global Security Services Group, a UAE -based company, that gets particularly Colombian mercenaries that have been phased out of the Colombian — Colombian soldiers who have been phased out of the Colombian military since the 2016 peace agreement in Colombia. And those people have effectively found new livelihood sources through this UAE -based company, and most of them have now found themselves in Darfur. Now, there are some reports that these mercenaries will continue to be part of the coalition in Sudan, as we have seen them in Yemen and as we have seen them in Libya. So, this is part of a broader UAE security infrastructure that’s been put in place that we’re now seeing brought to bear in Sudan.

AMY GOODMAN : And the U.S. Treasury announcing sanctions against four people and four companies accused of aiding the RSF by enlisting Colombian mercenaries? Can you talk about the role of the U.S., which is increasingly allying with Saudi Arabia, UAE , Qatar, UAE being the backer of one of the sides, of the RSF , what power they have here?

KHOLOOD KHAIR : So, the U.S. has a lot of power. The question is: Will they use it? Because Sudan is — even though it is the world’s largest conflict by scale right now, it is not very important, has not been a priority country for the United States, which means that increasingly what we’re seeing is that the U.S. allies are able to, you know, be involved in the war in Sudan, whether it’s by supporting one of the armed actors or, in the case of Egypt and Turkey, by increasing its weapons support to, for example, the Sudanese Armed Forces, and really allowing for these proxy elements to take part in order to keep their allies in the region happy.

And, you know, the biggest country that’s sort of — the biggest priority, I should say, in the region for the United States is, of course, Israel. And here we see that Arab countries, particularly the United Arab Emirates, that, of course, is very close to Israel, is an ally of Israel in the region, probably one of the few, has really been able to use that relationship, to leverage that relationship against Washington in terms of what it can get away with, as far as the United States is concerned. And this is what puts Sudan, unfortunately, in a very difficult position. And the rights and sort of the, you know, potential ability for civilians in Sudan to get access, to get an end to the fighting, to get a peace deal, some kind of ceasefire, all of that is complicated by the regional picture, and in particular the interests of American allies.

AMY GOODMAN : The United Nations is saying more resources are needed to adequately address the humanitarian crisis. The U.N. High Commissioner for Refugees Filippo Grandi says the Sudan response plan is only one-third-funded, due largely to Western donor cuts.

FILIPPO GRANDI : What is very real is that people are fleeing this advance of the RSF . I was in a place called Al Dabbah. This is in the so-called Northern state, north of Khartoum, where there is a smaller camp. You know, the biggest camp is Tawila, taking people from El Fasher. This is a smaller camp, taking people also from El Fasher, but also from Kordofan and other places. And their stories are, unfortunately, all the same: rape, murder, forced recruitment of children, separation of families and sheer robbery. … We are barely responding. I have to say, in the site — I only visited this particular site, which is not very big, about 11,000, 12,000 people, but arrivals all the time. We saw people just arrive, literally.

AMY GOODMAN : So, that’s the U.N. High Commissioner for Refugees Filippo Grandi. We’re bringing in now Nathaniel Raymond, executive director of the Humanitarian Research Lab at the Yale School of Public Health, which is monitoring El Fasher. Nathaniel, we last had you on weeks ago. Explain the latest findings of your lab. You said El Fasher is beginning to look a lot like a slaughterhouse. We haven’t spoken to you since, for example, the kindergarten was attacked last week, with over 40 children killed.

NATHANIEL RAYMOND : What we’re seeing, through very high-resolution satellite imagery, is at least 140 large piles of bodies that appear at the end of October into early November, and we see basically a pattern of activity by the Rapid Support Forces that indicates they’ve been burning and burying bodies for almost the better part of five weeks. Meanwhile, we see none of the pattern of life that we expect to see in a place with civilians. There’s grass growing in the main market in El Fasher. There’s no activity at the water points or in the streets. And there’s no sign of civilian vehicles, such as donkey carts or cars. Basically, we see a ghost town, where the only visible activity is Rapid Support Forces in what’s called their technicals, their armed pickup trucks, moving objects consistent with human remains around, burying them and burning them.

AMY GOODMAN : Moving from what’s happening now, the horror we’re hearing described, to the International Criminal Court in The Hague, they’ve just sentenced the former Janjaweed leader Ali Muhammad Ali Abd-Al-Rahman, known as Ali Kushayb, to 20 years in prison for atrocities committed in the Darfur region like 20 years ago, in 2003 and ’04. He faced 31 counts of war crimes and crimes against humanity, turned himself in, transferred to ICC custody in 2020. The RSF is largely seen as successors to the Janjaweed. This is the presiding ICC Judge Joanna Korner.

JUDGE JOANNA KORNER : Abd-Al-Rahman’s conviction is the first acknowledgment that the people of Darfur were not victims of mere intertribal conflict or something akin to that. They were victims of a deliberate campaign, the chamber made it very clear, orchestrated by those in power, executed by the Janjaweed, led by Mr. Abd-Al-Rahman in the Wadi Salih region, under the authority of the government of Sudan, even if not specifically ordered by anyone in particular.

AMY GOODMAN : So, that was the presiding judge of the International Criminal Court sentencing the Janjaweed leader. This is crimes committed over 20 years ago. You grew up in Sudan, Kholood Khair. Is it proper to think of, as we wrap up this segment, the RSF as the kind of successors of the Janjaweed? And the significance of this sentencing?

KHOLOOD KHAIR : Absolutely. I mean, I think, first and foremost, that a lot of people will feel that 20 years is far too little for what Ali Kushayb has committed, and others will feel that it’s not enough, of course. There are four other indictees, including the former president, President Omar al-Bashir, and another very key indictee called Ahmed Haroun, who, according to the ICC , was working very closely with Ali Kushayb. Now, the issue is that both Omar al-Bashir and Ahmed Haroun are currently in Sudan, and reports say that they’re being protected by the Sudanese Armed Forces.

And this just shows you the extent to which neither the RSF nor the SAF want to see justice done in Sudan for previous crimes, or indeed for current crimes, and have blocked every single justice mechanism that we have seen. That said, Darfuri people and communities that I speak to say that at least now we’re seeing some kind of justice, some kind of recompense at the global stage, because we’re not going to get it at the national stage. No government in Sudan has ever been interested in bringing about justice for particularly those from places like Darfur and the Kordofans. So, it’s some measure of justice, but by no means enough.

AMY GOODMAN : You left just a year or two ago from Sudan. You travel the world. You talk about the situation in Sudan. What do you think, as we wrap up, are the biggest misconceptions? And what’s the most important action that must be taken now?

KHOLOOD KHAIR : I think people get very much invested in the military elements of this war — who has gained what ground, you know, which actor is potentially on the road to winning it, or isn’t. What we have seen in Sudan’s history is that no military actor has ever won a war outright, whether that’s the central Sudanese Armed Forces or any group that they have fought, no matter how strong they are. And so, investing in a military victory, investing in this narrative that we will get some kind of victor, is probably not going to serve us.

I think we need to focus on the victims. And as you said, the U.N. envelope for humanitarian aid is very poorly resourced. It’s, I think, about 16% funded. And those are the people we need to focus on. There are emergency response rooms and mutual aid groups. These are volunteer-led groups of civil society actors that are at the brunt, at the forefront of the humanitarian relief, and nobody is really looking at them, I think, sufficiently. Nobody is helping them. No one is putting money and resources to them to enable them to save lives. And they have won a string of awards. They have been nominated for the Nobel Peace Prize twice. But we have not seen that translate to political support, and we haven’t seen that translate sufficiently to financial support. I think investing in those groups, both for the humanitarian response and for, you know, frankly, allowing them to weave back the social fabric that this war has ripped apart, I think that is a much better investment of time than focusing on the belligerent parties.

AMY GOODMAN : I want to thank you, Kholood Khair, for being with us, Sudanese political analyst, head of the Confluence Advisory. It’s great to have you in our studio.

KHOLOOD KHAIR : Thank you.

AMY GOODMAN : And Nathaniel Raymond, executive director of Humanitarian Research Lab at the Yale School of Public Health, monitoring El Fasher. We’ll link to your reports at democracynow.org.

Coming up, torture and enforced disappearances at ICE jails in Florida, from the Everglades to Krome. And then we will look at the case of the mother of the nephew of the White House press spokesperson. She was just released from an ICE jail yesterday. Stay with us.

[break]

AMY GOODMAN : Alice Gerrard and band singing “When I Loved You” at the Brooklyn Folk Festival.

The original content of this program is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License . Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.

‘What to buy Dad for Christmas’: is retail ready for the AI shopping shift?

Guardian
www.theguardian.com
2025-12-10 13:00:01
As shoppers ask ChatGPT for inspiration, brands scramble to ensure their products appeal to the bots calling the shotsConsumer test drive: can AI do your Xmas gift shopping for you?Christmas shopping – some love it, to others it’s a chore, and this year for the first time many of us will outsource t...
Original Article

Christmas shopping – some love it, to others it’s a chore, and this year for the first time many of us will outsource the annual task of coming up with gift ideas to artificial intelligence.

While traditional internet search, social media – especially TikTok and Instagram – and simply wandering a local high street will still be the main routes to presents for most this year, about a quarter of people in the UK are already using AI to find the right products, according to PricewaterhouseCoopers.

For brands appealing to younger people, the revolution is well under way: the rival advisory firm KPMG says as many as 30% of shoppers aged 25-34 are using AI to find products, compared with 1% of those aged over 65.

Asking a large language model (LLM) such as ChatGPT or Gemini what you should get your father-in-law – rather than typing “whisky” or “socks” into Google or DuckDuckGo – may seem a small change in habits. However, it marks a sea change for retailers accustomed to paying search engines to promote their listings.

LLMs allow users to ask questions in conversational language, perhaps by speaking into their computer or phone. Instead of just providing a list of links, they offer specific suggestions with the potential for big sales for items that are regularly recommended.

The chatbots produce their responses by scraping the internet and inbuilt datasets for relevant information, with some sources given more trusted status than others.

Companies large and small are scrambling to adapt to this new world where the keywords and advertising deals previously central to web marketing hold less importance than the reviewers’ opinions, accurate availability information and product details read by LLMs such as OpenAI’s ChatGPT, Google’s Gemini and Meta’s Llama.

The shake-up may create an opening for independent businesses to cut through online, but some big brands are concerned they will be lost in a wild west where it is unclear how to reach the consumer. Marketers must now appeal not only to shoppers directly but also to their AI bots.

“Retailers can’t buy their way into the search – they have to earn it,” says Emma Ford, the director of digital transformation for PwC UK. “The experience, expertise, authenticity and trustworthiness [of a brand online] help. Sentiment across the internet is really important.”

Several large UK retailers have told the Guardian they already have teams on the case looking at a wide variety of tactics, from making sure they appear in Reddit forums – a key source for some platforms – to responding to reviews on Google or Trustpilot, and ensuring AI models can access the correct product data.

While some say they are being cautious with resources, amid signs certain individual LLMs could disappear as rapidly as they have sprung up, the belief is that this new way of interacting online is here to stay.

Nickyl Raithatha, the chief executive of the online card and gift seller Moonpig, says AI search’s relevance for companies this year is relatively low but his company is well-prepared for rapid change.

He says Moonpig is ensuring its products are picked up in AI search by using generative engine optimisation (GEO) techniques such as “online content with people discussing the best way to make someone happy on Mothers Day” in its own content or on discussion boards and in YouTube videos. He adds: “There is a growing science around this and we are all learning.”

Ford says businesses are still feeling their way into the nuances of how the technology will find and respond to their online presence. Online reviews, for example, are clearly a factor in AI decision-making, but it is not clear how much importance is placed on particular platforms or how they rank against other factors such as reliable availability data, longevity of a brand or secure payment options.

It may be that suppliers that have been around longer and have a broader profile are foregrounded, but their long history of ups and downs could also play against them.

“I do think AI will change retail for the next 20 years,” says Peter Ruis, the managing director of John Lewis. He contends that established brands such as his will benefit from having a strong reputation with the technology in place to sell online, while shoppers could discover they stock items previously assumed to have been available only at a specialist.

In future, industry watchers believe, ChatGPT, Amazon and Google are likely to try to monetise their AI platforms with some form of paid search or featured ads.

More sophisticated “AI agent” models are also being developed – bots that can autonomously perform complex multistage tasks such as seeking out the best deals, placing orders and organising delivery.

For example, it could be possible for these digital secretaries to negotiate offers tailored to particular customers, such as bundling together a number of furniture purchases from various outlets during a move, which have been customised to fit budget, style and delivery preferences, according to the advisory firm McKinsey.


That could lead retailers into allowing their systems to flex product prices to attract particular searchers.

Organising the return of an unwanted item could also be taken on by AI agents, with one acting on behalf of the shoppers and the other for the retailer.

However, such technology is fraught with potential pitfalls. Retailers will need systems that can cope with a potential flurry of queries and to have clear rules on who might be responsible for glitches such as unwanted purchases made by a bot.

In the US, the online marketplace Etsy was the first to team up with ChatGPT to make it possible to pay for goods via the LLM’s instant checkout service. The e-commerce platform Shopify and the retailers Walmart and Target swiftly followed. While the deals do not appear to prioritise their products in searches, the inclusion of a “buy” button for their goods could put them ahead of the pack.

Anna Bancroft, a partner in PwC’s digital transformation team, points out that under current UK rules it is not possible for an AI bot to make a purchase on behalf of a human, and regulation would need to change for such systems to run without human oversight. She says retailers and shoppers are cautious about giving the robots access to customer data and handling payment.

There are also concerns about agents being susceptible to manipulation, as Microsoft has found in research simulations. Meanwhile, tech retail players are becoming territorial about who gets to crawl whose data.

Last month, Amazon sued the AI company Perplexity over its shopping feature that automates placing orders for users. Amazon accused the startup of covertly accessing customer accounts and disguising AI activity as human browsing. Perplexity hit back, defending users’ right to delegate their shopping to AI agents and calling the suit “a bully tactic to suppress competition”.

In this rapidly shifting landscape, Ford suggests independent retailers may have a chance to shine. “Independents have potential to go faster,” she says, with the ability to respond nimbly without having to sign off large budgets.

Michelle Ovens, the founder of Small Business Britain, which advises independent retailers on how to survive on the changing high street, agrees. “[Independent businesses] don’t necessarily need to spend a lot of money. You don’t necessarily need a big team,” she says.

Ovens advises local shopkeepers to ask AI platforms themselves how best to make sure they can appear. “Be clear about who you are,” she says, with a description making clear that you are an independent specialist, up-to-date pictures and “encourage customers who have got experience of the brand to give a good review”.

However, all of this should not stand ahead of making a website engaging and easy to shop on, Ovens adds. “There will not be a dramatic shift this Christmas. We’ll see change over time and operators will rise to the challenge.”

Headlines for December 10, 2025

Democracy Now!
www.democracynow.org
2025-12-10 13:00:00
Hamas Calls for Greater International Pressure on Israel Before Agreeing to Next Phase of Ceasefire Deal, UAE-Backed Forces Claim Control of Oil-Rich Southern Half of Yemen, Trump Presses Zelensky to Accept a Peace Deal with Russia, Nobel Peace Prize Winner Machado Travels to Oslo But Unable to Atte...
Original Article


Headlines December 10, 2025


Hamas Calls for Greater International Pressure on Israel Before Agreeing to Next Phase of Ceasefire Deal

Dec 10, 2025

Hamas is calling for greater international pressure on Israel before agreeing to the next phase of the U.S.-brokered Gaza ceasefire. Hamas says Israel must open key border crossings, halt military strikes and home demolitions, and allow far more humanitarian aid into the besieged enclave. Palestinian health officials report that since the ceasefire took effect on October 10, Israel has killed at least 376 Palestinians. Meanwhile, Israel continues to bar international journalists from independently entering Gaza, after the country’s top court on Tuesday delayed a legal challenge seeking to overturn the media restrictions. Meanwhile, UNICEF says that 9,300 children in Gaza were treated for severe acute malnutrition in October when the first phase of the ceasefire deal came into effect. This is UNICEF spokesperson Tess Ingram.

Tess Ingram : “Mothers cannot afford to buy their children the nutritious food that’s available in the markets. Fruits and vegetables, which are now here, remain very expensive, and animal products like dairy and meat are even more so. For example, a UNICEF market survey done in November found that meat still, on average, costs about U.S. $20 a kilo, so most families can’t access this. And that’s why we’re still seeing high rates of malnutrition.”

UAE -Backed Forces Claim Control of Oil-Rich Southern Half of Yemen

Dec 10, 2025

In news from Yemen, forces backed by the United Arab Emirates have claimed control of the oil-rich southern half of Yemen, including the city of Aden. Analysts say the military advance could result in renewed fighting between UAE - and Saudi-backed groups, as well as southern Yemen possibly becoming an independent country again.

Trump Presses Zelensky to Accept a Peace Deal with Russia

Dec 10, 2025

President Trump is publicly pressing Ukrainian President Volodymyr Zelensky to accept a peace deal, saying Ukraine is “losing” the war with Russia. In an interview with Politico, Trump also said it was time for Ukraine to hold elections. Zelensky responded by saying Ukraine could soon be ready for long-delayed elections if the U.S. ensures security.

Nobel Peace Prize Winner Machado Travels to Oslo But Unable to Attend Award Ceremony

Dec 10, 2025

In Oslo, Norway, the 2025 Nobel Peace Prize was awarded to the right-wing Venezuelan opposition leader María Corina Machado, but she did not attend the ceremony. Her daughter accepted the prize on her behalf. Ahead of the ceremony, Machado said she was heading to Oslo but would not arrive in time for the event. Machado has been in hiding for the past year. On Tuesday night, hundreds of protesters marched in Oslo to condemn the selection of Machado, who has supported Trump’s threats against the Venezuelan government. In October, she dedicated the peace prize to President Trump.

Two U.S. F-18 Fighter Jets Enter Venezuelan Airspace for 40 Minutes

Dec 10, 2025


In related news, two U.S. F-18 fighter jets entered Venezuelan airspace for 40 minutes on Tuesday as the U.S. escalates its threats against the Maduro government. The jets circled the Gulf of Venezuela near the city of Maracaibo.
Meanwhile, the American Civil Liberties Union and the Center for Constitutional Rights have sued the Trump administration, seeking the release of the secret legal memo that has been used to justify the U.S. campaign targeting alleged drug boats in the Caribbean and the Pacific. The U.S. has struck at least 22 boats, killing 87 civilians, since September. On Tuesday, Defense Secretary Pete Hegseth gave a classified briefing to members of Congress, but he refused to commit to show lawmakers the full unedited video of a September 2 strike on two shipwrecked men who had survived an earlier U.S. strike that killed nine.

Honduran President Castro Accuses Trump of Interfering in Presidential Election

Dec 10, 2025

Honduran President Xiomara Castro has accused President Trump of interfering in Honduras’s recent election and said an “electoral coup” is occurring. Honduran election officials are still processing ballots from the November 30 election after numerous delays. The Trump-backed candidate Nasry Asfura has a narrow lead over Salvador Nasralla, who has also alleged fraud. President Xiomara Castro spoke on Tuesday.

President Xiomara Castro : “In this election, the people were subjected to coercion, blackmail, extortion, tricks, fraud and the manipulation of the preliminary results transmission system. These threats are a direct attack on the popular will.”

Trump Attacks Rep. Omar in Racist, Expletive-Filled Rant While Calling for More Immigration from Northern Europe

Dec 10, 2025

At a rally in Pennsylvania Tuesday, President Trump again attacked Democratic Congressmember Ilhan Omar in a racist, expletive-filled rant, while calling for more immigration from northern European nations.

President Donald Trump : “We had a meeting, and I said, 'Why is it we only take people from shithole countries?' Right? Why can’t we have some people from Norway, Sweden? Just a few. Let’s have a few, from — from Denmark. Do you mind sending us a few people? Send us some nice people. … Ilhan Omar, whatever the hell her name is, with the little — with the little turban, I love her. She comes in, does nothing but bitch.”

During the speech, President Trump also called concerns over affordability a “hoax.” In response, Democratic Congressmember Ilhan Omar posted on social media, “Trump’s obsession with me is beyond weird. He needs serious help. Since he has no economic policies to tout, he’s resorting to regurgitating bigoted lies instead. He continues to be a national embarrassment.”

Florida Governor DeSantis Designates CAIR a Foreign Terrorist Organization

Dec 10, 2025

Florida’s Republican Governor Ron DeSantis signed an executive order Monday declaring the Council on American-Islamic Relations ( CAIR ) a foreign terrorist organization. The move follows a similar declaration issued last month by Texas’s Republican Governor Greg Abbott. Like in Texas, the Florida executive order also designates the Muslim Brotherhood a foreign terrorist organization. Here’s Imran Ghani, director of CAIR’s Houston chapter.

Imran Ghani : “So, these two governors are fomenting anti-Muslim hate, bigotry, and these accusations are totally conspiracy-based and done to stoke fear of Muslims. … From a human perspective, it continues to make Muslims, who are part of the cultural fabric of American society — it others us. Muslims have been around for hundreds of years.”

Texas Governor Abbott Pushes Turning Point USA in Public High Schools Across the State

Dec 10, 2025

Texas’s Republican Governor Greg Abbott announced that the state will be partnering with Turning Point USA to establish chapters of the right-wing youth organization launched by the late Charlie Kirk in every high school in Texas. Governor Abbott said, “Let me be clear: Any school that stands in the way of a Club America program in their school should be reported immediately to the Texas Education Agency.” Turning Point USA’s high school chapters are called Club America. The move follows similar efforts by state officials in Oklahoma and Florida to establish Turning Point USA chapters in high schools.

One Student Dead and Another in Critical Condition After Shooting at Kentucky State University

Dec 10, 2025

One student is dead and another was in critical condition after a shooting at Kentucky State University on Tuesday. Officials say a suspect, who is not a student at the school, is in custody after the incident. According to the Gun Violence Archive, there have been 387 mass shootings in the U.S. so far this year.

Education Department Moves to End Biden’s Student Loan Repayment Program

Dec 10, 2025

The Trump administration announced that it has reached an agreement with several Republican-led states to end former President Biden’s student loan repayment program. The program, called SAVE , which stands for Saving on a Valuable Education, is an income-driven repayment program which currently has more than 7 million borrowers. In a statement, the Education Department said it plans to stop all new enrollments under the plan, deny any pending applications and transition borrowers into other repayment plans. Natalia Abrams, president of the Student Debt Crisis Center, said, “Borrowers need real relief and stability, not a return to unaffordable, costly student loan payments that push them closer to financial crisis.”

Judge Rules Tufts Student Rümeysa Öztürk Can Resume Teaching and Conducting Research

Dec 10, 2025


A federal judge has ruled Tufts University Ph.D. student Rümeysa Öztürk can resume teaching and conducting research. In March, the Turkish-born student was abducted by masked immigration agents near Boston and then sent to an ICE jail in Louisiana, where she was held for six weeks. She had been targeted for co-writing a student article on Gaza. Up until now she had not been able to teach or do research because the Trump administration had revoked her visa.

Illinois Governor Pritzker Signs Bills to Protect Immigrants from ICE Agents

Dec 10, 2025

Illinois Governor JB Pritzker has signed bills aimed at preventing federal immigration agents from making arrests near courthouses, hospitals or colleges. Another new law will also make it easier for people to sue federal agents if their constitutional rights have been violated. Meanwhile, a coalition of civil and immigrant rights groups is calling for the immediate closure of an ICE detention jail at Fort Bliss in El Paso, Texas. The groups allege detained immigrants have been beaten and sexually abused, while being denied adequate medical care and food.

Alina Habba Resigns as U.S. Attorney in New Jersey

Dec 10, 2025

President Trump’s former personal lawyer Alina Habba, who was installed as U.S. attorney for New Jersey, has resigned after a panel of federal judges ruled that she was serving in her position unlawfully. Attorney General Pam Bondi said Habba would remain at the Justice Department to serve as a senior adviser.

Miami Elects First Democratic Mayor in Nearly 30 Years

Dec 10, 2025

In election news, voters in Miami, Florida, have elected a Democratic mayor for the first time in nearly 30 years. In a stunning upset, former County Commissioner Eileen Higgins received about 59% of the vote, defeating Republican Emilio González, who had been endorsed by Trump. Eileen Higgins will become Miami’s first female mayor.
In another setback for Republicans, in Georgia, Democrat Eric Gisler flipped a state House seat in a district Trump won by double digits last year.

NYC Comptroller Brad Lander Launches Primary Challenge Against Rep. Dan Goldman

Dec 10, 2025


New York City Comptroller Brad Lander is launching a primary challenge against Democratic Congressmember Dan Goldman. The congressional district covers the southern part of Manhattan and parts of Brooklyn. New York City Mayor-elect Zohran Mamdani endorsed Lander, saying, “He has been a trusted ally and partner of mine and I’m proud to support him as I know he’ll continue delivering for those who need government to show up for them the most.”



McDonald's pulls AI Christmas ad after backlash

Hacker News
www.bbc.co.uk
2025-12-10 12:57:38
Comments...
Original Article

McDonald's has taken down a Christmas advert made with Artificial Intelligence (AI) following online backlash.

The 45-second advert was produced with generative AI clips and released publicly on McDonald's Netherlands YouTube channel on 6 December.

Viewers on social media denounced the use of AI in the film, with one commenter calling it "the most god-awful ad I've seen this year".

On 9 December McDonald's Netherlands removed the video, adding in a statement to BBC News that the moment served as "an important learning" as the company explored "the effective use of AI".

The advert was created for McDonald's by Dutch company TBWA\Neboko and US production company The Sweetshop.

Adverts which include generative AI have become a growing trend among major brands, such as Coca-Cola, particularly for the Christmas season.

The McDonald's advert depicted things that can go wrong during the Christmas break, using the slogan "the most terrible time of the year", and suggesting the time was better spent in the company of the fast food giant.

Following its release, viewers criticised the film's uncanny-looking characters and large number of stitched together clips, calling it "creepy" and "poorly edited".

As clips made using generative AI are more likely to distort the longer they run for - most clips made using the process tend to be roughly six to 10 seconds long - even a 45-second advert would likely consist of many videos edited together.

The video also provoked concerns about job displacement in the industry, with one Instagram comment noting: "No actors, no camera team..welcome to the future of filmmaking. And it sucks."

Following the video being made private on the McDonald's Netherlands YouTube channel, The Sweetshop's chief executive Melanie Bridge defended the advert.

As quoted in Futurism, she said the production process took "seven weeks" where the team "hardly slept" and created "thousands of takes - then shaped them in the edit just as we would on any high-craft production".

"This wasn't an AI trick," she said. "It was a film."

In a statement to BBC News, McDonald's Netherlands said the video was meant to "reflect the stressful moments that can occur during the holidays" but had decided to remove the advert.

"This moment serves as an important learning as we explore the effective use of AI," it said.

Where normally a high-publicity Christmas campaign could take up to a year to pull off, companies have begun to look to firms which can produce films in a much shorter time span, using prompts from generative AI tools to create new video content.

Coca-Cola seems to have been able to sway at least some of the general public with its second AI-generated Christmas ad in a row.

While the use of AI to create the advert has been divisive, a report from analytics company Social Sprout found it had a 61% "positive sentiment rating" from commenters online.

But several other businesses such as the Italian luxury fashion house Valentino have come under fire for using the technique in their campaigns, with critics calling Valentino's advert "cheap" and "lazy".

BBC News has contacted The Sweetshop and TBWA\Neboko for comment.

US could ask foreign tourists for five-year social media history before entry

Hacker News
www.bbc.co.uk
2025-12-10 12:37:30
Comments...
Original Article

Tourists from dozens of countries including the UK could be asked to provide a five-year social media history as a condition of entry to the United States, under a new proposal unveiled by American officials.

The new condition would affect people from dozens of countries who are eligible to visit the US for 90 days without a visa, as long as they have filled out an Electronic System for Travel Authorization (ESTA) form.

Since returning to the White House in January, President Donald Trump has moved to toughen US borders more generally - citing national security as a key reason.

Analysts say the new plan could pose an obstacle to potential visitors, or harm their digital rights.

The US expects a major influx of foreign tourists next year, as it hosts the men's football World Cup alongside Canada and Mexico, and for the 2028 Olympics in Los Angeles.

The proposal document was filed by Customs and Border Protection (CBP) and the Department of Homeland Security (DHS), of which the agency is part.

US media reported that it appeared in the Federal Register, which is the official journal of the US government. The BBC has asked DHS for comment.

The proposal says "the data element will require ESTA applicants to provide their social media from the last 5 years", without giving further details of which specific information will be required.

The existing ESTA requires a comparatively limited amount of information from travellers, as well as a one-off payment of $40 (£30). It is accessible to citizens of about 40 countries - including the UK, Ireland, France, Australia and Japan - and allows them to visit the US multiple times during a two-year period.

As well as the collection of social media information, the new document proposes the gathering of an applicant's telephone numbers and email addresses used over the last five and 10 years respectively, and more information about their family members.

The text cites an executive order from Trump in January, titled "Protecting the United States From Foreign Terrorists and Other National Security and Public Safety Threats".

The Trump administration has previously required foreign nationals to make their social media accounts public if they are applying for student visas or H1B visas for skilled workers - the latter of which now also entail a much higher fee.

A senior state department official said of the student visa policy: "It is an expectation from American citizens that their government will make every effort to make our country safer, and that is exactly what the Trump Administration is doing every single day."

Officers were instructed to screen for those "who advocate for, aid, or support designated foreign terrorists and other threats to national security; or who perpetrate unlawful anti-Semitic harassment or violence".

As part of the administration's broader effort to toughen borders, officials recently said an existing travel ban - affecting 19 countries in Africa, the Middle East and the Caribbean - could soon be expanded.

That move was announced in the wake of a shooting attack on two National Guard members in Washington DC, in which an Afghan man has been named as the suspect.

The new proposal regarding ESTA data collection for tourists invites views from the public for 60 days.

Sophia Cope, of digital rights organisation the Electronic Frontier Foundation, criticised the plan, telling the New York Times that it could "exacerbate civil liberties harms".

Meanwhile, immigration law practice Fragomen suggested there could be practical impacts as applicants could face longer waits for ESTA approvals.

Experts have previously suggested that the changes to travel policies introduced under Trump have had an impact on the American tourism industry.

Earlier this year, the World Travel & Tourism Council said the US was the only one of 184 economies that it analysed that was expected to see a decline in international visitor spending in 2025.

Other Trump administration policies have also appeared to impact tourism to the country, such as many Canadians boycotting US travel as a form of protest against Trump's tariffs.

October marked the 10th straight month of decline in the number of Canadian travellers to the US. In the past, Canadians have made up about a quarter of all international visitors to the US, spending more than $20bn (£15.1bn) a year, according to the US Travel Association.

Ukrainian hacker charged with helping Russian hacktivist groups

Bleeping Computer
www.bleepingcomputer.com
2025-12-10 12:26:32
U.S. prosecutors have charged a Ukrainian national for her role in cyberattacks targeting critical infrastructure worldwide, including U.S. water systems, election systems, and nuclear facilities, on behalf of Russian state-backed hacktivist groups. [...]...
Original Article


U.S. prosecutors have charged a Ukrainian national for her role in cyberattacks targeting critical infrastructure worldwide, including U.S. water systems, election systems, and nuclear facilities, on behalf of Russian state-backed hacktivist groups.

On Tuesday, 33-year-old Victoria Eduardovna Dubranova (also known as Vika, Tory, and SovaSonya) was arraigned on charges related to her alleged role in NoName057(16), after being extradited to the U.S. earlier this year for supporting CyberArmyofRussia_Reborn (CARR).

Dubranova has pleaded not guilty in both cases and is now scheduled for trial in February (on the NoName indictment) and April 2026 (on the CARR matter).

As the indictment reveals, NoName057(16) was a state-sanctioned project partially administered by multiple threat actors, as well as The Center for the Study and Network Monitoring of the Youth Environment (CISM), an information technology organization established by order of the Russian president in October 2018.

The NoName Russian hacktivist group developed a custom distributed denial-of-service (DDoS) tool called DDoSia, and recruited volunteers to use it in DDoS attacks against government agencies, financial institutions, and critical infrastructure, including railways and ports.

U.S. prosecutors also noted that the Main Directorate of the General Staff of the Armed Forces of the Russian Federation (GRU) founded, funded, and directed CARR, a pro-Russia hacktivist group with over 75,000 Telegram followers and more than 100 members (including teenagers), that claimed credit for hundreds of cyberattacks against victims worldwide.

CARR has attacked public drinking water systems across several U.S. states, causing damage to industrial controls and spilling hundreds of thousands of gallons of drinking water, and breached the systems of a Los Angeles meat processing facility in November 2024, triggering an ammonia leak and spoiling thousands of pounds of meat, according to the indictment . Additionally, the group targeted websites of nuclear regulatory entities and U.S. election infrastructure.

A GRU officer using the "Cyber_1ce_Killer" online handle instructed CARR leadership on targets and financed the group's access to distributed denial-of-service-for-hire services, the prosecutors added.

Victoria Eduardovna Dubranova (U.S. Justice Department)

If found guilty, Dubranova faces up to 27 years on the CARR charges and up to 5 years on the NoName charges.

"The defendant's illegal actions to tamper with the nation's public water systems put communities and the nation's drinking water resources at risk," said Craig Pritzlaff, Acting Assistant Administrator at the Environmental Protection Agency (EPA), in a Tuesday statement.

"These criminal charges serve as an unequivocal warning to malicious cyber actors in the U.S. and abroad: EPA's Criminal Investigation Division and our law enforcement partners will not tolerate threats to our nation's water infrastructure and will pursue justice against those who endanger the American public."

Yesterday, the U.S. State Department also announced rewards of up to $2 million for information on individuals associated with CARR and up to $10 million for any details on individuals linked with NoName.

Additionally, in a joint advisory with the FBI, NSA, European Cybercrime Centre (EC3), and various other cybersecurity and law enforcement agencies worldwide, CISA has warned that pro-Russia hacktivist groups, such as CARR, NoName, Z-Pentest, and Sector16, are targeting critical infrastructure organizations. These attacks can have varying degrees of impact, including the potential for physical damage.

In July 2024, the U.S. Treasury Department's Office of Foreign Assets Control (OFAC) also sanctioned two CARR members, Denis Olegovich Degtyarenko and Yuliya Vladimirovna Pankratova (the group's leader and a primary hacker), for cyberattacks against U.S. critical infrastructure.


FBI Warns of Fake Video Scams

Schneier
www.schneier.com
2025-12-10 12:05:37
The FBI is warning of AI-assisted fake kidnapping scams: Criminal actors typically will contact their victims through text message claiming they have kidnapped their loved one and demand a ransom be paid for their release. Oftentimes, the criminal actor will express significant claims of violence t...
Original Article

The FBI is warning of AI-assisted fake kidnapping scams:

Criminal actors typically will contact their victims through text message claiming they have kidnapped their loved one and demand a ransom be paid for their release. Oftentimes, the criminal actor will express significant claims of violence towards the loved one if the ransom is not paid immediately. The criminal actor will then send what appears to be a genuine photo or video of the victim’s loved one, which upon close inspection often reveals inaccuracies when compared to confirmed photos of the loved one. Examples of these inaccuracies include missing tattoos or scars and inaccurate body proportions. Criminal actors will sometimes purposefully send these photos using timed message features to limit the amount of time victims have to analyze the images.

Images, videos, audio: It can all be faked with AI. My guess is that this scam has a low probability of success, so criminals will be figuring out how to automate it.


Get in Line - superfast SPSC Queue

Lobsters
abhikja.in
2025-12-10 11:51:11
Comments...
Original Article

4-ish years back I started writing a bounded single-consumer single-producer queue for some project I don’t remember, but I couldn’t get a correct implementation. Earlier this week I needed the same queue again, and instead of relying on popular well-tested libraries out there and being a productive developer, I decided to write one down myself. The final implementation gives about 40GiB/s throughput and 80ns latency (though not both of these at the same time), which is about as fast as you can get on my M3 Mac. I only tested this on Apple Silicon CPUs, but it should work on other architectures too as there isn’t really any ARM-specific stuff in the implementation.

You can find the crate here, if you want to see the final code directly. I also re-wrote the queue in a separate git repo so that you can read the exact code for the benchmarks at each point in this post; you can find it here, and it is also linked after each section with the correct branch for that section.

Starting Simple

The simplest queue you could write would just be a VecDeque wrapped in Arc<Mutex<..>>:

struct Queue<T> {
    buffer: VecDeque<T>,
    capacity: usize,
}

struct Sender<T> {
    inner: Arc<Mutex<Queue<T>>>,
}

impl<T> Sender<T> {
    fn send(&self, item: T) {
        loop {
            let mut lock = self.inner.lock().unwrap();
            if lock.buffer.len() < lock.capacity {
                lock.buffer.push_back(item);
                break;
            }
        }
    }
}

struct Receiver<T> {
    inner: Arc<Mutex<Queue<T>>>,
}

impl<T> Receiver<T> {
    fn recv(&self) -> T {
        loop {
            if let Some(val) = self.inner.lock().unwrap().buffer.pop_front() {
                return val;
            }
        }
    }
}
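
As a quick usage sketch (not from the original post): assuming the definitions above live in one module with the usual std imports, and assuming a hypothetical channel(capacity) constructor that builds the shared Arc<Mutex<Queue<T>>> and hands out both halves, the two ends would be driven from separate threads roughly like this:

// Hypothetical constructor, not shown in the post.
fn channel<T>(capacity: usize) -> (Sender<T>, Receiver<T>) {
    let inner = Arc::new(Mutex::new(Queue {
        buffer: VecDeque::new(),
        capacity,
    }));
    (Sender { inner: inner.clone() }, Receiver { inner })
}

fn main() {
    let (tx, rx) = channel::<u64>(512);
    // Producer thread pushes items; the main thread consumes them.
    let producer = std::thread::spawn(move || {
        for i in 0..1_000u64 {
            tx.send(i);
        }
    });
    for _ in 0..1_000 {
        let _ = rx.recv();
    }
    producer.join().unwrap();
}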

the performance (…is terrible compared to what I claimed, we will get there):

latency/size_512 time: [1.3195 µs 1.5799 µs 2.1086 µs]

latency/size_4096 time: [1.9729 µs 3.1022 µs 3.9848 µs]

throughput/size_512 time: [10.119 ms 10.686 ms 11.473 ms]

thrpt: [8.7164 Melem/s 9.3578 Melem/s 9.8820 Melem/s]

thrpt: [133.00 MiB/s 142.79 MiB/s 150.79 MiB/s]

throughput/size_4096 time: [7.4096 ms 7.8135 ms 8.1513 ms]

thrpt: [12.268 Melem/s 12.798 Melem/s 13.496 Melem/s]

thrpt: [187.19 MiB/s 195.29 MiB/s 205.93 MiB/s]

Note that even though this is round-trip latency, one-way latency would not be that different, maybe slightly lower, because the atomics still have to be synchronized and travel from core to core.

The same test will be used later too, for consistency. size here is the capacity with which the queue was created.

We could try using RwLock and avoid holding an exclusive lock when reading head/tail, but that wouldn't bring any improvements: there are only two threads working over this queue and one of them always has to get write access, so the write lock effectively acts like a Mutex anyway.

Another major point is that these implementations run in a spin loop, which inflates the benchmarks but is really bad in real applications where you could be doing other work in the background. Though if waiting for or sending data is the only thing you can do, then these are fine.

Code at this point.

More details about the queue with RwLock , if you want to see the performance compared to the one with Mutex :

latency/size_512 time: [1.4813 µs 3.1652 µs 4.9153 µs]

change: [+5.9038% +71.260% +155.23%] (p = 0.05 > 0.05)

No change in performance detected.

latency/size_4096 time: [1.8180 µs 2.5355 µs 3.2314 µs]

change: [−25.547% +19.284% +92.522%] (p = 0.50 > 0.05)

No change in performance detected.

throughput/size_512 time: [26.399 ms 31.534 ms 35.914 ms]

thrpt: [2.7844 Melem/s 3.1712 Melem/s 3.7880 Melem/s]

thrpt: [42.487 MiB/s 48.389 MiB/s 57.800 MiB/s]

change:

time: [+155.24% +186.10% +221.52%] (p = 0.00 < 0.05)

thrpt: [−68.898% −65.047% −60.821%]

Performance has regressed.

throughput/size_4096 time: [4.8831 ms 8.7747 ms 12.557 ms]

thrpt: [7.9635 Melem/s 11.396 Melem/s 20.479 Melem/s]

thrpt: [121.51 MiB/s 173.89 MiB/s 312.48 MiB/s]

change:

time: [−33.908% +2.2474% +49.381%] (p = 0.93 > 0.05)

thrpt: [−33.057% −2.1980% +51.304%]

No change in performance detected.

Performance has slightly dropped because now we are trying to lock twice when sending.

Code with RwLock .

Condition Variables

How about we only wake the other thread up when there is actually some data to receive or some space to send into, instead of just trying to lock and unlock in a loop? We can use condition variables, which wake a thread up only when the other thread sends a signal over. Another nice thing is that we won't be wasting CPU spin-looping (benchmarks can be misleading, because good performance in benchmarks does not mean it will work well with the rest of the application you are writing).

struct Queue<T> {
    buffer: Mutex<VecDeque<T>>,
    condvar: Condvar,
    capacity: usize,
}

impl<T> Sender<T> {
    fn send(&self, item: T) {
        let mut buffer_lock = self.inner.buffer.lock().unwrap();
        loop {
            if buffer_lock.len() < self.inner.capacity {
                buffer_lock.push_back(item);
                break;
            }
            buffer_lock = self.inner.condvar.wait(buffer_lock).unwrap();
        }
        self.inner.condvar.notify_one();
    }
}

impl<T> Receiver<T> {
    fn recv(&self) -> T {
        let mut buffer_lock = self.inner.buffer.lock().unwrap();
        loop {
            if let Some(val) = buffer_lock.pop_front() {
                self.inner.condvar.notify_one();
                return val;
            }
            buffer_lock = self.inner.condvar.wait(buffer_lock).unwrap();
        }
    }
}

performance:

latency/size_512 time: [2.2194 µs 2.2302 µs 2.2409 µs]

change: [+14.922% +42.752% +75.530%] (p = 0.00 < 0.05)

Performance has regressed.

latency/size_4096 time: [2.2335 µs 2.2378 µs 2.2408 µs]

change: [−32.970% −10.855% +27.263%] (p = 0.55 > 0.05)

No change in performance detected.

throughput/size_512 time: [3.4748 ms 3.4836 ms 3.4912 ms]

thrpt: [28.643 Melem/s 28.706 Melem/s 28.778 Melem/s]

thrpt: [437.06 MiB/s 438.02 MiB/s 439.12 MiB/s]

change:

time: [−68.884% −66.884% −63.926%] (p = 0.00 < 0.05)

thrpt: [+177.21% +201.97% +221.38%]

Performance has improved.

throughput/size_4096 time: [3.4268 ms 3.4720 ms 3.5407 ms]

thrpt: [28.243 Melem/s 28.802 Melem/s 29.182 Melem/s]

thrpt: [430.95 MiB/s 439.48 MiB/s 445.27 MiB/s]

change:

time: [−59.404% −57.004% −54.640%] (p = 0.00 < 0.05)

thrpt: [+120.46% +132.58% +146.33%]

Performance has improved.

Compared to plain Mutex , there is a massive improvement in throughput. Latency is a bit worse, which is expected as we are doing some more work and not waiting in a tight(er) spin loop anymore. But we are also not burning CPU cycles.

If I wanted a simple, easy to understand, easy to maintain version, I would’ve stopped here. The performance is bad (especially compared to ones we will see later on), but this is enough for most small projects which just need a queue and don’t want to use an external library. Also this version is dead simple.

Though this is not good enough for production systems, and we can still do a lot better with very little complexity jump.

Code till now.

If we really think about it, do we really need to lock the entire queue at once to push or pop? As the buffer size is constant, there is no reallocation of memory and no change of pointers. The sender only cares about the tail until the queue is full and the receiver only cares about the head until the queue is empty. We could just put a separate lock on the head index and the tail index instead.

struct Queue<T> {
    buffer: *mut MaybeUninit<T>,
    head: Mutex<usize>,
    tail: Mutex<usize>,
    capacity: usize,
}

impl<T> Sender<T> {
    fn send(&self, item: T) {
        loop {
            let mut tail = self.inner.tail.lock().unwrap();
            let head = *self.inner.head.lock().unwrap();
            let next_tail = (*tail + 1) % self.inner.capacity;
            if next_tail != head {
                unsafe {
                    self.inner.buffer.add(*tail).write(std::mem::MaybeUninit::new(item));
                }
                *tail = next_tail;
                break;
            }
        }
    }
}

impl<T> Receiver<T> {
    fn recv(&self) -> T {
        loop {
            let mut head = self.inner.head.lock().unwrap();
            let tail = *self.inner.tail.lock().unwrap();
            if *head != tail {
                let val = unsafe {
                    self.inner.buffer.add(*head).read().assume_init()
                };
                *head = (*head + 1) % self.inner.capacity;
                return val;
            }
        }
    }
}

You may have noticed already, but this causes a deadlock when:

  1. the sender locks the tail
  2. the OS yeets the sender thread off the CPU
  3. the receiver locks the head
  4. now the sender is blocked waiting for the head, the receiver is blocked waiting for the tail, and neither thread can proceed.

We can fix that by always locking the tail first and then the head, in both threads. Still, this is just doing more work, locking two things instead of one. Performance is actually the worst so far:

latency/size_512 time: [4.2295 µs 4.3851 µs 4.6308 µs]

change: [+87.467% +95.116% +103.98%] (p = 0.00 < 0.05)

Performance has regressed.

latency/size_4096 time: [4.1396 µs 4.3851 µs 4.5334 µs]

change: [+88.570% +95.194% +101.23%] (p = 0.00 < 0.05)

Performance has regressed.

throughput/size_512 time: [805.15 ms 1.2389 s 1.6952 s]

thrpt: [58.991 Kelem/s 80.718 Kelem/s 124.20 Kelem/s]

thrpt: [921.74 KiB/s 1.2317 MiB/s 1.8952 MiB/s]

change:

time: [+23076% +35408% +49451%] (p = 0.00 < 0.05)

thrpt: [−99.798% −99.718% −99.569%]

Performance has regressed.

throughput/size_4096 time: [443.20 ms 581.29 ms 712.75 ms]

thrpt: [140.30 Kelem/s 172.03 Kelem/s 225.63 Kelem/s]

thrpt: [2.1408 MiB/s 2.6250 MiB/s 3.4429 MiB/s]

change:

time: [+13038% +16725% +20942%] (p = 0.00 < 0.05)

thrpt: [−99.525% −99.406% −99.239%]

Performance has regressed.

But this is the right direction (not obvious right now why that is the case, but will be explained later). If we can just avoid locking on both things, or at least avoid locking both most of the time, we have a shot at getting better performance.

Code till now.

Waiting to Sync (Shadow Variables)

The main cost after splitting the lock up is acquiring the locks, or more generally, syncing. One way to avoid that cost is to not sync as often. The producer doesn't need to know exactly where the head is at any given moment, just a value of the head that is not ahead of the real one, so that everything it writes lands in free space. As the consumer is the only one modifying the head, it is alright if the producer's copy of the head lags behind the real one. The same argument applies to the tail and the consumer. So we keep a "shadow" variable, which is occasionally synced with the real head and tail. The only time we actually sync the head/tail is when the shadow variables say it is no longer possible to push/pop more data.

struct Sender<T> {
    inner: Arc<Queue<T>>,
    tail: usize,
    head: usize,
}

impl<T> Sender<T> {
    fn send(&mut self, item: T) {
        loop {
            let next_tail = (self.tail + 1) % self.inner.capacity;
            if next_tail != self.head {
                unsafe {
                    self.inner.buffer.add(self.tail).write(std::mem::MaybeUninit::new(item));
                }
                let mut tail = self.inner.tail.lock().unwrap();
                *tail = next_tail;
                self.tail = next_tail;
                break;
            }
            self.head = *self.inner.head.lock().unwrap();
        }
    }
}

And we similarly modify the receiver too. Unfortunately this does not lead to any increase in performance compared to the simple version with just a single Mutex :

latency/size_512 time: [3.8723 µs 5.0873 µs 5.8944 µs]

change: [+58.229% +92.926% +132.40%] (p = 0.00 < 0.05)

Performance has regressed.

latency/size_4096 time: [4.3719 µs 5.0350 µs 5.7585 µs]

change: [+57.399% +115.36% +185.58%] (p = 0.00 < 0.05)

Performance has regressed.

throughput/size_512 time: [3.6511 ms 3.7586 ms 3.8652 ms]

thrpt: [25.872 Melem/s 26.606 Melem/s 27.389 Melem/s]

thrpt: [394.77 MiB/s 405.98 MiB/s 417.92 MiB/s]

change:

time: [+1.8870% +5.5301% +8.9882%] (p = 0.01 < 0.05)

thrpt: [−8.2470% −5.2403% −1.8521%]

Performance has regressed.

throughput/size_4096 time: [3.3533 ms 3.4144 ms 3.4818 ms]

thrpt: [28.721 Melem/s 29.287 Melem/s 29.822 Melem/s]

thrpt: [438.24 MiB/s 446.89 MiB/s 455.04 MiB/s]

change:

time: [−3.7307% −0.8104% +2.5110%] (p = 0.62 > 0.05)

thrpt: [−2.4495% +0.8170% +3.8752%]

No change in performance detected.

Code till now.

At least we got the throughput back. Looking at the flamegraphs, we can still clearly see that waiting for locks is consuming the bulk of the time.

There are various things that we can do to improve it further, like batching and increasing the size of the inner queue, but there aren’t any major “algorithmic” improvements left to do. We have exhausted the pure software logic; now we need to rely on better hardware primitives.

Atomics

We are wrapping integers in Mutex , which is quite a stupid and expensive thing to do for small, Copy -able data types like usize . Atomics are basically Mutex<usize> , but exploit the fact that a lot of operations on modern hardware are possible to do atomically on integers and thus we don’t need complicated mutual-exclusion things.

The standard way to use atomics is CAS (compare-and-swap) operations, which compare the current value before swapping in a new one; they run a "read-modify-write" loop, and the hardware supports doing that atomically. But because in our case only one thread modifies each integer (only the sender modifies the tail, only the receiver modifies the head), we don't actually need to compare and can directly store new values from the correct thread.

struct Queue<T> {
    buffer: *mut MaybeUninit<T>,
    head: AtomicUsize,
    tail: AtomicUsize,
    capacity: usize,
}

impl<T> Sender<T> {
    fn send(&mut self, item: T) {
        loop {
            let head = self.inner.head.load(Ordering::Acquire);
            let tail = self.inner.tail.load(Ordering::Acquire);
            let next_tail = (tail + 1) % self.inner.capacity;
            if next_tail != head {
                unsafe {
                    self.inner
                        .buffer
                        .add(tail)
                        .write(std::mem::MaybeUninit::new(item));
                }
                self.inner.tail.store(next_tail, Ordering::Release);
                break;
            }
        }
    }
}

impl<T> Receiver<T> {
    fn recv(&mut self) -> T {
        loop {
            let tail = self.inner.tail.load(Ordering::Acquire);
            let head = self.inner.head.load(Ordering::Acquire);
            if head != tail {
                let val = unsafe { self.inner.buffer.add(head).read().assume_init() };
                let next_head = (head + 1) % self.inner.capacity;
                self.inner.head.store(next_head, Ordering::Release);
                return val;
            }
        }
    }
}

Understanding Ordering in atomics: I am assuming that you know about Atomics and Memory Ordering, but if you don't (or need a refresher), here are some links:
  • Jon Gjengset's video - Crust of Rust: Atomics and Memory Ordering - long video but explains everything, including the reasoning for why atomics are useful. Great beginner resource.
  • The Rust standard library's official docs - if you prefer reading instead. Still goes into great depth.
  • cppreference.com's page on std::memory_order - if you want concise and dense docs for a quick refresher.

Just directly replacing Mutex with AtomicUsize in the basic version without shadowing gives great improvements:

latency/size_512 time: [128.05 ns 131.43 ns 137.80 ns]

change: [−97.355% −96.871% −96.202%] (p = 0.00 < 0.05)

Performance has improved.

latency/size_4096 time: [147.79 ns 151.98 ns 154.64 ns]

change: [−97.614% −96.935% −95.814%] (p = 0.00 < 0.05)

Performance has improved.

throughput/size_512 time: [2.0787 ms 2.0964 ms 2.1164 ms]

thrpt: [47.250 Melem/s 47.701 Melem/s 48.106 Melem/s]

thrpt: [720.98 MiB/s 727.86 MiB/s 734.04 MiB/s]

change:

time: [−44.700% −42.724% −40.683%] (p = 0.00 < 0.05)

thrpt: [+68.586% +74.593% +80.832%]

Performance has improved.

throughput/size_4096 time: [2.0233 ms 2.0299 ms 2.0360 ms]

thrpt: [49.117 Melem/s 49.264 Melem/s 49.423 Melem/s]

thrpt: [749.46 MiB/s 751.70 MiB/s 754.14 MiB/s]

change:

time: [−42.189% −40.825% −39.583%] (p = 0.00 < 0.05)

thrpt: [+65.517% +68.990% +72.976%]

Performance has improved.

We can still avoid some syncing via atomics by keeping a local copy of tail in receiver (and head in sender). Unfortunately this again makes no difference:

latency/size_512 time: [133.67 ns 136.56 ns 139.14 ns]

change: [−4.4807% −1.0442% +2.7171%] (p = 0.60 > 0.05)

No change in performance detected.

latency/size_4096 time: [130.44 ns 134.32 ns 137.54 ns]

change: [−14.212% −9.6386% −4.3449%] (p = 0.01 < 0.05)

Performance has improved.

throughput/size_512 time: [1.9316 ms 2.1315 ms 2.4965 ms]

thrpt: [40.055 Melem/s 46.915 Melem/s 51.772 Melem/s]

thrpt: [611.20 MiB/s 715.87 MiB/s 789.97 MiB/s]

change:

time: [−4.9192% +9.4505% +27.938%] (p = 0.32 > 0.05)

thrpt: [−21.837% −8.6345% +5.1737%]

No change in performance detected.

throughput/size_4096 time: [1.9885 ms 2.0915 ms 2.1651 ms]

thrpt: [46.188 Melem/s 47.812 Melem/s 50.290 Melem/s]

thrpt: [704.77 MiB/s 729.55 MiB/s 767.36 MiB/s]

change:

time: [−5.6552% −0.5175% +5.6131%] (p = 0.86 > 0.05)

thrpt: [−5.3148% +0.5202% +5.9942%]

No change in performance detected.

This is the code with just atomics, and this is code with shadowing.

Why did the shadowing not work? Honestly, I’ve got no clue. Logically it makes sense that reducing number of atomic ops should have an effect on performance somehow. My best guess is that adding shadowing didn’t actually reduce the number of atomic load/stores somehow.

EDIT: Also if you look under Extra stuff -> More shadowing of atomics for performance , you’ll see that my benchmarks at this point were biased towards contention-heavy usage, so probably this shadowing helped a lot with low-contention usage but I never benchmarked that.

Spin Loop Hint

Notice that when the queue is empty/full we are effectively busy-looping/spin-looping. It is possible that the same thread keeps loading the same atomic over and over and never gives the other thread a chance to store to or read from it. We can fix this by calling std::hint::spin_loop in the middle of the loop, right before the load.
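As an illustration (a sketch based on the simplified atomics version above, not the exact crate code), the hint sits on the "queue looks full" path of the sender's loop:

impl<T> Sender<T> {
    fn send(&mut self, item: T) {
        loop {
            let head = self.inner.head.load(Ordering::Acquire);
            let tail = self.inner.tail.load(Ordering::Acquire);
            let next_tail = (tail + 1) % self.inner.capacity;
            if next_tail != head {
                unsafe {
                    self.inner.buffer.add(tail).write(std::mem::MaybeUninit::new(item));
                }
                self.inner.tail.store(next_tail, Ordering::Release);
                break;
            }
            // The queue looked full: tell the CPU we are spinning so the other
            // core gets a better chance to make progress and update `head`.
            std::hint::spin_loop();
        }
    }
}

This leads to slightly better performance: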

latency/size_512 time: [134.81 ns 142.88 ns 151.85 ns]

change: [−1.3404% +3.5886% +9.2331%] (p = 0.23 > 0.05)

No change in performance detected.

latency/size_4096 time: [118.24 ns 129.58 ns 138.32 ns]

change: [−13.630% −8.3261% −2.5195%] (p = 0.02 < 0.05)

Performance has improved.

throughput/size_512 time: [1.8775 ms 1.8978 ms 1.9248 ms]

thrpt: [51.953 Melem/s 52.693 Melem/s 53.264 Melem/s]

thrpt: [792.74 MiB/s 804.03 MiB/s 812.74 MiB/s]

change:

time: [−29.164% −17.840% −5.7520%] (p = 0.02 < 0.05)

thrpt: [+6.1030% +21.714% +41.170%]

Performance has improved.

throughput/size_4096 time: [1.7486 ms 1.7744 ms 1.8362 ms]

thrpt: [54.460 Melem/s 56.358 Melem/s 57.189 Melem/s]

thrpt: [830.99 MiB/s 859.95 MiB/s 872.63 MiB/s]

change:

time: [−13.679% −8.3516% −2.0571%] (p = 0.02 < 0.05)

thrpt: [+2.1003% +9.1127% +15.847%]

Performance has improved.

Using the spin_loop hint is usually bad for latency, but in this case it is clear that the spin loop was actually blocking the data from coming in by starving the sender and thus we even got a slight improvement (or no change) in latency.

Code at this point.

Relaxing the Ordering

When Sender::send first loads the current tail, it is using Ordering::Acquire . This would've been necessary if other threads were also updating the tail atomic, but in this case the sender thread is the only one writing to tail. Also, because of the data dependency, the CPU cannot reorder the store ahead of this load (the current tail is used to calculate the next tail). So we can get by with Ordering::Relaxed for this load of tail in the sender. A similar argument applies to the first load of head in Receiver::recv.
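As a sketch (again on the simplified version, not the exact crate code), only the ordering of the sender's load of its own tail changes:

impl<T> Sender<T> {
    fn send(&mut self, item: T) {
        loop {
            // `head` is written by the receiver, so we still need Acquire.
            let head = self.inner.head.load(Ordering::Acquire);
            // Only this thread ever stores to `tail`, so Relaxed is enough here.
            let tail = self.inner.tail.load(Ordering::Relaxed);
            let next_tail = (tail + 1) % self.inner.capacity;
            if next_tail != head {
                unsafe {
                    self.inner.buffer.add(tail).write(std::mem::MaybeUninit::new(item));
                }
                self.inner.tail.store(next_tail, Ordering::Release);
                break;
            }
            std::hint::spin_loop();
        }
    }
}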

latency/size_512 time: [117.64 ns 124.73 ns 133.65 ns]

change: [−15.854% −10.442% −4.5425%] (p = 0.00 < 0.05)

Performance has improved.

latency/size_4096 time: [111.22 ns 114.58 ns 117.24 ns]

change: [−13.109% −7.6552% −2.1833%] (p = 0.02 < 0.05)

Performance has improved.

throughput/size_512 time: [723.39 µs 740.06 µs 751.51 µs]

thrpt: [133.07 Melem/s 135.12 Melem/s 138.24 Melem/s]

thrpt: [1.9828 GiB/s 2.0135 GiB/s 2.0599 GiB/s]

change:

time: [−61.733% −60.886% −60.138%] (p = 0.00 < 0.05)

thrpt: [+150.86% +155.66% +161.32%]

Performance has improved.

throughput/size_4096 time: [865.55 µs 883.62 µs 914.36 µs]

thrpt: [109.37 Melem/s 113.17 Melem/s 115.53 Melem/s]

thrpt: [1.6297 GiB/s 1.6864 GiB/s 1.7216 GiB/s]

change:

time: [−53.307% −50.817% −48.491%] (p = 0.00 < 0.05)

thrpt: [+94.141% +103.32% +114.16%]

Performance has improved.

Code at this point.

Removing False Sharing

Profiling the binary via the Instruments tool on MacOS (with its shitty UI and no support for exports like perf on Linux), we see that we are getting a lot of cache misses:

Sample L1 Cache Misses (Load) L1 Cache Misses (Store) L2 Cache Misses Function
6 (100.0%) 85,777 (100.0%) 19,410 (100.0%) 1,350 (100.0%) test
1 (16.7%) 49,060 (57.2%) 11,772 (60.6%) 805 (59.6%) core::sync::atomic::atomic_load
1 (16.7%) 30,107 (35.1%) 4,170 (21.5%) 58 (4.3%) core::ptr::non_null::NonNull::as_ref

If we can avoid these cache misses we can probably improve the performance a bit more. Say, if we can somehow always keep the tail in the cache of the CPU which is running Sender::send , then we eliminate the misses on stores to tail , and the only loads that will miss will be from Receiver::recv thread. Same for when keeping head in Receiver::recv ’s thread. We can (almost) achieve this by adding padding after each atomic, enough to completely fill out the cache line.

A cache line on Apple Silicon Macs is 128 bytes:

sysctl -a hw machdep.cpu | rg cachelinesize

> hw.cachelinesize: 128

If you are on linux you can use

lscpu -C

> NAME ONE-SIZE ALL-SIZE WAYS TYPE LEVEL SETS PHY-LINE COHERENCY-SIZE

> L1d 32K 512K 8 Data 1 64 1 64

> L1i 32K 512K 8 Instruction 1 64 1 64

> L2 1M 16M 8 Unified 2 2048 1 64

> L3 96M 128M 16 Unified 3 98304 1 64

and see under COHERENCY-SIZE .

What is the actual size of a cache line on Apple Silicon? While getting the cache line size via software gives us 128 bytes, I think physically it is actually 64 bytes. In this benchmark, even if you change the alignment to 64, the performance remains exactly the same. You can read more details about this in this paper 's Section 3 (or read the rest of the paper too, just for fun). Though the Apple folks probably had a good reason to report 128 instead, and using the larger alignment doesn't really affect us anyway, except for maybe the size of structs.

So we want to pad after head and tail just enough so that the atomics live on separate cache lines. C++ has a convenient static constexpr std::hardware_destructive_interference_size , but I couldn’t find an equivalent in Rust so I am just hardcoding 128 bytes to be safe.

#[repr(align(128))]
struct Padding<T> {
    value: T,
}

pub(crate) struct Queue<T> {
    head: Padding<AtomicUsize>,
    tail: Padding<AtomicUsize>,
    capacity: usize,
    buffer: *mut MaybeUninit<T>,
}

latency/size_512 time: [88.922 ns 90.830 ns 94.060 ns]

change: [−31.258% −26.918% −22.850%] (p = 0.00 < 0.05)

Performance has improved.

latency/size_4096 time: [80.777 ns 81.812 ns 83.805 ns]

change: [−27.990% −24.698% −21.139%] (p = 0.00 < 0.05)

Performance has improved.

throughput/size_512 time: [258.84 µs 261.57 µs 265.48 µs]

thrpt: [376.68 Melem/s 382.30 Melem/s 386.34 Melem/s]

thrpt: [2.8065 GiB/s 2.8484 GiB/s 2.8784 GiB/s]

change:

time: [−65.388% −64.812% −63.999%] (p = 0.00 < 0.05)

thrpt: [+177.77% +184.19% +188.91%]

Performance has improved.

throughput/size_4096 time: [249.95 µs 251.46 µs 254.87 µs]

thrpt: [392.35 Melem/s 397.67 Melem/s 400.07 Melem/s]

thrpt: [2.9233 GiB/s 2.9629 GiB/s 2.9808 GiB/s]

change:

time: [−73.051% −72.075% −71.074%] (p = 0.00 < 0.05)

thrpt: [+245.71% +258.10% +271.08%]

Performance has improved.

Code at this point.

Now this is the point at which the queue is probably good enough for most applications. If you want a wait-free one you can just allow send/recv to fail after one refresh of the shadow variable, instead of looping while the queue is full/empty (see the sketch after this paragraph). You could make it more power efficient at the cost of some latency by using std::thread::yield_now() . Even though these benchmarks only test how fast you can send usize s, you can also just use something like arena allocators to keep the latency low. Even the ProducerConsumerQueue in Facebook's folly stops here, and if it's good enough for them then it is probably good enough for most people.
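For what it's worth, a non-blocking variant could look roughly like this (a sketch; try_send is my name for it, not the crate's API, and it assumes the sender keeps a cached copy of the head as in the shadow-variable version):

impl<T> Sender<T> {
    /// Try to enqueue without blocking; give the item back if the queue is full.
    fn try_send(&mut self, item: T) -> Result<(), T> {
        let tail = self.inner.tail.load(Ordering::Relaxed);
        let next_tail = (tail + 1) % self.inner.capacity;
        if next_tail == self.head {
            // Our cached view says "full": refresh it once from the real head.
            self.head = self.inner.head.load(Ordering::Acquire);
            if next_tail == self.head {
                return Err(item); // Still full: fail instead of spinning.
            }
        }
        unsafe {
            self.inner.buffer.add(tail).write(std::mem::MaybeUninit::new(item));
        }
        self.inner.tail.store(next_tail, Ordering::Release);
        Ok(())
    }
}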

I suspect this latency is as low as you can get. According to core-to-core-latency :

Num cores: 8

Num iterations per samples: 1000

Num samples: 300

1) CAS latency on a single shared cache line

0 1 2 3 4 5 6 7

0

1 53±1

2 43±0 36±0

3 35±0 35±0 35±0

4 35±0 36±0 34±0 34±0

5 35±0 34±0 34±0 34±0 35±0

6 34±0 34±0 34±0 34±0 34±0 35±0

7 35±0 35±0 35±0 34±0 34±0 34±0 34±0

Min latency: 33.6ns ±0.1 cores: (7,5)

Max latency: 53.3ns ±1.2 cores: (1,0)

Mean latency: 35.5ns

About the core-to-core-latency app: if we look inside , the read_write benchmark is basically another SPSC, and it runs almost the same round-trip latency benchmark that we do, except it divides by 2 to get one-way latency. So I'm not super sure this really provides any more info than what we already know, except maybe confirming that we are on the right track with the benchmarks.

The actual minimum latency would be roughly twice the core-to-core latency, because in the best case a thread does two operations which need to be synced between the cores: first is the actual fetching of the data at the head/tail position and second is the final atomic store to head/tail itself. With a core-to-core latency of about 34ns, that puts the floor at roughly 68ns.

As you can see, we are already quite close to minimum latency. If we really, really care about latency we could remove the hint::spin_loop() :

latency/size_512 time: [66.599 ns 68.148 ns 70.187 ns]

change: [−36.215% −27.721% −19.234%] (p = 0.00 < 0.05)

Performance has improved.

latency/size_4096 time: [65.401 ns 68.008 ns 70.571 ns]

change: [−41.446% −38.184% −34.219%] (p = 0.00 < 0.05)

Performance has improved.

I wouldn’t recommend that for real applications as this artificially improves benchmarks and blocks the CPU from doing other useful work. It also has no effect on throughput anyway. This is about as low as we can get in latency.

But we are still an order of magnitude away from saturating the throughput. Apple claims memory bandwidth of 100GB/s (though that is just the bandwidth, this does not mean that a pair of producer-consumer threads can actually saturate it). If I paid for 100GB/s, I will use 100GB/s (I’ll try).

Batching

If we look at the flamegraph for the version after fixing false sharing, the biggest time sink is loading/storing atomics. If we can reduce the number of atomic ops per message somehow, we may be able to get even higher throughput. What if instead of sending one item and then "commit"-ing it by updating the tail immediately, we instead write multiple items into the buffer before updating the tail? This way we amortize the cost of loading/storing atomics over multiple items. We can do this by separating the step where we "reserve" spots in the buffer (possibly multiple spots at once) from the step where we commit the spots we were actually able to fill. We will take a hit on latency though, which is fine: technically latency is still the same if you commit one element at a time, and most applications send data in batches anyway.

impl<T> Sender<T> {
    pub fn write_buffer(&mut self) -> &mut [MaybeUninit<T>] {
        let tail_ref = unsafe { &*(&raw const (*self.inner.as_ptr()).tail.value) };
        let tail = tail_ref.load(Ordering::Relaxed);
        let next_tail = if tail + 1 == self.capacity { 0 } else { tail + 1 };
        if next_tail == self.cached_head {
            let head_ref = unsafe { &*(&raw const (*self.inner.as_ptr()).head.value) };
            self.cached_head = head_ref.load(Ordering::Acquire);
        }
        let end = if self.cached_head > tail {
            self.cached_head - 1
        } else if self.cached_head == 0 {
            self.capacity - 1
        } else {
            self.capacity
        };
        unsafe {
            let ptr = self.buffer.add(tail).cast();
            std::slice::from_raw_parts_mut(ptr.as_ptr(), end - tail)
        }
    }

    pub unsafe fn commit(&mut self, len: usize) {
        let tail_ref = unsafe { &*(&raw const (*self.inner.as_ptr()).tail.value) };
        let tail = tail_ref.load(Ordering::Relaxed);
        let mut new_tail = tail + len;
        if new_tail >= self.capacity {
            new_tail -= self.capacity;
        }
        tail_ref.store(new_tail, Ordering::Release);
    }
}

Similarly you can implement read_buffer / advance on Receiver<T> too; a rough sketch is shown below.
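Here is my approximation of that consumer side, mirroring write_buffer / commit (the field names cached_tail , capacity and buffer are assumptions based on the sender above, not the crate's exact code):

impl<T> Receiver<T> {
    /// Expose the contiguous run of readable items starting at `head`.
    pub fn read_buffer(&mut self) -> &[MaybeUninit<T>] {
        let head_ref = unsafe { &*(&raw const (*self.inner.as_ptr()).head.value) };
        let head = head_ref.load(Ordering::Relaxed);
        if head == self.cached_tail {
            // Looks empty: refresh our view of the producer's tail.
            let tail_ref = unsafe { &*(&raw const (*self.inner.as_ptr()).tail.value) };
            self.cached_tail = tail_ref.load(Ordering::Acquire);
        }
        // Readable region is [head, tail), but only up to the end of the ring.
        let end = if self.cached_tail >= head { self.cached_tail } else { self.capacity };
        unsafe {
            let ptr = self.buffer.add(head);
            std::slice::from_raw_parts(ptr.as_ptr(), end - head)
        }
    }

    /// Mark `len` items as consumed.
    pub unsafe fn advance(&mut self, len: usize) {
        let head_ref = unsafe { &*(&raw const (*self.inner.as_ptr()).head.value) };
        let head = head_ref.load(Ordering::Relaxed);
        let mut new_head = head + len;
        if new_head >= self.capacity {
            new_head -= self.capacity;
        }
        head_ref.store(new_head, Ordering::Release);
    }
}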

batching/size_512 time: [1.2599 ms 1.2869 ms 1.3238 ms]

thrpt: [755.39 Melem/s 777.05 Melem/s 793.69 Melem/s]

thrpt: [5.6281 GiB/s 5.7895 GiB/s 5.9134 GiB/s]

batching/size_4096 time: [1.5422 ms 1.5532 ms 1.5650 ms]

thrpt: [638.99 Melem/s 643.83 Melem/s 648.41 Melem/s]

thrpt: [4.7609 GiB/s 4.7969 GiB/s 4.8310 GiB/s]

batching/size_65536 time: [1.7904 ms 1.8044 ms 1.8121 ms]

thrpt: [551.84 Melem/s 554.20 Melem/s 558.54 Melem/s]

thrpt: [4.1115 GiB/s 4.1291 GiB/s 4.1614 GiB/s]

Not quite the 40GiB/s! But if we look closely at the benchmarks, we are writing one element at a time. Which is not the fastest. Using std::ptr::copy_nonoverlapping (which uses memcpy underneath) would be much better.
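For example, the sending half of the benchmark can reserve a slice, memcpy a whole batch into it and commit it in one go (a sketch; send_slice is my name, not part of the crate's API):

impl Sender<u64> {
    /// Copy as much of `data` as currently fits and return how many items were written.
    fn send_slice(&mut self, data: &[u64]) -> usize {
        let buf = self.write_buffer();
        let n = buf.len().min(data.len());
        unsafe {
            // One memcpy for the whole batch instead of one store per element.
            std::ptr::copy_nonoverlapping(data.as_ptr(), buf.as_mut_ptr().cast::<u64>(), n);
            self.commit(n);
        }
        n
    }
}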

batching/size_512 time: [250.29 µs 252.37 µs 255.05 µs]

thrpt: [3.9208 Gelem/s 3.9624 Gelem/s 3.9954 Gelem/s]

thrpt: [29.212 GiB/s 29.522 GiB/s 29.768 GiB/s]

batching/size_4096 time: [219.35 µs 223.51 µs 228.63 µs]

thrpt: [4.3739 Gelem/s 4.4740 Gelem/s 4.5588 Gelem/s]

thrpt: [32.588 GiB/s 33.334 GiB/s 33.966 GiB/s]

batching/size_65536 time: [177.43 µs 178.58 µs 179.60 µs]

thrpt: [5.5679 Gelem/s 5.5997 Gelem/s 5.6359 Gelem/s]

thrpt: [41.484 GiB/s 41.721 GiB/s 41.991 GiB/s]

Final code .

There we go! Note that you can probably go above 60GiB/s by tuning the size of items and queues (though sizes larger than 16 bytes will have a larger latency). You can run the benchmarks on the official crate (not this example one), and some of them will achieve around 60GiB/s on an M3 chip.

More shadowing of atomics for performance

When comparing my implementation with other crates, I came across rtrb which also has an SPSC. When running my benchmarks on it, of course mine was faster (slightly). And when running their benchmarks on my implementation, theirs was faster, by a lot :’( Oh the joys of benchmarking.

Anyway after some digging I found that their implementation also “shadows” the head in receiver and tail in sender, but only for local loads. This makes sense, because if sender is the only one writing to tail, why should it have to load it every time? The version it has locally is always correct and latest anyway. We just need to update the local one when we are updating the atomic one too. After implementing this, mine was just as fast (within reasonable margin) on their benchmarks too.
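In sketch form (mirroring the idea rather than quoting rtrb's actual code), the sender keeps a local copy of its own tail as well, so the hot path never loads the tail atomic at all (the cached_tail / cached_head fields are assumptions):

impl<T> Sender<T> {
    fn send(&mut self, item: T) {
        loop {
            // We are the only writer of `tail`, so our local copy is always current.
            let tail = self.cached_tail;
            let next_tail = (tail + 1) % self.inner.capacity;
            if next_tail != self.cached_head {
                unsafe {
                    self.inner.buffer.add(tail).write(std::mem::MaybeUninit::new(item));
                }
                // Publish to the receiver and update the local copy together.
                self.inner.tail.store(next_tail, Ordering::Release);
                self.cached_tail = next_tail;
                break;
            }
            // Our cached head says "full": refresh it from the atomic and retry.
            self.cached_head = self.inner.head.load(Ordering::Acquire);
            std::hint::spin_loop();
        }
    }
}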

My benchmarks were heavily biased for contention-heavy queue usage - when queue is often empty or full. But for low contention queue usage (for example when queue size is huge, and/or queue is almost never completely empty/full) it is better to also avoid the first load (even if it is Relaxed ) entirely and shadow the other head/tail too.

Also if you survey for SPSC queues in general, almost every one of them does something similar to this anyway.

ISB vs YIELD on ARM

On ARM, std::hint::spin_loop() compiles to an ISB instruction, whereas std::thread::yield_now() actually does a syscall (which is why it is slower). The former is better for latency while the latter is better for power efficiency.

Intel cldemote

Intel has this special instruction called cldemote (Cache Line Demote) which can probably give fantastic performance for this case. It flushes the cache line from the core’s private L1/L2 cache to the shared L3 cache directly. If I test it out, I will update this section.

Pointer Projections and Aliasing

I used raw pointer offsets (the &raw const ) instead of simply creating a reference to the struct to access fields. It is undefined behaviour in Rust to create multiple mutable references to the same object at the same time in different threads. This is because Rust marks &mut T as noalias in the LLVM IR which means LLVM is free to optimise things as it pleases which can be problematic, for example:

  • it might read a value once and keep it in a CPU register, never checking main memory again, assuming the value cannot change
  • it might re-order the writes and wait until the very end of the function to perform them, and possibly do other things we do not want.

Creating multiple immutable references to the same object is fine though, as long as the underlying object is Sync . AtomicUsize is Sync , so creating references to it via pointer projection is fine.

discuss this on Hacker News , Twitter , Bluesky or Reddit .

EDIT: thanks to u/matthieum and u/The_8472 on Reddit for pointing out mistakes with this post, which have now been fixed.

Factor 0.101 now available

Hacker News
re.factorcode.org
2025-12-10 11:33:31
Comments...
Original Article

“Keep thy airspeed up, lest the earth come from below and smite thee.” - William Kershner

I’m very pleased to announce the release of Factor 0.101!

OS/CPU Windows Mac OS Linux
x86 0.101 0.101
x86-64 0.101 0.101 0.101

Source code : 0.101

This release is brought to you with almost 700 commits by the following individuals:

Aleksander Sabak, Andy Kluger, Cat Stevens, Dmitry Matveyev, Doug Coleman, Giftpflanze, John Benediktsson, Jon Harper, Jonas Bernouli, Leo Mehraban, Mike Stevenson, Nicholas Chandoke, Niklas Larsson, Rebecca Kelly, Samuel Tardieu, Stefan Schmiedl, @Bruno-366 , @bobisageek , @coltsingleactionarmyocelot , @inivekin , @knottio , @timor

Besides some bug fixes and library improvements, I want to highlight the following changes:

  • Moved the UI to render buttons and scrollbars rather than using images, which allows easier theming.
  • Fixed HiDPI scaling on Linux and Windows, although it currently doesn’t update the window settings when switching between screens with different scaling factors.
  • Update to Unicode 17.0.0.
  • Plugin support for the Neovim editor .

Some possible backwards compatibility issues:

  • The argument order to ltake was swapped to be more consistent with words like head .
  • The environment vocabulary on Windows now supports disambiguating f and "" (empty) values
  • The misc/atom folder was removed in favor of the factor/atom-language-factor repo.
  • The misc/Factor.tmbundle folder was removed in favor of the factor/factor.tmbundle repo.
  • The misc/vim folder was removed in favor of the factor/factor.vim repo.
  • The http vocabulary request tuple had a slot rename from post-data to data .
  • The furnace.asides vocabulary had a slot rename from post-data to data , and might require running ALTER TABLE asides RENAME COLUMN "post-data" TO data; .
  • The html.streams vocabulary was renamed to io.streams.html
  • The pdf.streams vocabulary was renamed to io.streams.pdf

What is Factor

Factor is a concatenative , stack-based programming language with high-level features including dynamic types, extensible syntax, macros, and garbage collection. On a practical side, Factor has a full-featured library , supports many different platforms, and has been extensively documented.

The implementation is fully compiled for performance, while still supporting interactive development . Factor applications are portable between all common platforms. Factor can deploy stand-alone applications on all platforms. Full source code for the Factor project is available under a BSD license.

New libraries:

Improved libraries:

Removed libraries

  • ui.theme.images

VM Improvements:

  • More work on ARM64 backend (fix set-callstack, fix generic dispatch)

Memory leak regression testing with V8/Node.js

Lobsters
joyeecheung.github.io
2025-12-10 11:29:01
Comments...
Original Article

Like many other relatively big pieces of software, Node.js is no stranger to memory leaks, and with them, fixes and regression tests. Testing against memory leak regressions, however, can be particularly tricky in a runtime with a garbage-collected heap, and quite a few of these tests became a source of flakes in the Node.js CI. In the past few months, I've been doing some work to improve the reliability of these tests. I've also come across a few bug reports in the Node.js issue tracker about memory leaks that turned out to be false alarms because the reproductions made incorrect assumptions. Here are my notes about the testing strategies Node.js uses against memory leak regressions, my observations about them, and why I added a new testing strategy with a new V8 API. Hopefully this can help readers write less unreliable memory regression tests and memory leak reproductions.

First, let’s look at one of the earliest strategies used by Node.js to test against memory leaks, which is based on memory usage measurements. This probably is a result of receiving bug reports from users who found out about the leaks via memory usage monitoring in production. Naturally, their reproductions involved memory measurement, and this then went into the test suites.

Measuring heap usage + gc()

This strategy is based on the assumption that gc() (a global function exposed by the --expose-gc V8 flag) should be able to reclaim memory used by objects that are already unreachable. If the tested operation leaks, the memory would not go down after gc() , which indicates a leak.

Take this test (which flaked and was revised to use another strategy we’ll talk about later) for example:

const { ok } = require('assert');
const { subscribe, unsubscribe } = require('diagnostics_channel');

function noop() {}

const heapUsedBefore = process.memoryUsage().heapUsed;

for (let i = 0; i < 1000; i++) {
subscribe(String(i), noop);
unsubscribe(String(i), noop);
}

global.gc();

const heapUsedAfter = process.memoryUsage().heapUsed;

ok(heapUsedBefore >= heapUsedAfter);

The testing procedure is basically:

  1. Measure the memory usage of the heap before allocation starts. In this case the value of heapUsed comes from v8::HeapStatistics::used_heap_size() and the statistics come from v8::Isolate::GetHeapStatistics() .
  2. Do the operation that could be leaking (and to avoid false negatives, allocate multiple times to use a significant amount of memory)
  3. Run gc() and then measure memory usage of the heap again
  4. If the memory usage does not go down, it leaks, otherwise there is no leak.

There are several issues that can make the test unreliable, one of which is assuming that gc() would reclaim enough unreachable memory immediately after it returns. But that’s not actually how gc() works. The GC tasks that bring the actual memory usage down could be delayed until the thread is idle, i.e. not executing JavaScript (or, one could say it’s asynchronous, conceptually).

gc() multiple times asynchronously

To deal with the delayed effect of gc() , Node.js core’s test suite has a utility which runs the gc() function up to 10 times via setImmediate() until a condition becomes true. setImmediate() is chosen because the callback is run in the next iteration of the event loop. By that time the thread has already finished executing the JavaScript on the stack and has likely processed some GC tasks.

function gcUntil(name, condition) {
return new Promise((resolve, reject) => {
let count = 0;

function gcAndCheck() {
setImmediate(() => {
count++;
global.gc();
if (condition()) {
resolve();
} else if (count < 10) {
gcAndCheck();
} else {
reject(name);
}
});
}

gcAndCheck();
});
}

So in the step 3 mentioned above, instead of doing:

  1. Run gc() and then measure memory usage of the heap again
  2. If the memory usage does not go down, it leaks, otherwise there is no leak.

We do

  1. Run gc() , then measure memory usage again after the current JavaScript execution completes & pending GC tasks are run.
  2. If the usage does not go down, run again, repeat this for up to 10 times. If the memory usage does not go down (enough) within the 10 attempts, it leaks, otherwise there is no leak.

So the test above would’ve been updated to something like this:

const { subscribe, unsubscribe } = require('diagnostics_channel');

function noop() {}

async function main() {
const heapUsedBefore = process.memoryUsage().heapUsed;

for (let i = 0; i < 1000; i++) {
subscribe(String(i), noop);
unsubscribe(String(i), noop);
}

await gcUntil('heap usage should go down', () => {
const heapUsedAfter = process.memoryUsage().heapUsed;
return heapUsedBefore >= heapUsedAfter;
});
}

main();

That was not what this test ended up looking like eventually, however, because it was still not reliable enough for this particular case. The only remaining example in Node.js core’s test suite that uses this pattern looks like below - it’s measuring RSS ( resident set size ) because the leak being tested came from the native side, and it was checking whether the memory overhead can go away by comparing the measurement with a multiplier that looked reasonable from local runs - so this is a pretty sketchy test, but it does the job and has not flaked enough to be updated:


const v8 = require('v8');

const before = process.memoryUsage.rss();

for (let i = 0; i < 1000000; i++) {
v8.serialize('');
}

async function main() {
await gcUntil('RSS should go down', () => {
const after = process.memoryUsage.rss();
return after < before * 10;
});
}

main();

This test has worked reliably in the CI so far, but do note that it still relies on a shaky assumption - that if the native memory can be reclaimed by the OS, process.memoryUsage.rss() should eventually go down. Resident set size is the amount of physical memory allocated to the process. You might assume that, as long as the allocated memory is released, this is going to drop immediately - but that’s not actually the case. It is mostly up to the memory allocator being used to decide when to actually return the memory to the system.

Sometimes, there can be a significant amount of fragmentation, and the system is not under memory pressure anyway, so the memory allocator could decide it’s too expensive to defragment and return the unused memory to the OS, and would rather keep it in case the process needs it again. That happens quite a lot with the latest versions of glibc, for example. When that happens, detecting memory leaks based on whether resident set size goes down can produce false positives too. The same can be said about tests based on heapUsed as well. To address this issue, we can give V8 a bit more memory pressure and encourage it to reclaim more memory.

Small heap + pressure test for OOM failure

This is probably one of the most used strategy in Node.js for memory leak testing, though it is getting increasingly unreliable with V8 updates that contain major GC changes.

(If you are not familiar with the design of the V8 garbage collector and the generation layout, check out this blog post from V8).

The idea is essentially:

  1. Set the maximum heap size to a relatively small value.
    • With the default configurations, the V8 heap used by a minimal Node.js instance is around 3-4 MB.
    • Typically tests that use this strategy limit the size of the old space to 16-20MB (when there are leaks, the leaking objects and the graph they retain usually end up in the old space).
  2. Repeat the tested operation and make sure that the total memory consumption from it is significantly higher than the heap size limit.
    • To make the tests run relatively fast, the heap size set in 1 is usually small so that the test can reach the heap size limit quickly by running fewer operations.
  3. If the test crashes with an Out-Of-Memory failure, it means that the tested operation leaves a reachable graph that V8's GC cannot purge memory from even under pressure, which indicates that there is a leak. Otherwise there is likely no leak.

An example using this strategy roughly looks like this:



for (let i = 0; i < 100; i++) {
const realm = new ShadowRealm();
realm.evaluate('new TextEncoder(); 1;');
}

There is another issue caused by the way V8’s GC works here. Oftentimes, step 2 is done in a tight loop, which is also done in the example above. In V8, the garbage collection of the old generation is designed to kick in when the JS execution thread is idle to avoid hurting the performance of JS execution. It has been observed that allocating memory in a tight loop can leave very little room for the GC to kick in, leading to flaky tests.

Pressure test for OOM failure with room for GC

To give V8’s GC a bit of room to kick in and avoid false positives, another utility is introduced.

const wait = require('timers/promises').setTimeout;


async function runAndBreathe(fn, repeat, waitTime = 20) {
for (let i = 0; i < repeat; i++) {
await fn();
await wait(waitTime);
}
}

The updated test looks like this:


'use strict';

runAndBreathe(() => {
const realm = new ShadowRealm();
realm.evaluate('new TextEncoder(); 1;');
}, 100);

Here we use setTimeout() to give GC sufficient time to kick in. This makes the test run slightly slower, but it is still acceptable and the updated test has been stable enough in the CI.

There is another caveat I’ve observed with this approach: once V8 native coverage collection - specifically, precise coverage collection, which tracks invocation counts - is enabled (e.g. via NODE_V8_COVERAGE ), the feedback vectors in newly compiled code can live longer than usual, since V8 needs them to track the invocation count. If the repeated operation involves compiling new code, the heap size limit chosen in step 1 must be big enough to account for this overhead or the test can still go out of memory even if the tested operation produces a graph that’s ultimately collectable.

Next: finalizer-based testing

As it turns out, testing against memory leaks using memory usage measurements can sometimes be quite tricky. In the next post , I will talk about a different strategy used by Node.js for testing against memory leaks.

Common Lisp, ASDF, and Quicklisp: packaging explained

Hacker News
cdegroot.com
2025-12-10 11:10:58
Comments...
Original Article

If there is one thing that confuses newcomers to Common Lisp, it is the interplay of built-in CL functionality, add-ons like Quicklisp and ASDF , and what all the words mean.

Common Lisp is old, and its inspiration is even older. It was developed when there was zero consensus on how file systems worked, operating systems were more incompatible than you can probably imagine, and that age shows. It pinned down terminology way before other languages got to the same point, and, as it happens so often, the late arrivals decided that they needed different words and these words stuck.

So let’s do a bit of a deep dive and see how all the bits and pieces work and why they are there. All examples are using SBCL and might be SBCL-specific. Check your Lisp’s manual if you use something else. Also, I’m (still) linking to the old LispWorks-provided HyperSpec as I’m not sure that the newer versions are fully done yet.

Common Lisp comes with just the bare essentials to work with files. It has to, as that single specification had to work on microcomputers, mainframes, and all sorts of minicomputers. Even today, with essentially just two branches of the operating system family alive, the differences are big between Unix derivatives with a single hierarchy (and one of them, macOS, by default with a case-insensitive interpretation) and MS-DOS derivatives with drive letters and backslashes but also the option to have network-style paths with double backslashes. So Common Lisp has a somewhat odd system of “namestrings” (plain strings) and “pathnames” (weird strings). It is not super important and the spec has the details; the tl;dr is that sometimes you will see a special reader macro #P"/foo/bar" instead of just "/foo/bar" and the docs will tell you which of these two is acceptable as an argument for what function. I just wanted to get that out of the way first. The HyperSpec has all the details , of course.

Loading code from files.

With files out of the way, next up is LOAD . It loads a file “into the Lisp environment” (which means your running image), but exactly how the file is named and whether it will load a source file or a compiled file is system-dependent. So

(load "foo")

can load foo.lisp or foo.fasl or maybe even foo.obj if a Lisp implementation compiles to C object files. If it is a source file, it’ll evaluate all the forms and do some system-specific thing with them. The end result is that, well, everything in the file will now be ready for you to use. So if we have:

(defun hello ()
  (print "Hello, world!"))

(print "Done loading!")

and we open SBCL:

CL-USER(1): (load "test")

"Done loading"
T
CL-USER(2): (hello)

"Hello, world!"
"Hello, world!"

Nothing too surprising there. In case we want to speed up loading, we can compile the file:

CL-USER(7): (compile-file "test")

; compiling file "/home/cees/tmp/test.lisp" (written 26 NOV 2025 09:03:19 PM):

; wrote /home/cees/tmp/test.fasl
; compilation finished in 0:00:00.004
#P"/home/cees/tmp/test.fasl"
NIL
NIL

and the next time we ask to load "test" , the FASL (“fast load”) file should be loaded. It is purely a time-saver as the FASL file has been pre-parsed into your Lisp’s in-memory format so can be loaded very quickly (bypassing READ with all its bells and whistles). FASL files are implementation dependent and more often than not even version dependent. This is pretty much everything that the standard has to say about getting code into the system, and as you can see, it’s not much.

There is also PROVIDE and REQUIRE , which operate on something that the standard calls modules (and which are kept in a variable called *modules* ) but the standard designates this as deprecated so let’s skip it. Just know it is still lingering there. Don’t use it (not even when packages “helpfully” wrap it).

Packages

That CL-USER in the prompt is the name of the package that you are in. Here is a pretty bad choice of naming, and an endless source of confusion. A package is a namespace, nothing else, and the spec says so much :

A package establishes a mapping from names to symbols.

These days, we associate the concept of “package” probably with more than that. A bundle of software, with files and maybe some metadata, a thing you can download from somewhere, most likely. But in Common Lisp, it’s just a tool to map symbol names (strings in your source code) to symbols (internal addresses in memory). It’s a pretty versatile facility and you should read the docs on DEFPACKAGE . It is quite powerful, as it can :use other packages, it can shadow symbols, and whatnot, but at the end of the day, all that happens is that when you type:

The REPL will use the current package (in *package* ) to translate hello to whatever function is in memory, which should exist in the current package (here COMMON-LISP-USER , commonly aliased to CL-USER ) or in any packages it inherits from (“uses”). You can explicitly tell Lisp to look into another package ( my-package:hello is a different function) and even ignore that package’s explicit list of exported symbols by using a double colon (but don’t make a habit out of prying into other packages, it breaks modularity). There are a ton of details, but what counts is that a Common Lisp package is just an in-memory namespace thing, a bunch of connected lookup tables that help the parser map the strings in the files you load to the correct items inside your running image.

Nothing more, nothing less.

Systems

Common Lisp documentation often talks about systems in a general way like it is an intrinsic part of the language. However, the standard is vague. In the chapter on “System construction” it deals with loading —the little bit of functionality we already discussed—and “features” , which are essentially just flags that are used by the #+ and #- reader macros to make bits of code that is loaded conditional on the presence of features.

That is all the standard has to say about systems. You can load files and you can make compilation of these files conditional on feature flags.

So, where does that leave us?

In a sense, this is all you need. I mean, you can take someone else’s files and LOAD them, and they can be made somewhat portable by using features and saying #+sbcl (this code only to be compiled on SBCL) or #-linux (do not compile this on Linux), and the files can organize themselves by using DEFPACKAGE and friends to separate the code into namespaces so everybody can write code using names like HELLO and not step on each other’s toes.

Still, that Common Lisp “system” thing… it’s a bit vague and maybe there’s a hook there to build something more?

Another System Definition Facility

Some Common Lisp implementations come with a DEFSYSTEM , but that is not portable. There were early (we’re in 1989-ish now) attempts to have a common version, MK:DEFSYSTEM , which still works and is used by some projects. At the turn of the century, another version of DEFSYSTEM was created under the name ASDF , which modernized things and quickly turned into the de facto standard. It can do a lot of things and has extensive docs on its website , but we’ll focus here on the essentials.

So, what is a system? Well, a library? A, err, package ? Well, it should be named a package and if Common Lisp were born a couple of decades later it might have been called a package, but we have already seen that that name has been given to something closer to what we would probably call “module” today. “System” it is, then, and ASDF “defines” them.

Still, the closest analogy of a system is a package or a library: a bunch of Lisp code that together defines some functionality. It’s not a perfect comparison, because a lot of Lisp libraries contain multiple systems: at the very least, it is customary to have your code define separate systems for regular code and for test code, and often more systems are defined for, say, optional or contributed code. In any case, it is not intrinsic to Common Lisp, though, so ASDF strictly adds functionality:

  • It allows you to define a system ( ASDF:DEFSYSTEM ). That’s the core function: you tell it that you have a system with a certain name, and description, and all sorts of metadata; and most importantly, what source files are part of the system.
  • It allows you to define dependencies between systems in your DEFSYSTEM .
  • It allows you to load such systems wholesale. Instead of the individual files, or a developer’s homebrew loading script, you can now work on a higher level and load a system by name.

A system still is not a “real” Common Lisp thing: all that it does with respect to the standard is a bunch of LOAD and COMPILE-FILE calls. It will keep metadata in memory about systems that are loaded, but under the hood, loading code is all it does. It comes with extensive documentation and can do a lot of things like additional compilation steps, manage test runs, etcetera, but if you squint, it just loads code.

An ASDF file, with the extension .asd , is also just a Lisp source. The only special thing about it is that the extension signals to ASDF that it is the file to look for when ASDF is searching for systems, the one that has the system definition in a given constellation of source files and directories.

It is important to realize that a “system” and a package are entirely different things: one is an entity in an add-on tool, the other is intrinsic to Common Lisp’s namespacing. They can have the same name and often enough, they have the same name (your ASDF system “foo” will likely define a package “FOO” and it is helpful if that lives in a Git repository called “foo” which has a file named “foo.asd”) but they are different things living in, well, different namespaces and should not be confused with each other. One is intrinsic, the other an optional (but widely used) add-on and they are fully orthogonal things.

Where does ASDF gets its systems from?

Well, we have a “standard”, albeit a de facto one, to bundle Lisp code and describe how to load it. But if you say “this system here is called FOO and is dependent on BAR”, how does ASDF find BAR? The answer is very simple: it looks in predefined locations on your local disk (and nowhere else!). There are two predefined locations, one older and one currently preferred:

  1. ~/common-lisp , the old one;
  2. ~/.local/share/common-lisp/source , the XDG-compliant currently preferred one. Use this.

That’s all. You can extend that list by a very flexible but somewhat complicated mechanism called “source registries”, extensively documented , but essentially, the process looks like:

  • You refer to a system called foo ;
  • ASDF will look for foo.asd under the configured directories;
  • If found, it will load that file, and the DEFSYSTEM in there will do the rest.

This process recurses, so when system foo depends on system bar then the process will repeat, until everything is loaded or an error occurs. All the systems will be found, defined (in your image/memory), and loaded in the right order (depth-first so that dependencies are loaded, their packages defined and functions and macros and variables ready for use, before dependents are).

So, where does that leave us?

We upgraded from “here are a couple of Lisp files, good luck!” to “here is a library with dependencies”. Good progress. All you need to do now is download the library (as a Zip file or a tarball), unpack it under ~/.local/share/common-lisp/source , and load it with ASDF:LOAD-SYSTEM . Of course, the system may declare dependencies so you may get an error message. Easy enough, hunt for the dependency on the Net, download and unpack that , try again, find the next one.

Not perfect, but, well, progress?

Quicklisp enters the stage

It’s still a bit primitive, though. I mean, when coders were sending each other QIC tapes this may have been sufficient, but then someone went and had to invent the Internet and now we just push data over the information superhighway. We should be able to do better, no? Like “Perl in 1995” better, even?

Just like ASDF is an optional add-on to what Common Lisp provides, Quicklisp is an optional add-on to what ASDF offers. Essentially, it does two things:

  1. It adds a new directory to the places where ASDF can find systems;
  2. It offers some functions to download a system from “wherever”, which includes “the Internet”.

It hooks into ASDF’s dependency resolution so that if more dependencies are needed, Quicklisp will go and fetch them as well.

Tadaa: problem solved! We can just open SBCL and say

(ql:quickload :some-system)

and admire a scrolling list of systems being downloaded, unpacked, loaded, analyzed for dependencies, dependencies being loaded, and so on.

So, where does that leave us?

We have all the functionality, but there’s one final issue: it is “always on”. In a lot of other languages, if you start your REPL (say, in Python or in Ruby or in Elixir) in a certain directory, that carries significance. The language runtime will look for a special project file, probably, and set up search paths so that they work for that project. Common Lisp has no such concept, not even after you load ASDF and Quicklisp. So if you have a directory ~/my-code/my-awesome-lisp-project with a my-awesome-lisp-project.asd in there…. Neither Quicklisp nor ASDF is going to bother about the current directory and magically find your system.

You must play with their rules. Luckily, the rules are simple: go to ~/.local/share/common-lisp/source and drop symlinks in there to your projects so that ASDF can find them. That also means that it doesn’t matter where you start sbcl or Sly or SLIME from, your code will always be found. And when you then load your system with QL:QUICKLOAD , its dependencies will automatically be pulled in ( ASDF:LOAD-SYSTEM will still operate locally. It will, of course, use dependencies that Quicklisp found and downloaded in previous runs).

Final tips

Read the source, Luke

$ git clone https://gitlab.common-lisp.net/asdf/asdf.git
$ git clone https://github.com/quicklisp/quicklisp-client
$ guix shell cloc -- cloc quicklisp-client asdf
     363 text files.
     264 unique files.
     104 files ignored.

github.com/AlDanial/cloc v 2.06  T=0.13 s (1984.8 files/s, 332115.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Lisp                           219           3981           3889          31548
Markdown                         3            305              0           1173
HTML                             1             11             50            767
Bourne Shell                    10             40            120            526
Text                            18            105              0            357
make                             4             88             61            312
CSS                              1             60              8            236
YAML                             3             28             51            202
Perl                             2             22              8            117
DOS Batch                        2             23             13             74
C                                1              0              0              1
-------------------------------------------------------------------------------
SUM:                           264           4663           4200          35313
-------------------------------------------------------------------------------

It’s not that much code, and 7400 lines of that is the UIOP package that ASDF includes. UIOP is a package that is very useful in its own right as it is full of utilities that help you make your code less implementation-dependent, but it won’t teach you much about ASDF. So 25KLOC, tops. Without tests and contrib and whatnot, each package is around 5000 lines of well-written Lisp and worth learning. It’s helped me more than once to understand especially ASDF-VM to just open the code and figure out what exactly is going on.

KISS: Use package-inferred-system and a single source tree.

Put your Lisp code in directories under, I dunno, say ~/Code/CL . Symlink that directory to ~/.local/share/common-lisp/source and ASDF will be able to find all your systems. I’ve done some magic using GUIX Home and Stow and whatnot and had to dig around into how things worked, not recommended. If you have dependencies that are not in Quicklisp (or Ultralisp , which is worth adding), then check them out in a central spot (I use ~/OpenSource ) and symlink it into ~/quicklisp/local-projects . That way, all your dependency management is in one spot, the Quicklisp directory, whether you download them or Quicklisp did the job.

Read about ASDF’s package-inferred-system and use it. It’ll keep you from having to spend much time writing .asd files. As the docs say, ASDF itself uses it and since switching to it, there’s no going back for me. In a nutshell, every file is now expected to be a package and a system, same name, so that bit of confusion goes away. One of my project repos (my main monorepo as of lately, I’m slowly moving all my other code to it) has a very short ASDF definition:

#-asdf3.1 (error "CA.BERKSOFT requires ASDF 3.1 or later.")
(asdf:defsystem "ca.berksoft"
  :class :package-inferred-system)

It also has some necessary REGISTER-SYSTEM-PACKAGES calls to register Coalton packages. Sometimes you have dependencies that don’t work well with this scheme and this is the work-around, a small drawback that is dwarved by the advantages. But essentially, these three lines are it.

With that setup, a library to calculate the color temperature of an RGB color, say, lives in l/gfx/color-temperature.lisp and starts with:

(uiop:define-package :ca.berksoft/l/gfx/color-temperature
  (:use :cl :infix-math :try)
  (:export :temp->rgb))

Note that I use the UIOP version of defpackage . It’s a good habit to use the UIOP versions of functions where possible; it’ll increase portability and more often than not, the UIOP functions clean up confusion or shortcomings of the standard.

And that is all. ASDF, when I instruct it to load the system “ca.berksoft/l/gfx/color-temperature”, will stumble upon the top level .asd file, and then will start interpreting the rest (“l/gfx/color-temperature”) as a relative path under its package-inferred-system functionality. It finds that file, registers it as an ASDF system and loads it, which creates the Common Lisp package. Very simple, very clean. Give it a try.

Questions? Jump on Libera IRC and join the #commonlisp channel, I usually keep a close eye on it. You can also DM me on Mastodon or drop me a mail .

Common Lisp, ASDF, and Quicklisp: packaging explained

Lobsters
cdegroot.com
2025-12-10 11:08:47
Comments...
Original Article

If there is one thing that confuses newcomers to Common Lisp, it is the interplay of built-in CL functionality, add-ons like Quicklisp and ASDF , and what all the words mean.

Common Lisp is old, and its inspiration is even older. It was developed when there was zero consensus on how file systems worked, operating systems were more incompatible than you can probably imagine, and that age shows. It pinned down terminology way before other languages got to the same point, and, as so often happens, the late arrivals decided that they needed different words, and those words stuck.

So let’s do a bit of a deep dive and see how all the bits and pieces work and why they are there. All examples are using SBCL and might be SBCL-specific. Check your Lisp’s manual if you use something else. Also, I’m (still) linking to the old LispWorks-provided HyperSpec as I’m not sure that the newer versions are fully done yet.

Common Lisp comes with just the bare essentials to work with files. It has to, as that single specification had to work on microcomputers, mainframes, and all sorts of minicomputers. Even today, with essentially just two branches of the operating system family alive, the differences are big between Unix derivatives with a single hierarchy (and one of them, macOS, by default with a case-insensitive interpretation) and MS-DOS derivatives with drive letters and backslashes, but also the option to have network-style paths with double backslashes. So Common Lisp has a somewhat odd system of “namestrings” (plain strings) and “pathnames” (structured path objects). It is not super important and the spec has details; the tl;dr is that sometimes you will see a special reader macro #P"/foo/bar" instead of just "/foo/bar", and the docs will tell you which of these two is acceptable as an argument for what function. I just wanted to get that out of the way first. The HyperSpec has all the details, of course.
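
A quick REPL sketch of the distinction, using the example file path that shows up later in this article (return values shown as comments):

(pathname "/home/cees/tmp/test.lisp")         ; a namestring in, a pathname object out
; => #P"/home/cees/tmp/test.lisp"
(namestring #P"/home/cees/tmp/test.lisp")     ; and back to a plain string
; => "/home/cees/tmp/test.lisp"
(pathname-name #P"/home/cees/tmp/test.lisp")  ; pathnames have structure you can query
; => "test"
(pathname-type #P"/home/cees/tmp/test.lisp")
; => "lisp"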

Loading code from files.

With files out of the way, next up is LOAD . It loads a file “into the Lisp environment” (which means your running image), but exactly how the file is named and whether it will load a source file or a compiled file is system-dependent. So

(load "foo")

can load foo.lisp or foo.fasl or maybe even foo.obj if a Lisp implementation compiles to C object files. If it is a source file, it’ll evaluate all the forms and do some system-specific thing with them. The end result is that, well, everything in the file will now be ready for you to use. So if we have:

(defun hello ()
  (print "Hello, world!"))

(print "Done loading!")

and we open SBCL:

CL-USER(1): (load "test")

"Done loading!"
T
CL-USER(2): (hello)

"Hello, world!"
"Hello, world!"

Nothing too surprising there. In case we want to speed up loading, we can compile the file:

CL-USER(7): (compile-file "test")

; compiling file "/home/cees/tmp/test.lisp" (written 26 NOV 2025 09:03:19 PM):

; wrote /home/cees/tmp/test.fasl
; compilation finished in 0:00:00.004
#P"/home/cees/tmp/test.fasl"
NIL
NIL

and the next time we ask to load "test" , the FASL (“fast load”) file should be loaded. It is purely a time-saver as the FASL file has been pre-parsed into your Lisp’s in-memory format, so it can be loaded very quickly (bypassing READ with all its bells and whistles). FASL files are implementation-dependent and more often than not even version-dependent. This is pretty much everything that the standard has to say about getting code into the system, and as you can see, it’s not much.

There is also PROVIDE and REQUIRE , which operate on something that the standard calls modules (and which are kept in a variable called *modules* ) but the standard designates this as deprecated so let’s skip it. Just know it is still lingering there. Don’t use it (not even when packages “helpfully” wrap it).

Packages

That CL-USER in the prompt is the name of the package that you are in. Here we run into a pretty bad naming choice, and an endless source of confusion. A package is a namespace, nothing else, and the spec says as much:

A package establishes a mapping from names to symbols.

These days, we probably associate the concept of “package” with more than that. A bundle of software, with files and maybe some metadata, a thing you can download from somewhere, most likely. But in Common Lisp, it’s just a tool to map symbol names (strings in your source code) to symbols (internal addresses in memory). It’s a pretty versatile facility and you should read the docs on DEFPACKAGE . It is quite powerful, as it can :use other packages, it can shadow symbols, and whatnot, but at the end of the day, all that happens is that when you type:

(hello)

The REPL will use the current package (in *package* ) to translate hello to whatever function is in memory, which should exist in the current package (here COMMON-LISP-USER , commonly aliased to CL-USER ) or in any packages it inherits from (“uses”). You can explicitly tell Lisp to look into another package ( my-package:hello is a different function) and even ignore that package’s explicit list of exported symbols by using a double colon (but don’t make a habit out of prying into other packages, it breaks modularity). There are a ton of details, but what counts is that a Common Lisp package is just an in-memory namespace thing, a bunch of connected lookup tables that help the parser map the strings in the files you load to the correct items inside your running image.

Nothing more, nothing less.
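
To make that concrete, here is a minimal sketch; the package and function names are made up, but the mechanics are standard:

(defpackage :my-package
  (:use :cl)
  (:export :hello))

(in-package :my-package)

(defun hello ()
  (print "Hello from MY-PACKAGE!"))

(in-package :cl-user)

(my-package:hello)   ; the exported symbol is reachable with a single colon
;; a double colon would reach even unexported symbols, but don't make a habit of it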

Systems

Common Lisp documentation often talks about systems in a general way, as if they were an intrinsic part of the language. However, the standard is vague. In the chapter on “System construction” it deals with loading (the little bit of functionality we already discussed) and with “features”, which are essentially just flags used by the #+ and #- reader macros to make the loading of bits of code conditional on the presence of those features.

That is all the standard has to say about systems. You can load files and you can make compilation of these files conditional on feature flags.

So, where does that leave us?

In a sense, this is all you need. I mean, you can take someone else’s files and LOAD them, and they can be made somewhat portable by using features and saying #+sbcl (this code only to be compiled on SBCL) or #-linux (do not compile this on Linux), and the files can organize themselves by using DEFPACKAGE and friends to separate the code into namespaces so everybody can write code using names like HELLO and not step on each other’s toes.
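
A small illustration of feature expressions; the exact contents of *features* depend on your implementation and platform:

*features*
; => (:SBCL :UNIX :LINUX :X86-64 ...) or similar, depending on your Lisp

#+sbcl  (print "Only compiled and evaluated on SBCL")
#-linux (print "Skipped when :LINUX is present in *features*")

;; Feature expressions compose with AND, OR and NOT:
#+(and sbcl (not windows)) (print "SBCL, but not on Windows")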

Still, that Common Lisp “system” thing… it’s a bit vague and maybe there’s a hook there to build something more?

Another System Definition Facility

Some Common Lisp implementations come with a DEFSYSTEM , but that is not portable. There were early (we’re in 1989-ish now) attempts to have a common version, MK:DEFSYSTEM , which still works and is used by some projects. At the turn of the century, another version of DEFSYSTEM was created under the name ASDF , which modernized things and quickly turned into the de facto standard. It can do a lot of things and has extensive docs on its website , but we’ll focus here on the essentials.

So, what is a system? Well, a library? A, err, package ? Well, it should be named a package and if Common Lisp were born a couple of decades later it might have been called a package, but we have already seen that that name has been given to something closer to what we would probably call “module” today. “System” it is, then, and ASDF “defines” them.

Still, the closest analogy of a system is a package or a library: a bunch of Lisp code that together defines some functionality. It’s not a perfect comparison, because a lot of Lisp libraries contain multiple systems: at the very least, it is customary to have your code define separate systems for regular code and for test code, and often more systems are defined for, say, optional or contributed code. In any case, it is not intrinsic to Common Lisp, though, so ASDF strictly adds functionality:

  • It allows you to define a system ( ASDF:DEFSYSTEM ). That’s the core function: you tell it that you have a system with a certain name, and description, and all sorts of metadata; and most importantly, what source files are part of the system.
  • It allows you to define dependencies between systems in your DEFSYSTEM .
  • It allows you to load such systems wholesale. Instead of the individual files, or a developer’s homebrew loading script, you can now work on a higher level and load a system by name.

A system still is not a “real” Common Lisp thing: all that it does with respect to the standard is a bunch of LOAD and COMPILE-FILE calls. It will keep metadata in memory about systems that are loaded, but under the hood, loading code is all it does. It comes with extensive documentation and can do a lot of things like additional compilation steps, manage test runs, etcetera, but if you squint, it just loads code.
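
To give an idea of what a definition looks like, here is a minimal, made-up foo.asd for a system FOO that depends on a system BAR; the real grammar supports many more options:

;; foo.asd
(asdf:defsystem "foo"
  :description "A toy system, names purely for illustration"
  :depends-on ("bar")
  :components ((:file "package")
               (:file "foo" :depends-on ("package"))))

;; and, once ASDF knows where to find it (see below):
;; (asdf:load-system "foo")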

An ASDF file, with the extension .asd , is also just a Lisp source. The only special thing about it is that the extension signals to ASDF that it is the file to look for when ASDF is searching for systems, the one that has the system definition in a given constellation of source files and directories.

It is important to realize that a “system” and a package are entirely different things: one is an entity in an add-on tool, the other is intrinsic to Common Lisp’s namespacing. They can have the same name and often enough, they have the same name (your ASDF system “foo” will likely define a package “FOO” and it is helpful if that lives in a Git repository called “foo” which has a file named “foo.asd”) but they are different things living in, well, different namespaces and should not be confused with each other. One is intrinsic, the other an optional (but widely used) add-on and they are fully orthogonal things.

Where does ASDF get its systems from?

Well, we have a “standard”, albeit a de facto one, to bundle Lisp code and describe how to load it. But if you say “this system here is called FOO and is dependent on BAR”, how does ASDF find BAR? The answer is very simple: it looks in predefined locations on your local disk (and nowhere else!). There are two predefined locations, one older and one currently preferred:

  1. ~/common-lisp , the old one;
  2. ~/.local/share/common-lisp/source , the XDG-compliant currently preferred one. Use this.

That’s all. You can extend that list by a very flexible but somewhat complicated mechanism called “source registries”, extensively documented , but essentially, the process looks like:

  • You refer to a system called foo ;
  • ASDF will look for foo.asd under the configured directories;
  • If found, it will load that file, and the DEFSYSTEM in there will do the rest.

This process recurses, so when system foo depends on system bar then the process will repeat, until everything is loaded or an error occurs. All the systems will be found, defined (in your image/memory), and loaded in the right order (depth-first so that dependencies are loaded, their packages defined and functions and macros and variables ready for use, before dependents are).

So, where does that leave us?

We upgraded from “here are a couple of Lisp files, good luck!” to “here is a library with dependencies”. Good progress. All you need to do now is download the library (as a Zip file or a tarball), unpack it under ~/.local/share/common-lisp/source , and load it with ASDF:LOAD-SYSTEM . Of course, the system may declare dependencies, so you may get an error message. Easy enough: hunt for the dependency on the Net, download and unpack that, try again, find the next one.

Not perfect, but, well, progress?

Quicklisp enters the stage

It’s still a bit primitive, though. I mean, when coders were sending each other QIC tapes this may have been sufficient, but then someone went and had to invent the Internet and now we just push data over the information superhighway. We should be able to do better, no? Like “Perl in 1995” better, even?

Just like ASDF is an optional add-on to what Common Lisp provides, Quicklisp is an optional add-on to what ASDF offers. Essentially, it does two things:

  1. It adds a new directory to the places where ASDF can find systems;
  2. It offers some functions to download a system from “wherever”, which includes “the Internet”.

It hooks into ASDF’s dependency resolution so that if there are more dependencies needed, Quicklisp will go and fetch them as well.

Tadaa: problem solved! We can just open SBCL and say

(ql:quickload "some-system")

and admire a scrolling list of systems being downloaded, unpacked, loaded, analyzed for dependencies, dependencies being loaded, and so on.

So, where does that leave us?

We have all the functionality, but there’s one final issue: it is “always on”. In a lot of other languages, if you start your REPL (say, in Python or in Ruby or in Elixir) in a certain directory, that carries significance. The language runtime will look for a special project file, probably, and set up search paths so that they work for that project. Common Lisp has no such concept, not even after you load ASDF and Quicklisp. So if you have a directory ~/my-code/my-awesome-lisp-project with a my-awesome-lisp-project.asd in there…. Neither Quicklisp nor ASDF is going to bother about the current directory and magically find your system.

You must play by their rules. Luckily, the rules are simple: go to ~/.local/share/common-lisp/source and drop symlinks in there to your projects so that ASDF can find them. That also means that it doesn’t matter where you start sbcl or Sly or SLIME from: your code will always be found. And when you then load your system with QL:QUICKLOAD , its dependencies will automatically be pulled in ( ASDF:LOAD-SYSTEM will still operate locally; it will, of course, use dependencies that Quicklisp found and downloaded in previous runs).
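
For example, with the hypothetical project directory from above:

$ ln -s ~/my-code/my-awesome-lisp-project ~/.local/share/common-lisp/source/

After that, (ql:quickload "my-awesome-lisp-project") should work from any directory, pulling in whatever dependencies the .asd file declares.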

Final tips

Read the source, Luke

$ git clone https://gitlab.common-lisp.net/asdf/asdf.git
$ git clone https://github.com/quicklisp/quicklisp-client
$ guix shell cloc -- cloc quicklisp-client asdf
     363 text files.
     264 unique files.
     104 files ignored.

github.com/AlDanial/cloc v 2.06  T=0.13 s (1984.8 files/s, 332115.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Lisp                           219           3981           3889          31548
Markdown                         3            305              0           1173
HTML                             1             11             50            767
Bourne Shell                    10             40            120            526
Text                            18            105              0            357
make                             4             88             61            312
CSS                              1             60              8            236
YAML                             3             28             51            202
Perl                             2             22              8            117
DOS Batch                        2             23             13             74
C                                1              0              0              1
-------------------------------------------------------------------------------
SUM:                           264           4663           4200          35313
-------------------------------------------------------------------------------

It’s not that much code, and 7400 lines of that is the UIOP package that ASDF includes. UIOP is a package that is very useful in its own right, as it is full of utilities that help you make your code less implementation-dependent, but it won’t teach you much about ASDF. So 25KLOC, tops. Without tests and contrib and whatnot, each package is around 5000 lines of well-written Lisp and worth learning. More than once, just opening the code and figuring out what exactly is going on has helped me understand ASDF in particular.

KISS: Use package-inferred-system and a single source tree.

Put your Lisp code in directories under, I dunno, say ~/Code/CL . Symlink that directory to ~/.local/share/common-lisp/source and ASDF will be able to find all your systems. I’ve done some magic using GUIX Home and Stow and whatnot and had to dig around to see how things worked; not recommended. If you have dependencies that are not in Quicklisp (or Ultralisp , which is worth adding), then check them out in a central spot (I use ~/OpenSource ) and symlink them into ~/quicklisp/local-projects . That way, all your dependency management is in one spot, the Quicklisp directory, whether you downloaded them yourself or Quicklisp did the job.

Read about ASDF’s package-inferred-system and use it. It’ll keep you from having to spend much time writing .asd files. As the docs say, ASDF itself uses it and since switching to it, there’s no going back for me. In a nutshell, every file is now expected to be a package and a system, same name, so that bit of confusion goes away. One of my project repos (my main monorepo as of lately, I’m slowly moving all my other code to it) has a very short ASDF definition:

#-asdf3.1 (error "CA.BERKSOFT requires ASDF 3.1 or later.")
(asdf:defsystem "ca.berksoft"
  :class :package-inferred-system)

It also has some necessary REGISTER-SYSTEM-PACKAGES calls to register Coalton packages. Sometimes you have dependencies that don’t work well with this scheme and this is the work-around, a small drawback that is dwarfed by the advantages. But essentially, these three lines are it.

With that setup, a library to calculate the color temperature of an RGB color, say, lives in l/gfx/color-temperature.lisp and starts with:

(uiop:define-package :ca.berksoft/l/gfx/color-temperature
  (:use :cl :infix-math :try)
  (:export :temp->rgb))

Note that I use the UIOP version of defpackage . It’s a good habit to use the UIOP versions of functions where possible; it’ll increase portability and more often than not, the UIOP functions clean up confusion or shortcomings of the standard.

And that is all. ASDF, when I instruct it to load the system “ca.berksoft/l/gfx/color-temperature”, will stumble upon the top level .asd file, and then will start interpreting the rest (“l/gfx/color-temperature”) as a relative path under its package-inferred-system functionality. It finds that file, registers it as an ASDF system and loads it, which creates the Common Lisp package. Very simple, very clean. Give it a try.
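
In REPL terms, and assuming the symlink setup from earlier, that is just:

CL-USER(1): (asdf:load-system "ca.berksoft/l/gfx/color-temperature")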

Questions? Jump on Libera IRC and join the #commonlisp channel, I usually keep a close eye on it. You can also DM me on Mastodon or drop me a mail .

Big Tech are the new Soviets

Hacker News
unherd.com
2025-12-10 09:15:18
Comments...
Original Article

Big Tech’s so-called Magnificent Seven are on everyone’s lips. The exorbitant stock market valuations of Google, Meta, Apple, Microsoft, Nvidia, Amazon and Tesla provoke an amalgam of awe and fear. Their trillion-dollar investments in AI prompt some to predict the brightest of futures and others to dread humanity’s dumbing down, unemployment, redundancy even. In this overwhelming din, it is easy to miss the larger picture: a new type of capital is killing markets, capitalism’s habitat.

At its very beginning, capitalism was underpinned by faith in competitive markets. In the liberal fantasy, spearheaded by Adam Smith, bakers, brewers and butchers laboured within markets so cut-throat that none could make more money than the bare minimum necessary to keep their small, family-owned businesses running. This in turn provided us with our daily bread, ale and meat.

Then came the second industrial revolution and the conglomerates whose market power would make Smith weep with joy. This was the era of Big Business and the “robber barons”. And so another — neoliberal — fantasy was created, to justify the new big beasts that were now monopolising almost every market that mattered. Joseph Schumpeter, a former Austrian finance minister who made America his home, was the new creed’s most effective advocate. Progress, he argued, is impossible in competitive markets. Growth needs monopolies to fuel it. How else can enough profit be earned to pay for expensive research and development, for new machines, new product lines and all the paraphernalia that helps innovation take root? To monopolise markets, conglomerates need to dazzle us with remarkable new products that kill off the competition, like Henry Ford’s Model-T or Apple’s iPhone. Should we worry about all that concentrated power? No, Schumpeter reassured us. Once they reach their pinnacle, these monopolies get flabby and complacent and, eventually, they’re brought down by some upstart: one example being Toyota’s toppling of General Motors.

More recently, Peter Thiel, Palantir’s co-founder, said something that many thought was a restatement of Schumpeter’s dictum: “Competition is for losers!” While the pioneers of Big Business like Thomas Edison and Henry Ford would have agreed wholeheartedly, what Thiel was implying went beyond their wildest imagination. It went much further even than Schumpeter’s pseudo-Darwinian idea that progress comes through the rise and fall of monopolists in an endless struggle for existence.

What Thiel was saying is that today, winners do not just kill off the competition to monopolise a market. No, they keep going until they kill the market itself and replace it with something quite different: a kind of cloud fief that lacks all of the ingredients of a proper market — indeed that lacks all of the advantages that liberals and neoliberals alike recognise in the machinery of decentralised markets. In fact, today’s winners —the Magnificent Seven, plus Thiel’s own Palantir — are reviving an economic model that all of us thought dead and buried after the fall of the Soviet Union: economic planning systems that match buyers and sellers outside anything that can be usefully described as a market.

Gosplan was the Soviet Union’s State Planning Committee, the engine room of its command economy. Its remit was to match the supply and demand of critical resources (oil, steel, cement) but also consumption goods (food, clothes, appliances), without using market prices. Once buyers and sellers were matched, prices were assigned with a view to achieving political and social objectives (such as to ensure basic affordability, or subsidise certain industries) — not to balance markets.

Gosplan was disbanded immediately after the red flag was lowered over the Kremlin on Boxing Day of 1991, but it is now back. Where? In the algorithms powering Jeff Bezos’s Amazon, Peter Thiel’s Palantir and the rest of Big Tech’s digital platforms that pretend to be, but are not, markets.

Before you protest the audacity of my claim, think of what happens when you visit Amazon. Unlike when you visit a shopping mall, either with friends or mingling with strangers, the moment you follow the link to amazon.com, you exit the marketplace and enter a space of pristine isolation. It’s just you and Jeff Bezos’s algorithm. You type, say, “espresso machines” into the search box and the algorithm matches you with a number of vendors. However, to achieve what it was coded for, the algorithm had started working months, even years, earlier.

Over that period, you will have revealed to it many of your whims and desires through your searches, purchases, clicks and reviews. Using these cues, as well as data from other sources, the algorithm has trained you to train it to know you even better, enabling it to advise you on what books, music and films to buy. It has already won your trust. So, now that you are in a hurry to replace your broken espresso machine, the chances are that you will choose one of the top search results it has given you.

The algorithm knows your spending pattern. It knows how to guide you to the espresso machine with the highest price you are prepared to pay, all in order that Amazon can collect up to 40% of it the moment you click the purchase button. It is an extortionate cut, but the espresso machine’s makers tolerate it, because they know that if they don’t, their company will never appear in the top search results of anyone prepared to pay for their product. As AI improves, this power to manipulate your behaviour increases — and this is why Big Tech’s valuations are going through the roof.

This is nothing less than a capitalist, privately-owned, super high-tech reincarnation of the USSR’s Gosplan. Amazon’s software matches you with particular vendors and bans you from talking to any seller or even from observing what other buyers are doing — unless of course it calculates that it serves its own purposes to let you see a small selection of them. As for the price you pay, this follows (rather than precipitates) your being matched with a seller. Rather than being the variable that equilibrates demand with supply, prices in Amazon fulfil another role: that of maximising Jeff Bezos’s cloud rents.

“Had the Soviet leaders lived to witness the workings of Silicon Valley’s Big Tech, they would be kicking themselves.”

In this sense, prices in Amazon and other Big Tech platforms function in a manner far closer to Gosplan than to any farmers’ market, money market or shopping mall you have ever experienced. In fact, had the Soviet leaders lived to witness the workings of Silicon Valley’s Big Tech, they would be kicking themselves, lamenting that it was American capitalists who perfected their Gosplan model, complete with a surveillance system that would make their KGB henchmen green with envy.

Gosplan failed to turn into a success story as it lacked Big Tech’s greatest weapon: cloud capital, that is, the algorithms, data centres and optic fibre cables working as an integrated network to train you to train it. As you impart your data, cloud capital learns how to input desires into your mind and then satiate these desires by selling you stuff within its privately owned version of Gosplan.

But, is there really a difference — I hear many of you ask loudly — between Thomas Edison and Jeff Bezos? Are they not cut from the same cloth of megalomaniac monopolists seeking to dominate markets and our imagination? Yes, despite their similarities, there is a difference — and it is gigantic. Edison’s and Ford’s capital was productive. It produced cars, electricity, turbines. Bezos’s cloud capital produces nothing, except the enormous power to encase us in his cloud fief where traditional capitalist producers are squeezed for cloud rents and we, the users, provide our free labour. With every click, like and review, we enhance the power of cloud capital.

Once upon a time, an old Trotskyite told me that the Soviet Union, in the name of socialism, had created a form of industrial feudalism. Independent of whether he was right or not, his comment is pertinent today in relation to Big Tech. Come to think of it, while the trading process on platforms like Amazon is reminiscent of the USSR’s Gosplan mechanism, it is also the case that the enormous sums that Amazon, Uber, Airbnb etc, charge the actual producers of the goods and services peddled on their sites are akin to the ground rents that the landed gentry used to charge their vassals — except that, here, they are cloud rents that accrue to the owners of cloud capital. So, just as the Soviet Union generated one kind of feudalism in the name of socialism and human emancipation, today, Silicon Valley is generating another kind of feudalism — technofeudalism , I have called it — in the name of capitalism and free markets.

The parallel extends to the state. The USSR was meant to be a workers’ paradise in contrast to the USA whose raison d’être was to be a haven for capitalist producers. It turns out that both promises were false. As Big Tech’s cloud capital accumulates and concentrates into fewer and fewer hands, states are becoming dependent on corporate techlords. By outsourcing core functions — archives, health data, even military software — to rented cloud infrastructure, governments lease back their own operational capacity from Amazon Web Services, Microsoft, and Google. This dependency enables a new dimension of technofeudal power.

From this perspective, just as the Soviet Union was a feudal-like industrial society pretending to be a workers’ state, the United States today is performing a splendid impersonation of a technofeudal state, with repercussions that extend to every realm of state activity, including health services, education, the tax office, our borders and faraway battlefields.

In Ukraine and Gaza, and along our militarised borders, cloud capital is trained to extend its reach. Amazon’s AI tool Rekognition is used by law enforcement, including ICE, while Palantir’s vast surveillance software runs on Amazon’s cloud. Through Project Nimbus, Amazon and Google provide the Israeli military with advanced cloud and AI capabilities, reportedly enabling rapid, AI-driven targeting in Gaza with minimal human oversight.

Let us briefly return to the comparison with the early 20th century’s original monopolist capitalists. Whether we admire or abhor the Magnificent Seven’s stock market valuations, it is helpful to keep this in mind: the old capitalist giants, the “robber barons”, actually produced things . The new technofeudal lords produce a new social order. They have replaced the invisible hand of the market with the visible, algorithmic fist of the cloudalist.

Free-market enthusiasts have nothing to celebrate and much to regret. But it will take a brave soul amongst them to stare reality in the face. Just like pro-Soviet Marxists remained in denial that the Soviet experiment had failed for many years after 1991, so  free-market ideologues refuse to see that capitalism begat a form of capital — cloud capital — that replaced markets with something out of the Soviet past. In the process, it has killed capitalism.


New tag suggestion: "genai-assisted"

Lobsters
lobste.rs
2025-12-10 08:53:54
I believe we should implement a new tag for submissions that are wholly or largely composed with the help of generative AI, LLMs, etc. The purpose of this tag is for people who do not wish to consume such content to be able to filter it out. The tag should ideally be set by the submitter, but if the...
Original Article

I believe we should implement a new tag for submissions that are wholly or largely composed with the help of generative AI, LLMs, etc.

The purpose of this tag is for people who do not wish to consume such content to be able to filter it out.

The tag should ideally be set by the submitter, but if they are submitting the piece in ignorance of the fact that it was created with the help of GenAI, it can be suggested by users.

Recent examples of submissions (direct link to comments pointing this out):

Shouldn't this be a flag, instead of a tag?

No, a flag should be reserved for content that is off-topic. GenAI-generated content can be on-topic, but should be marked as such.

The flag's name should be "slop" or some other negative term

No, we're already inundated with discussions about the perceived derogatory tone of the "vibecoding" tag. I believe the tag should sound more descriptive, to encourage submitters and authors to apply it voluntarily.

There should be negative consequences for posting this sort of content.

Other than that more people would filter out the content, and fewer people would click through to the submission, there should be no weighting of submission visibility.

It's really hard to detect this stuff! Isn't that a problem?

Unfortunately it is probably impossible to know if a submission is composed largely with GenAI. Authors, submitters and commenters should make a best effort to accurately determine it. No negative consequences should accrue to submitters who submit the content in good faith.

I use GenAI extensively but I don't like the negative connotations assigned to it. I should not be punished for having my work labelled as such on this site

This is more about labelling than value judgement. Fast food is still food, even if it's not as nutritious as more healthy food. A reasonable consumer protection practice is accurately labelling foodstuffs with their nutritional contents, so as to let the consumer choose with as much information as possible.

Stop Breaking TLS

Lobsters
www.markround.com
2025-12-10 07:04:32
Comments...
Original Article

Rant ahead: I hate TLS “Inspection” software with a burning passion and I wish we collectively as an industry would just knock it the fuck off and stop pretending it’s some great security benefit. Every time I encounter it, in whatever form, it’s a gigantic headache that makes everyone’s life worse off and as far as I am concerned offers next to zero tangible benefits.

For those blissfully unaware, this is a class of “security” software or appliance that is supposed to let organisations monitor all encrypted traffic. It does this by inserting itself in the middle of traffic, stripping the encryption off so it can inspect it and then re-signing it with its own certificate. If that sounds familiar, it’s because it’s a widely known class of attack - the Man In The Middle attack. Great stuff, we’re literally deploying the exact attack vector that TLS was designed to prevent, but slapping a “security” label on it.

Firstly, it undermines one of the most important protocols of the modern Internet as it deliberately breaks all the guarantees that TLS encryption is supposed to offer. If the MITM certificate is installed everywhere, your company can intercept and monitor everything you say and do. Consider the ramifications of that - confidential messages to HR, medical information, insider trading information, your banking sessions - would you feel happy BCC’ing every single email to your IT department? Would you print out your therapy notes and pin them to the kitchen notice board?

But even ignoring the philosophical arguments about privacy and trust, I argue it actively makes your security worse . Consider this - what is the likelihood of every certificate authority on the Internet having their private keys compromised simultaneously? I’d wager that’s somewhere around the statistical equivalent of the Planck length in terms of probability.

On the other hand, what’s the chance of your company’s MITM private key getting compromised by an attacker? Even if you completely trust your IT team and vendor (and if you do, you clearly haven’t been paying attention to any tech news for oh… the last few decades), you have to admit that chance is a lot higher. And depending on the vendor or tech stack involved, it could be a LOT higher. One disgruntled employee, one unpatched vulnerability, one phishing email to the right admin and choo-choo, it’s all aboard the FAIL train. Now an attacker could have the keys to your entire kingdom.

Then there’s the practicalities of it. It’s simply a massive hassle. Different Operating Systems expect certificates in different formats (PEM? DER? PFX? P7B?) installed in different places with different tooling to manage it all. update-ca-certificates vs update-ca-trust is just the tip of the iceberg - and that’s just the OS level. You then have language runtimes (Java keystore anyone?) and the applications themselves that all need to be configured.

And the problem is compounded with modern cloud-native apps. In a Kubernetes cluster, as well as having to handle updating the node VM images and container runtimes, you’ll have dozens if not hundreds of different base images each of which has their own standards. Alpine uses a different certificate path than Ubuntu. Your Node app expects them somewhere else entirely. The various CRDs or Helm charts you are using may or may not support custom CA bundles, and if they do there’s no agreed-on standard.

Now I’m not saying that because a problem is hard we should simply give up, but even if the benefits were worth it the simple fact is even with the best tooling and automation, you are guaranteed to miss something. Whether it’s some obscure tool that has a custom keystore and non-standard tooling, a quick “one off” command in an ephemeral container, some app that uses certificate pinning or an aging switch firmware that doesn’t even support custom certificate bundles, something will slip through the cracks. And when it does, guess what happens?

Which brings me to my biggest peeve: it normalizes bad security practices. Given that you will never have 100% coverage of your CA certificate installation - particularly amongst your technical teams who will be using a multitude of different tools and platforms - you get developers and sysadmins used to TLS errors. Instead of treating each one as an anomaly and something to be investigated, you get used to just running with --insecure or curl -k because you just need to get shit done. Turning off certificate verification becomes a routine troubleshooting step. “Oh, it’s probably just the corporate proxy again” becomes the reflexive response to any TLS error. You’ve just trained your entire technical staff to ignore one of the most important security warnings on the Internet!

And don’t even get me started on the performance and availability implications. All your traffic now has to be decrypted and re-encrypted by your magic box. Hope you sized that appliance correctly! Hope it doesn’t become a single point of failure! Hope it supports all the latest TLS versions and cipher suites!

There are a multitude of ways to protect yourself that are not only less invasive but are often more effective because they’re designed for how modern infrastructure actually works. Anomaly detection, Zero Trust network architecture, EDR, Netflow analysis… You don’t need to create single points of failure, and you can actually work with modern cloud-native infrastructure instead of fighting it. Plus, y’know, there’s this AI thing which as it turns out is actually quite useful at analysing metadata and spotting odd behavioral patterns.

In my experience: TLS Inspection MITM is a gigantic administrative burden, it normalizes bad practice, it creates bottlenecks and availability risks, and actively worsens your security posture.

Just stop it already.

From ‘glacier aesthetic’ to ‘poetcore’: Pinterest predicts the visual trends of 2026 based on its search data

Guardian
www.theguardian.com
2025-12-10 06:58:53
If search interest holds, glitchy glam, cool blue, aliencore and gummy bear aesthetics are among the vibes set to rock the creative world next year Next year, we’ll mostly be indulging in maximalist circus decor, working on our poetcore, hunting for the ethereal or eating cabbage in a bid for “indiv...
Original Article

Next year, we’ll mostly be indulging in maximalist circus decor, working on our poetcore, hunting for the ethereal or eating cabbage in a bid for “individuality and self-preservation”, according to Pinterest .

The organisation’s predictions for Australian trends in 2026 have landed, which – according to the platform used by interior decorators, fashion lovers and creatives of all stripes – includes 1980s, aliens, vampires and “forest magic”.

Close up on blazer with interesting pins over yellow shirt
The bookish and weird is the inspiration for ‘the poet aesthetic’, which jumped in search popularity this year. Photograph: Naomi Rahim/WireImage

Among the Pinterest 2026 trends report’s top 21 themes are “Afrohemian” decor (searches for the term are on the rise by baby boomers and Gen X); “glitchy glam” (asymmetric haircuts and mismatching nails); and “cool blue” (drinks, wedding dresses and makeup with a “glacier aesthetic”).

Pinterest compared English-language search data from September 2024 to August 2025 with those of the year before and claims it has an 88% accuracy rate. More than 9 million Australians use Pinterest each month.

Wednesday’s report found searches for 1980s luxury soared by 225%, “Scotland Highlands aesthetic” by 465% and “the poet aesthetic” by 175%. “Poetcore” – a key trend for Gen Z and millennials - takes its inspiration from the bookish: turtlenecks, fountain pens, satchels and ties.

A stylish man walking over tram tracks
Photograph: Eugenio Marongiu/Getty Images/Image Source
Man in turtleneck and sunglasses in sunshine
Will millennials be going poetcore in 2026? Photograph: Alexander Spatari/Getty Images

Driven by Gen Z and millennials, lace will be in, according to the data, including in doily, bandana and makeup form – as will khaki, field jackets and pleated trousers, aka the “paleontologist aesthetic”.

They’ll also be working an intergalactic “aliencore aesthetic”.

For Gen Z and millennials, travel will be adrenaline-seeking; for baby boomers, it will be to places that are “mystical” and “ethereal”. Searches for “Faroe Island aesthetic” almost doubled.

Aesthetic shot of hands with painted nails reaching for brightly-painted bottles
‘Gummy bears aesthetic’ searches aren’t looking for candy – it’s more about makeup products and nail art. Photograph: Anna Efetova/Getty Images

Pinterest predicts boomer and Gen X food trends to be cruciferous, with kimchi, dumplings and golumpki soup all raising the cabbage’s status. A younger trend for a “gummy bears aesthetic” goes beyond sweets and into makeup products and rubberised nail art.

And “niche perfume collection” is having its moment in the sun, as are “perfume layering combinations”.

Pinterest said there was a theme uniting trends as disparate as masquerades and operas, dragonfly wing-patterned nails and animal-inspired outfits, forecasting a move towards individuality and away from imitation.

Model in silvery space glam rock jumpsuit
The future is now for devotees of aliencore aesthetics. Photograph: Jacob Wackerhausen/Getty Images

Melinda Petrunoff, the managing director for Pinterest ANZ, told Guardian Australia that “people are craving comfort, authenticity and grounded optimism in a world that feels increasingly fast and often noisy”.

“What’s driving this is a desire for individuality and self-preservation – people are moving towards curating rather than copying, choosing to engage with what truly resonates with them instead of chasing every viral moment,” she said.

“We’re moving away from one-size-fits-all aesthetics and endless trend cycles that leave people feeling overwhelmed and disconnected.”

Revisiting "Let's Build a Compiler"

Hacker News
eli.thegreenplace.net
2025-12-10 06:22:19
Comments...
Original Article

There's an old compiler-building tutorial that has become part of the field's lore: the Let's Build a Compiler series by Jack Crenshaw (published between 1988 and 1995).

I ran into it in 2003 and was very impressed, but it's now 2025 and this tutorial is still being mentioned quite often in Hacker News threads . Why is that? Why does a tutorial from 35 years ago, built in Pascal and emitting Motorola 68000 assembly - technologies that are virtually unknown to the new generation of programmers - hold sway over compiler enthusiasts? I've decided to find out.

The tutorial is easily available and readable online , but just re-reading it seemed insufficient. So I've decided on meticulously translating the compilers built in it to Python and emit a more modern target - WebAssembly. It was an enjoyable process and I want to share the outcome and some insights gained along the way.

The result is this code repository . Of particular interest is the TUTORIAL.md file , which describes how each part in the original tutorial is mapped to my code. So if you want to read the original tutorial but play with code you can actually easily try on your own, feel free to follow my path.

A sample

To get a taste of the input language being compiled and the output my compiler generates, here's a sample program in the KISS language designed by Jack Crenshaw:

var X=0

 { sum from 0 to n-1 inclusive, and add to result }
 procedure addseq(n, ref result)
     var i, sum  { 0 initialized }
     while i < n
         sum = sum + i
         i = i + 1
     end
     result = result + sum
 end

 program testprog
 begin
     addseq(11, X)
 end
 .

It's from part 13 of the tutorial, so it showcases procedures along with control constructs like the while loop, and passing parameters both by value and by reference. Here's the WASM text generated by my compiler for part 13:

(module
  (memory 8)
  ;; Linear stack pointer. Used to pass parameters by ref.
  ;; Grows downwards (towards lower addresses).
  (global $__sp (mut i32) (i32.const 65536))

  (global $X (mut i32) (i32.const 0))

  (func $ADDSEQ (param $N i32) (param $RESULT i32)
    (local $I i32)
    (local $SUM i32)
    loop $loop1
      block $breakloop1
        local.get $I
        local.get $N
        i32.lt_s
        i32.eqz
        br_if $breakloop1
        local.get $SUM
        local.get $I
        i32.add
        local.set $SUM
        local.get $I
        i32.const 1
        i32.add
        local.set $I
        br $loop1
      end
    end
    local.get $RESULT
    local.get $RESULT
    i32.load
    local.get $SUM
    i32.add
    i32.store
  )

  (func $main (export "main") (result i32)
    i32.const 11
    global.get $__sp      ;; make space on stack
    i32.const 4
    i32.sub
    global.set $__sp
    global.get $__sp
    global.get $X
    i32.store
    global.get $__sp    ;; push address as parameter
    call $ADDSEQ
    ;; restore parameter X by ref
    global.get $__sp
    i32.load offset=0
    global.set $X
    ;; clean up stack for ref parameters
    global.get $__sp
    i32.const 4
    i32.add
    global.set $__sp
    global.get $X
  )
)

You'll notice that there is some trickiness in the emitted code w.r.t. handling the by-reference parameter (my previous post deals with this issue in more detail). In general, though, the emitted code is inefficient - there is close to 0 optimization applied.

Also, if you're very diligent you'll notice something odd about the global variable X - it seems to be implicitly returned by the generated main function. This is just a testing facility that makes my compiler easy to test. All the compilers are extensively tested - usually by running the generated WASM code [1] and verifying expected results.

Insights - what makes this tutorial so special?

While reading the original tutorial again, I had an opportunity to reminisce on what makes it so effective. Other than the very fluent and conversational writing style of Jack Crenshaw, I think it's a combination of two key factors:

  1. The tutorial builds a recursive-descent parser step by step, rather than giving a long preface on automata and table-based parser generators. When I first encountered it (in 2003), it was taken for granted that if you want to write a parser then lex + yacc are the way to go [2] . Following the development of a simple and clean hand-written parser was a revelation that wholly changed my approach to the subject; subsequently, hand-written recursive-descent parsers have been my go-to approach for almost 20 years now (a minimal sketch follows this list).
  2. Rather than getting stuck in front-end minutiae, the tutorial goes straight to generating working assembly code, from very early on. This was also a breath of fresh air for engineers who grew up with more traditional courses where you spend 90% of the time on parsing, type checking and other semantic analysis and often run entirely out of steam by the time code generation is taught.
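
To give a flavour of point 1, here is a minimal hand-written recursive-descent parser and evaluator in Python for integer expressions with +, -, *, / and parentheses. It is not taken from the tutorial or from the repository above; it is just a sketch of the general shape of such a parser.

import re

TOKEN = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(src):
    # Yield ("num", value) and ("op", char) tokens, then a final ("eof", None).
    for number, op in TOKEN.findall(src.strip()):
        yield ("num", int(number)) if number else ("op", op)
    yield ("eof", None)

class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.current = next(self.tokens)

    def eat(self, kind, value=None):
        # Consume the current token, checking it is what we expect.
        k, v = self.current
        assert k == kind and (value is None or v == value), f"unexpected token {self.current}"
        self.current = next(self.tokens)
        return v

    # expression ::= term (('+' | '-') term)*
    def expression(self):
        value = self.term()
        while self.current in (("op", "+"), ("op", "-")):
            if self.eat("op") == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    # term ::= factor (('*' | '/') factor)*
    def term(self):
        value = self.factor()
        while self.current in (("op", "*"), ("op", "/")):
            if self.eat("op") == "*":
                value *= self.factor()
            else:
                value //= self.factor()
        return value

    # factor ::= number | '(' expression ')'
    def factor(self):
        if self.current[0] == "num":
            return self.eat("num")
        self.eat("op", "(")
        value = self.expression()
        self.eat("op", ")")
        return value

print(Parser("2 * (3 + 4) - 5").expression())   # prints 9

Each grammar rule becomes one small method, and the whole thing fits on a screen; generating code instead of evaluating on the spot is a matter of swapping the arithmetic for emit calls.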

To be honest, I don't think either of these are a big problem with modern resources, but back in the day the tutorial clearly hit the right nerve with many people.

What else does it teach us?

Jack Crenshaw's tutorial takes the syntax-directed translation approach, where code is emitted while parsing , without having to divide the compiler into explicit phases with IRs. As I said above, this is a fantastic approach for getting started, but in the latter parts of the tutorial it starts showing its limitations. Especially once we get to types, it becomes painfully obvious that it would be very nice if we knew the types of expressions before we generate code for them.

I don’t know if this is implicated in Jack Crenshaw’s abandoning the tutorial at some point after part 14, but it may very well be. He keeps writing how the emitted code is clearly sub-optimal [3] and can be improved, but IMHO it’s just not that easy to improve using the syntax-directed translation strategy. With perfect hindsight, I would probably use Part 14 (types) as a turning point - emitting some kind of AST from the parser and then doing simple type checking and analysis on that AST prior to generating code from it.

Conclusion

All in all, the original tutorial remains a wonderfully readable introduction to building compilers. This post and the GitHub repository it describes are a modest contribution that aims to improve the experience of folks reading the original tutorial today and not willing to use obsolete technologies. As always, let me know if you run into any issues or have questions!


[1] This is done using the Python bindings to wasmtime .
[2] By the way, gcc switched from YACC to hand-written recursive-descent parsing in the 2004-2006 timeframe, and Clang has been implemented with a recursive-descent parser from the start (2007).
[3]

Concretely: when we compile subexpr1 + subexpr2 and the two sides have different types, it would be mighty nice to know that before we actually generate the code for both sub-expressions. But the syntax-directed translation approach just doesn't work that way.

To be clear: it's easy to generate working code; it's just not easy to generate optimal code without some sort of type analysis that's done before code is actually generated.

How Google Maps quietly allocates survival across London’s restaurants - and how I built a dashboard to see through it

Lobsters
laurenleek.substack.com
2025-12-10 06:09:43
Comments...
Original Article

I needed a restaurant recommendation, so I did what every normal person would do: I scraped every single restaurant in Greater London and built a machine-learning model.

It started as a very reasonable problem. I was tired of doom-scrolling Google Maps, trying to disentangle genuinely good food from whatever the algorithm had decided to push at me that day. Somewhere along the way, the project stopped being about dinner and became about something slightly more unhinged: how digital platforms quietly redistribute economic survival across cities.

Because once you start looking at London’s restaurant scene through data, you stop seeing all those cute independents and hot new openings. You start seeing an algorithmic market - one where visibility compounds, demand snowballs, and who gets to survive is increasingly decided by code.

The public story of Google Maps is that it passively reflects “what people like.” More stars, more reviews, better food. But that framing obscures how the platform actually operates. Google Maps is not just indexing demand - it is actively organising it through a ranking system built on a small number of core signals that Google itself has publicly acknowledged: relevance, distance, and prominence.

“Relevance” is inferred from text matching between your search query and business metadata. “Distance” is purely spatial. But “prominence” is where the political economy begins. Google defines prominence using signals such as review volume, review velocity, average rating, brand recognition, and broader web visibility. In other words, it is not just what people think of a place - it is how often people interact with it, talk about it, and already recognise it.

Visibility on these ranked lists determines foot traffic. Foot traffic determines how quickly reviews accumulate. Review accumulation then feeds directly back into the prominence signal. The system compounds. Early discovery generates demand. Demand generates data. Data generates future discovery. This creates a cumulative-advantage dynamic that looks remarkably similar to the way capital compounds in financial markets. This is essentially Robert Merton’s Matthew Effect applied to kebab shops - ‘unto every one that hath shall be given.’
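As a toy illustration of that compounding loop (a back-of-the-envelope sketch, not anything taken from the dashboard itself), imagine two restaurants of identical quality where foot traffic is proportional to review share and a fixed fraction of visitors leave a review:

# Two restaurants of identical quality; only their starting review counts differ.
reviews = {"incumbent": 50, "newcomer": 1}

for week in range(52):
    total = sum(reviews.values())
    for name in reviews:
        # Visibility (and hence foot traffic) is proportional to review share...
        visitors = 100 * reviews[name] / total
        # ...and roughly 5% of visitors leave a review: demand -> data -> discovery.
        reviews[name] += round(0.05 * visitors)

print(reviews)  # the gap widens even though underlying quality is identical

After a year the incumbent has piled up hundreds of reviews while the newcomer is stuck near zero: the cold-start problem in miniature.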

This disproportionately rewards chains and already-central venues. Chains benefit from cross-location brand recognition. High-footfall areas generate reviews faster, meaning venues in those zones climb the prominence ranking more quickly even at identical underlying quality. By contrast, new independents face a classic cold-start problem: without reviews they are hard to find, and without being found they struggle to accumulate reviews at all. What looks like neutral consumer choice is therefore better understood as algorithmically mediated market design .

In economics, this dynamic closely resembles the logic of a market maker: an intermediary that does not merely reflect underlying supply and demand, but actively shapes liquidity, matching, and price discovery. Platforms like Google Maps perform an analogous function for local services by controlling visibility rather than prices directly. In the language of digital economics, ranking algorithms act as attention allocators, steering demand toward some firms and away from others.

If Google Maps now acts as a kind of market maker for urban demand, the obvious next question is: what would the city look like without that amplification layer? In other words, how do you separate a restaurant’s intrinsic performance from the visibility effects of the platform itself?

To get at that, I built a machine-learning model - a gradient-boosted decision tree (for the ML crowd: HistGradientBoostingRegressor from scikit-learn) - to predict what a restaurant’s Google rating should be, given only its structural characteristics. This class of model is designed for large, messy, mixed-type tabular data and is particularly good at capturing interaction effects, without me having to specify those by hand. Features include how many reviews it has (log-transformed to reflect diminishing returns to attention), what cuisine it serves, whether it is part of a chain or an independent, its price level, broad venue types (restaurant, café, takeaway, bar), and where it sits in the city via a spatial grid.

Quick aside: for a subset of places I also scraped review text, languages, and photos. But for this first full-city run I stayed within the Google Maps API free tier - partly for reproducibility, partly because previous grid-scraping adventures taught me that cloud bills compound faster than review counts. So, for future versions, more features will only improve things. In particular, who is doing the reviewing matters. A five-star review of an Indian restaurant written in Hindi probably carries a different signal than one written by someone ordering chips with everything. (No judgment of white British people ofc…)

One practical problem I ran into early on is that Google Maps is surprisingly bad at categorising cuisines. A huge share of restaurants are labelled vaguely (“restaurant”, “cafe”, “meal takeaway”), inconsistently, or just incorrectly. So I ended up building a separate cuisine-classification model that predicts cuisine from restaurant names, menu language, and review text where available. In other words, the cuisine filters in the dashboard are not just Google’s tags - they’re machine-learned. This matters more than it might sound: if you misclassify cuisines, you misread diversity, clustering, and who actually competes with whom on the high street. Btw, I briefly considered classifying Pret A Manger as French, just to see if it would make any French people angrier at me than they already are. I didn’t. But I thought about it.
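The post doesn't spell out which classifier sits behind these machine-learned cuisine labels, but as a rough sketch of the general approach, a simple bag-of-words model over venue names and review snippets could look like this (the training rows and labels below are invented placeholders):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set: text gathered for a venue (name + review snippets) -> cuisine label.
texts = [
    "Bella Napoli wood-fired pizza fresh pasta",
    "Lahore Karahi seekh kebab biryani naan",
    "Noodle House hand-pulled noodles dumplings",
    "The Codfather fish and chips mushy peas",
]
labels = ["italian", "south_asian", "chinese", "fish_and_chips"]

cuisine_clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
cuisine_clf.fit(texts, labels)

# Re-label a venue that Google only tags as "restaurant".
print(cuisine_clf.predict(["Marco's trattoria pizza and fresh pasta"]))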

Before any modelling happens, all features go through a standard preprocessing pipeline - imputation, encoding, the usual. Crucially, the model is trained only to learn the mapping between observable platform-visible features and ratings. This allows me to generate a counterfactual expected rating for each restaurant - what the platform would typically assign under those structural conditions. The difference between a restaurant’s real rating and this predicted rating is what I call the rating residual. A positive residual means the restaurant performs materially better than its platform baseline would suggest. A negative residual means it underperforms relative to what the algorithm normally rewards. This is not a perfect measure of food quality - but it is a powerful measure of algorithmic mispricing: where social or culinary value diverges from what the platform structurally amplifies.
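In scikit-learn terms, a stripped-down version of that expected-rating-and-residual calculation might look roughly like the following (the column names and toy rows here are placeholders; the real feature set and data are far richer):

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# One toy row per restaurant.
df = pd.DataFrame({
    "rating": [4.6, 3.9, 4.2, 4.8],
    "review_count": [1200, 35, 480, 60],
    "cuisine": ["italian", "turkish", "indian", "thai"],
    "is_chain": [1, 0, 0, 0],
    "price_level": [2, 1, 2, 3],
    "grid_cell": ["N1", "E2", "N1", "SE5"],
})

df["log_reviews"] = np.log1p(df["review_count"])  # diminishing returns to attention

features = ["log_reviews", "is_chain", "price_level", "cuisine", "grid_cell"]
model = Pipeline([
    ("prep", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore", sparse_output=False),
          ["cuisine", "grid_cell"])],
        remainder="passthrough",  # numeric columns pass straight through
    )),
    # Gradient-boosted trees over tabular features; handles missing values natively.
    ("gbdt", HistGradientBoostingRegressor(random_state=0)),
])
model.fit(df[features], df["rating"])

# Counterfactual "expected" rating from structural features alone,
# and the residual: actual minus expected.
df["expected_rating"] = model.predict(df[features])
df["residual"] = df["rating"] - df["expected_rating"]
print(df[["cuisine", "rating", "expected_rating", "residual"]])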

One caveat: some restaurants pay for promoted pins or local-search ads. Because paid visibility isn’t publicly disclosed, I can’t estimate how many - which is itself a sign of how opaque platform influence has become. My residuals may partly reflect ad spend I can’t observe.

To summarise this, I built the London food dashboard. The dashboard currently allows users to search by name and filter by underrated gems (identified by my machine learning algorithm), cuisine, borough, price level, min rating, and review volume. It is still very much a version-one prototype - but it is already a working microscope into London’s algorithmic food economy.

If you want to explore it yourself, you can find it on my personal website at: laurenleek.eu/food-map .

Naturally, I immediately stress-tested it on my favourite part of London: Islington (maybe all this promo - also in my previous Substack on UK segregation - makes me qualify for a council tax rebate? - I’m looking at you Jeremy Corbyn…). I switched on my “underrated gems” filter - that’s the ML residual at work - set a minimum rating and review count, exclude the eye-wateringly expensive options, and let the bubbles guide me. Bigger, darker bubbles mean places my model thinks the algorithm is undervaluing.

And just like that, I had dinner plans. Do try it yourself.

Btw, this is very much still a beta version - which means bugs, blind spots, and lots of room to grow. If something looks odd, missing, or wrong, please leave feature ideas and suggestions in the comments here or via the comments section on my website. Unlike the VS Code GitHub tracker and its 13.8k open issues, I really do read them.

But restaurants don’t fail alone - they fail in ecosystems. I also wanted to understand what happens when platform dynamics scale up from restaurants to entire neighbourhood food ecosystems. So I added a second modelling layer.

First, I aggregate restaurants into small spatial cells (the hexagons you see on the maps - because squares are for people who haven’t thought hard enough about edge effects) and compute summary features for each area: restaurant density, mean rating, mean residual, total reviews, chain share, cuisine entropy, and price level. I then standardise these and run principal component analysis (PCA) to compress them into a single continuous hub score that captures overall “restaurant ecosystem strength” in one dimension. Finally, I apply K-means clustering to the same feature space to classify areas into four structural types: elite , strong , everyday , and weak hubs.
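A bare-bones version of that aggregation step, assuming a hex_cells table with the per-cell summaries already computed (the rows below are placeholders), might look like:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# One row per hexagonal cell; feature names mirror the summaries described above.
hex_cells = pd.DataFrame({
    "restaurant_density": [120, 40, 8, 65],
    "mean_rating": [4.4, 4.1, 3.8, 4.3],
    "mean_residual": [0.12, 0.03, -0.05, 0.08],
    "total_reviews": [95000, 12000, 900, 30000],
    "chain_share": [0.35, 0.2, 0.5, 0.25],
    "cuisine_entropy": [2.8, 2.1, 0.9, 2.5],
    "mean_price_level": [2.6, 1.9, 1.4, 2.2],
})

X = StandardScaler().fit_transform(hex_cells)

# Compress the standardised features into a single continuous "hub score"...
hex_cells["hub_score"] = PCA(n_components=1).fit_transform(X).ravel()

# ...and separately group cells into four structural types in the same feature space.
# Mapping cluster ids to labels like "elite" or "weak" is done by inspecting the clusters.
hex_cells["hub_type"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

print(hex_cells[["hub_score", "hub_type"]])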

At first glance, the patterns look comfortingly familiar. Central London dominates. Of course it does. But what matters is not just where the hubs are - it’s what kind of hubs they are. Using the full hub score rather than raw ratings alone, I identify the five most structurally powerful restaurant hubs in London. They are the places where density, algorithmic attention, independent survival, and consumer spending power all line up at once. They are labeled on the maps. I am deliberately refusing to rank them loudly in prose in order to avoid starting neighbourhood wars at scale (and to not disappoint Islington) - but the visual story is already extremely clear.

Overlaying this with the cuisine density panels reveals something even sharper. London’s culinary diversity is not evenly distributed across its platform economy. Migrant cuisines cluster strongly in parts of the city where algorithmic visibility is structurally weaker. Italian, Indian, Turkish, Chinese, Thai, British, Japanese, French, American, and fish-and-chips all trace distinct settlement histories, labour networks, retail formats, and relationships to capital and rent. Some cuisines form long, contiguous corridors. Others appear as punctuated clusters tied to specific high streets or income brackets.

Cuisine diversity, in other words, is not just about taste. It is about where families settled, which high streets remained affordable long enough for a second generation to open businesses, and which parts of the city experienced displacement before culinary ecosystems could mature. (If this part especially speaks to you, I go much deeper into it in Food for thought: local restaurant diversity meets migration ).

The Take-Away and Some Unwanted Policy Advice

This project started as a search problem and ended as something more. The most important result isn’t which neighbourhood tops the rankings - it’s the realisation that platforms now quietly structure survival in everyday urban markets. London’s restaurant scene is no longer organised by taste alone. It is organised by visibility that compounds, rent that rises when discovery arrives, and algorithms that allocate attention long before consumers ever show up. What looks like “choice” is increasingly the downstream effect of ranking systems.

For policy, that shifts the frame. If discovery now shapes small-business survival, then competition, fairness, and urban regeneration can no longer ignore platform ranking systems. Councils can rebuild streets and liberalise licensing all they like - but algorithmic invisibility can still leave places economically stranded. Platform transparency and auditability are no longer niche tech debates; they are quietly becoming tools of local economic policy. At minimum, ranking algorithms with this much economic consequence should be auditable. We audit financial markets. We should audit attention markets too.

For a navigation app, Google Maps has a remarkable amount of power.
Just saying.

I’m also working on other maps (including a map of the best cycling and running routes with excellent cafés along the way, because I have needs). More broadly, I’m investing more and more time into building higher-quality public data projects. If you have an idea you’d like to see built - pitch it to me. And if you enjoy this kind of work, you can always Buy Me A Coffee or subscribe to help fund the next round of over-engineered maps.


Are the Three Musketeers allergic to muskets?(2014)

Hacker News
www.ox.ac.uk
2025-12-10 06:08:50
Comments...
Original Article

Three Musketeers

The Three Musketeers has been updated for the small screen by the BBC.

The BBC's new drama series The Musketeers – adapted from Alexandre Dumas' novel Les Trois Mousquetaires – made its debut on Sunday evening. Ahead of the screening, Dr Simon Kemp , Oxford University Fellow and Tutor in French, tackled the curious question of  why the musketeers appear to have an aversion to muskets...

"So here it comes. Peter Capaldi – Malcolm Tucker as was, Doctor Who as shortly will be – is twirling his moustache as Cardinal Richelieu in trailers for the much-heralded BBC adaptation of Alexandre Dumas' Les Trois Mousquetaires (1844). It's always good to see British TV take on French literary classics. Let's hope The Musketeers has a little more in common with its source material than the BBC's other recent effort, The Paradise , for which I'd be surprised if the producers were able to put up the subtitle 'based on the novel by Émile Zola' without blushing.

"At any rate, the Dumas adaptation looks exciting, with plenty of cape-swishing, sword-fighting, smouldering looks and death-defying leaps. Plus one element that is markedly more prevalent than in the book itself: gunfire. One of the odder things about Dumas' novel for the modern reader is its singular lack of muskets.

"In the mid-1620s, when the story is set, the Mousquetaires are the household guard of the French king, Louis XIII, an elite force trained for the battlefield as well as for the protection of the monarch and his family in peacetime. They are named for their specialist training in the use of the musket ( mousquet ), an early firearm originally developed in Spain at the end of the previous century under the name moschetto or 'sparrow-hawk'. Muskets were long-barrelled guns, quite unlike the pistols shown in the trailer, and fired by a 'matchlock' mechanism of holding a match or burning cord to a small hole leading to the powder chamber. By the 1620s they were not quite as cumbersome as the Spanish originals, which needed to have their barrels supported on a forked stick, but they were still pretty unwieldy devices.

"There are lots of weapons in the opening chapters of Les Trois Mousquetaires , where D'Artagnan travels to the barracks and challenges almost everyone he meets along the way to a duel (including all three of the musketeers). Lots of sword-fighting, but no muskets in sight. One of the musketeers has nicknamed his manservant mousequeton , or 'little musket', and that is as near as we get to a gun until page 429 of the Folio edition, when an actual mousqueton makes its first appearance. A mousqueton is not quite a musket, though, and in any case it's not one of the musketeers who is holding it.

"The siege of La Rochelle in the later part of the story seems a more propitious setting for firearms, and indeed, as soon as he arrives at the camp, D'Artagnan spies what appears to be a musket pointing at him from an ambush and flees, suffering only a hole to the hat. Examining the bullet-hole, he discovers 'la balle n'était pas une balle de mousquet, c'était une balle d'arquebuse ' ('the bullet was not from a musket, it was an arquebuse bullet', arquebuse being an earlier type of firearm). We are now 586 pages into the story, and starting to wonder if Dumas is playing a game with us.

"The suspicion is heightened when the musketeers take a jaunt into no man's land for some secret scheming away from the camp: ' Il me semble que pour une pareille expedition, nous aurions dû au moins emporter nos mousquets ,' frets Porthos on page 639 ('It seems to me that we ought to at least have taken our muskets along on an expedition like this'). ' Vous êtes un niais, ami Porthos; pourquoi nous charger d'un fardeau inutile? ' scoffs Athos in return ('You're a fool, Porthos, my friend. Why would we weight ourselves down with useless burdens?').

"The key to the mystery of the missing muskets is in these lines. Their absence from the novel up to this point is simply for the historical reason that the heavy and dangerous weapons were appropriate for the battlefield, not for the duties and skirmishes of peace-time Paris. Even when his heroes are mobilized, Dumas remains reluctant to give his musketeers their muskets. Remember that, writing in the 1840s, Dumas is closer in time to us today than he is to the period he's writing about, and his gaze back to the 17th century is often more drawn to romance than historical accuracy (as the cheerfully pedantic footnotes in my edition point out on every other page).

"For Dumas, the charm of his chosen period lies in the skill and daring of the accomplished swordsman, and his breathless narrative can wring far more excitement from a well-matched duel of blades than it could from a military gun battle. Heroism in Dumas is to be found in noble combat, staring your opponent in the eye as you match his deadly blade with your own, not in the clumsy long-range slaughter of unknowns. Musketeers his heroes must be, in order that they might belong to the royal guard and thus play a role in the dark conspiracies hatched around the King, the Queen and her English lover by Cardinal Richelieu, the power behind the throne. But the muskets themselves are surplus to requirements.

"Dumas does relent a little on his musket-phobia by the end of the novel. On page 645, the musketless musketeers fire at their enemies using weapons grabbed from corpses. And finally, on page 705, when Richelieu catches the four friends conspiring on the beach, we are at last granted a glimpse of the soldiers' own guns: ' [Athos] montra du doigt au cardinal les quatre mousquets en faisceau près du tambour sur lequel étaient les cartes et les dès ' ('He pointed out to the cardinal the four muskets stacked next to the drum on which lay the cards and dice').

"As far as I can make out, this is the only point at which we see the musketeers with their muskets in the whole story, and it seems a fitting way to present them to the reader: lying idle while the musketeers are occupied with other, more important amusements."

This post originally appeared on the outreach blog of the French sub-faculty at Oxford University.

Do Not Optimize Away

Lobsters
matklad.github.io
2025-12-10 05:07:23
Comments...
Original Article

Compilers are sneaky beasts. If you time code like this:

var total: u32 = 0;
for (0..N) |i| total += i;
print("total={}", .{total});

You will discover that LLVM is as smart as a little kid named Gauss, and replaces the summation with the equivalent closed-form formula N·(N+1)/2.

What’s more, if you write something more complicated like total += i + 2*i*i - i*i*i , you’ll see that LLVM figures out a closed-form expression for that as well (a generalization of the Gauss trick I proudly figured out in 11th grade). See for yourself: https://godbolt.org/z/T9EcTb8zq

Usually, this kind of thing is desirable — code runs faster! Except when you are trying to benchmark your code, and instead end up benchmarking an elaborate no-op.

There are two pitfalls with benchmarking. First , in

const start = now();
_ = computation();
const elapsed = now() - start;

a reasonable compiler can notice that computation ’s result is not used, and optimize the entire computation away.

Second , in

const parameter_a = 1_000_000;
const parameter_b = 1_000;

const start = now();
_ = computation(parameter_a, parameter_b);
const elapsed = now() - start;

even if the computation is not elided as a whole, compiler can constant-fold parts of it, taking advantage of the fact that values of parameters are known at compile time.

Time To Be Killing The Dragon Again

Usually languages provide some sort of an explicit “please do not optimize this away” function, like Rust’s hint::black_box or Zig’s mem.doNotOptimizeAway , but they always felt like snake oil to me:

  • Their semantics is tricky. It is sort of impossible to explain what exactly can and can not be optimized: the whole compilation pipeline is based on erasing everything about the original form of the code, maintaining only the semantics.
  • There’s a simpler and more direct way to achieve the desired result. Just open the box and check if the cat is there!

It’s easier to explain via an example. Let’s say I am benchmarking binary search:

fn insertion_point(xs: []const u32, x: u32) usize { ... }

I would use the following benchmarking scaffold:

fn benchmark(arena: Allocator) !void {
    const element_count =
        try parameter("element_count", 1_000_000);
    const search_count =
        try parameter("search_count", 10_000);

    const elements: []const u32 =
        make_elements(arena, element_count);
    const searches: []const u32 =
        make_searches(arena, search_count);

    const start = now();
    var hash: u32 = 0;
    for (searches) |key| {
        hash +%= @as(u32, @truncate(insertion_point(elements, key))); // fold the usize index into the u32 hash
    }
    const elapsed = now().duration_since(start);

    print("hash={}\n", .{hash});
    print("elapsed={}\n", .{elapsed});
}

fn parameter(comptime name: []const u8, default: u64) !u64 {
    const value = if (process.hasEnvVarConstant(name))
        try process.parseEnvVarInt(name, u64, 10)
    else
        default;
    print(name ++ "={}\n", .{value});
    return value;
}

On the input side, the parameter function takes a symbolic name and a default value. It looks up the value among the environmental variables, with fallback. Because the value can be specified at runtime, compiler can’t optimize assuming a particular constant. And you also get a convenient way to re-run benchmark with a different set of parameters without recompiling.

On the output side, we compute an (extremely weak) “hash” of the results. For our binary search — just the sum of all the indexes. Then we print this hash together with the timing information. Because we use the results of our computation, compiler can’t optimize them away!

Similarly to the parameter function, we also get a bonus feature for free. You know who also loves making code faster by deleting “unnecessary” functionality? I do! Though I am not as smart as a compiler, and usually end up deleting code that actually is required to get the right answer. With the hash, if I mess my optimization work to the point of getting a wrong answer, I immediately see that reflected in an unexpected value of the hash.

Consider avoiding black boxes for your next benchmark. Instead, stick to natural anti-optimizing-compiler remedies:

  • Make input parameters runtime overridable (with compile time defaults),
  • print the result (or the hash thereof).

Dependable C

Hacker News
dependablec.org
2025-12-10 04:29:30
Comments...

'Source available' is not open source (and that's okay)

Hacker News
dri.es
2025-12-10 03:33:14
Comments...
Original Article

I have spent twenty years working on open source sustainability, so watching a fight ignite between Ruby on Rails creator David Heinemeier Hansson and WordPress founding developer Matt Mullenweg this week felt uncomfortably familiar in a way I wish it didn't.

David Heinemeier Hansson (also known as DHH) released a new kanban tool, Fizzy, this week and called it open source .

People quickly pointed out that the O'Saasy license that Fizzy is released under blocks others from offering a competing SaaS version, which violates the Open Source Initiative's definition . When challenged, he brushed it off on X and said, "You know this is just some shit people made up, right?". He followed with "Open source is when the source is open. Simple as that".

This morning, Matt Mullenweg rightly pushed back . He argued that you can't ignore the Open Source Initiative definition. He compared it to North Korea calling itself a democracy. A clumsy analogy, but the point stands.

Look, the term "open source" has a specific, shared meaning. It is not a loose idea and not something you can repurpose for marketing. Thousands of people shaped that definition over decades. Ignoring that work means benefiting from the community while setting aside its rules.

This whole debate becomes spicier knowing that DHH was on Lex Fridman's podcast only a few months ago, appealing to the spirit and ethics of open source to criticize Matt's handling of the WP Engine dispute . If the definition is just "shit people made up", what spirit was Matt violating?

The definition debate matters, but the bigger issue here is sustainability. DHH's choice of license reacts to a real pressure in open source: many companies make real money from open source software while leaving the hard work of building and maintaining it to others.

This tension also played a role in Matt's fight with WP Engine , so he and DHH share some common ground, even if they handle it differently. We see the same thing in Drupal, where the biggest companies do not always contribute at the same level.

DHH can experiment because Fizzy is new. He can choose a different license and see how it works. Matt can't as WordPress has been under the GPL for more than twenty years. Changing that now is virtually impossible.

Both conversations are important, but watching two of the most influential people in open source argue about definitions while we all wrestle with free riders feels a bit like firefighters arguing about hose lengths during a fire.

The definition debate matters because open source only works when we agree on what the term means. But sustainability decides whether projects like Drupal, WordPress, and Ruby on Rails keep thriving for decades to come. That is the conversation we need to have.

In Drupal, we are experimenting with contribution credits and with guiding work toward companies that support the project. These ideas have helped, but also have not solved the imbalance.

Six years ago I wrote in my Makers and Takers blog post that I would love to see new licenses that "encourage software free riding", but "discourage customer free riding". O'Saasy is exactly that kind of experiment.

A more accurate framing would be that Fizzy is source available . You can read it, run it, and modify it. But DHH's company is keeping the SaaS rights because they want to be able to build a sustainable business. That is defensible and generous, but it is not open source.

I still do not have the full answer to the open source sustainability problem. I have been wrestling with it for more than twenty years. But I do know the solution is not renaming the problem.

Some questions are worth asking, and answering:

  • How do we distinguish between companies that can't contribute and those that won't?
  • What actually changes corporate behavior: shame, self-interest, punitive action, exclusive benefits, or regulation?

If this latest fight nudges us away from word games and toward these questions, some good may come from it.

NYC congestion pricing cuts air pollution by 22% in six months

Hacker News
airqualitynews.com
2025-12-10 02:58:45
Comments...
Original Article

In its first six months, New York City’s controversial congestion pricing scheme has reduced air pollution by 22% in Manhattan’s toll zone, while improving air quality across the entire metropolitan region, according to new research.

The Cornell University study analysed data from 42 air quality monitors throughout the New York area between January 2024 and June 2025, tracking PM2.5 concentrations before and after the January 2025 launch of the Congestion Relief Zone (CRZ).

The findings provide the first rigorous evidence that charging drivers to enter Manhattan’s core delivers substantial public health benefits.

Within the CRZ, which covers Manhattan streets at or below 60th Street, average daily peak concentrations of PM2.5 dropped by 3.05 µg/m³. For context, background pollution levels in the region typically hover around 8-9 µg/m³, making this reduction particularly significant for public health.

Notably, the benefits were found to extend far beyond the toll zone itself. Across New York City’s five boroughs, pollution levels fell by an average of 1.07 µg/m³, while the broader metropolitan area saw reductions of 0.70 µg/m³. This refutes claims that congestion pricing merely pushes traffic and its associated pollution to neighboring communities.

The improvements grew stronger over time, suggesting drivers are increasingly adapting their behavior. In the CRZ’s first week, pollution reductions within the toll zone averaged just 0.8 µg/m³. By the 20th week, that figure had grown to 4.9 µg/m³, suggesting commuters were switching to public transit, rescheduling trips or finding alternative routes.

Indeed, traffic data supports this. Between January and June 2025, vehicle entries into the toll zone dropped approximately 11% overall, with heavy-duty truck traffic falling by 18% and passenger cars declining by 9%. The disproportionate reduction in truck traffic appears particularly important, as these vehicles contribute heavily to urban air pollution despite representing a smaller share of total traffic.

The results exceed outcomes from similar programs in European cities. Stockholm’s congestion pricing reduced air pollution by 5-15% over several years, while London’s Ultra Low Emission Zone achieved roughly a 7% citywide decline. The researchers suggest that New York’s comparatively larger impact reflects the city’s exceptional transit infrastructure and the high volume of discretionary trips that drivers can easily shift to subways and buses.

The findings arrive as other American cities, including San Francisco and Los Angeles, consider implementing their own congestion pricing systems. New York’s experience suggests such programs can deliver rapid environmental benefits while generating revenue for transit improvements – a dual outcome that urban planners have long sought but rarely achieved.

Senior author Oliver Gao said: ‘Our overall conclusion is that congestion pricing in New York City, like many other cities in the world that have implemented it, helped not only improve traffic, but also helped reduce air pollutant concentration, improve air quality and should be good for public health.’

The study’s co-lead author Timothy Fraser added: ‘It’s really exciting to me that air quality improved throughout the entire metro area. This tells us that congestion pricing didn’t simply relocate air pollution to the suburbs by rerouting traffic. Instead, folks are likely choosing cleaner transportation options altogether, like riding public transportation or scheduling deliveries at night. This thins traffic and limits how smog compounds when many cars are on the road.’

Photo : Franz Boccalatte / Unsplash

The end of the kernel Rust experiment

Linux Weekly News
lwn.net
2025-12-10 02:57:53
The topic of the Rust experiment was just discussed at the annual Maintainers Summit. The consensus among the assembled developers is that Rust in the kernel is no longer experimental — it is now a core part of the kernel and is here to stay. So the "experimental" tag will be coming off. Congratul...
Original Article

[Posted December 10, 2025 by corbet]

The topic of the Rust experiment was just discussed at the annual Maintainers Summit. The consensus among the assembled developers is that Rust in the kernel is no longer experimental — it is now a core part of the kernel and is here to stay. So the "experimental" tag will be coming off. Congratulations are in order for all of the Rust-for-Linux team.

(Stay tuned for details in our Maintainers Summit coverage.)


Looking for guidance on improving an offline security tool I built

Lobsters
lobste.rs
2025-12-10 02:56:43
I’ve spent the last six months building an offline security assistant. It runs entirely locally and is meant to help with day to day pentest and blue team work without sending anything to the cloud. It grew out of my own learning process because I’m self taught and wanted to build something useful w...
Original Article

I’ve spent the last six months building an offline security assistant. It runs entirely locally and is meant to help with day to day pentest and blue team work without sending anything to the cloud. It grew out of my own learning process because I’m self taught and wanted to build something useful while improving my skills.

I’ve now reached the point where I’ve taken it as far as I can alone. I open sourced it because I need advice from people who know this space better than I do. I have tried posting about it elsewhere to get feedback but the posts either get removed or sink without any replies.

I am not trying to sell anything. I genuinely want technical guidance on how to present the project properly, what direction to take it in, and what the community thinks about the idea and execution. If anyone is willing to look at it and give honest feedback I would appreciate it. I can share the repository if that is appropriate here.

Making macOS Bearable

Hacker News
seg6.space
2025-12-10 02:37:34
Comments...
Original Article

Intro

Ideally, a computer system should feel like an extension of your body. When you pick up a cup of coffee, you don't consciously think, "I need to engage my bicep, extend my forearm, and grasp with my fingers." You just think "drink coffee," and your body complies.

I've spent the better part of eight years on various flavors of Arch Linux, and over that time I settled into a local minimum: a system configuration where I can enter a flow state, forget I'm using a computer at all, and just focus on the work. The machine disappears.

Recently, I started using macOS (my workplace issued me an M4 Pro MacBook, and I can't yet put Asahi Linux on it), and with this change, that neural link was severed. Stock macOS gives me something like motion sickness whenever I try to accomplish anything. There's just too much friction in Spaces, Mission Control, window management, all of it.

So I set out to fix this for myself.

The "Where's Waldo" Problem

Apple wants you to use Mission Control. They want you to swipe up with three fingers, see a scattered mosaic of every window you have open, and then use your eyes to scan for the one you want.

mission control

This is terrible!!!

Visual search is the most expensive cognitive task you can perform while focused on doing something. Every time you have to scan the screen to find a window, you are breaking context.

My hierarchy of navigation is as follows:

  1. Shortcuts: I know exactly where something is. I press a key, and I am there.
  2. Fuzzy Finding: I know what I want, but not where it is. I type three letters into Raycast, and it appears.
  3. Visual Search: This is the fallback I try to never use.

Encoding Location with Aerospace

The default macOS window model is "floating." Windows pile on top of each other, you drag them around manually, and Spaces lets you swipe between virtual desktops that have no enforced structure. It's flexible, but flexibility without constraints is just chaos.

To fix this, I use Aerospace. It's a tiling window manager that replaces the native "Spaces" concept with rigid, deterministic workspaces.

aerospace

Aerospace allows me to spatially encode my software. I don't need to "check" where Spotify is. Spotify is on Workspace 9. Always. My browser is on Workspace 1. My terminal is on Workspace 2.

[workspace-to-monitor-force-assignment]
7 = 'secondary'
8 = 'secondary'
9 = 'secondary'

[[on-window-detected]]
if.app-id = 'com.mitchellh.ghostty'
run = 'move-node-to-workspace 2'

This turns navigation into muscle memory. Cmd-2 is not "Switch to Terminal"; Cmd-2 is just the physical reflex of "I want to code." I don't look. I just hit the key combination, and the active workspace changes.

Development Workspace

Inside Workspace 2 lives Ghostty, running Tmux.

But standard Tmux keybinds are too clunky. The default Ctrl-b prefix doesn't spark joy to use. I use root bindings ( -n ) to bypass the prefix entirely where I see it fit.

I don't use panes; I use full windows as "views." Alt-1 switches to the first window. Alt-2 switches to the second. But here is the logic that makes it flow:

bind -n M-1 if-shell 'tmux select-window -t 1' '' 'new-window -t 1'

If window 1 doesn't exist, it creates it. I don't "manage" windows; I just go to where I want to be, and the system accommodates me.

To glue it all together, I wrote a custom Rust tool called ws .

ws session switcher in action

When I hit Alt-s , a fuzzy finder pops up over my current work. I pick a project, and ws instantly attaches to that session or spins up a new environment with my editor ( helix ) and file manager ( fx ) ready to go. It maintains a stack-based history, so I can jump to a project, fix a bug, and hit "Back" to return to exactly where I was.

The Language of Motion

Humans are incredibly good at language. We are hardwired for syntax, grammar, and structure. We are not hardwired to hunt for pixels on a glowing rectangle.

This is why I use modal editing. It stops text manipulation from being a manual labor task, e.g. dragging a mouse, holding backspace, and turns it into a conversation. If I want to change the text inside some quotes, I don't drag a cursor; I speak the command: ci" (change inside quotes). It is linguistic. I am speaking to the editor in a language we both understand.

The problem with modern OS design is that it abandons this linguistic efficiency for visual clutter.

Bypassing the Mouse

Of course, I still use the mouse. I’m not a zealot. But for 90% of web browsing, lifting my hand to the mouse is unnecessary friction.

I use Vimium in the browser.

vimium

When I want to click a link, I don't aim; I just look at it. Two letters appear over the link, I type them, and it clicks. It feels telepathic. I look at the element, and the element activates.

I recently added Homerow to the mix, which brings this same "look and type" navigation to the entire macOS UI. It allows me to click system dialogs or toolbar buttons without ever leaving the home row.


By layering Aerospace, Tmux, and modal editing, I’ve tried to replicate that "extension of the body" feeling. The goal isn't to be a "power user" for the sake of it. The goal is to remove the lag between thinking "I want to do X" and the computer actually doing it.

The dotfiles: https://github.com/seg6/dotfiles

Show all your application error using Cloudflare Error Page

Hacker News
github.com
2025-12-10 02:18:07
Comments...
Original Article

Cloudflare Error Page Generator

📢 Update (2025/12/09): All icons used in the error page have been fully redrawn as vector assets. These icons, along with the stylesheet, are also inlined into a single file of the error page, eliminating any need to host additional resources and ensuring a better experience for you and your end users.

What does this project do?

This project creates customized error pages that mimic the well-known Cloudflare error page. You can also embed them into your website.

Online Editor

Here's an online editor to create customized error pages. Try it out here .

Editor

(And thank @rynzland for the idea!)

Quickstart for Programmers

Python

Install cloudflare-error-page with pip.

pip install git+https://github.com/donlon/cloudflare-error-page.git

Then you can generate an error page with the render function. ( example.py )

import webbrowser
from cloudflare_error_page import render as render_cf_error_page

# This function renders an error page based on the input parameters
error_page = render_cf_error_page({
    # Browser status is ok
    'browser_status': {
        "status": 'ok',
    },
    # Cloudflare status is error
    'cloudflare_status': {
        "status": 'error',
        "status_text": 'Error',
    },
    # Host status is also ok
    'host_status': {
        "status": 'ok',
        "location": 'example.com',
    },
    # can be 'browser', 'cloudflare', or 'host'
    'error_source': 'cloudflare',

    # Texts shown in the bottom of the page
    'what_happened': '<p>There is an internal server error on Cloudflare\'s network.</p>',
    'what_can_i_do': '<p>Please try again in a few minutes.</p>',
})

with open('error.html', 'w') as f:
    f.write(error_page)

webbrowser.open('error.html')

Default error page

You can also see live demo here .

A demo server using Flask is also available in flask_demo.py .
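flask_demo.py itself isn't reproduced here, but a minimal Flask error handler wired up to the render function could look something like this sketch (the parameters are taken from the example above; everything else is illustrative):

from flask import Flask
from cloudflare_error_page import render as render_cf_error_page

app = Flask(__name__)

@app.errorhandler(500)
def internal_error(error):
    # Render a Cloudflare-style page for any unhandled server error.
    page = render_cf_error_page({
        'browser_status': {'status': 'ok'},
        'cloudflare_status': {'status': 'ok'},
        'host_status': {'status': 'error', 'location': 'example.com', 'status_text': 'Error'},
        'error_source': 'host',
        'what_happened': '<p>The web server encountered an internal error.</p>',
        'what_can_i_do': '<p>Please try again in a few minutes.</p>',
    })
    return page, 500

@app.route('/')
def index():
    raise RuntimeError('boom')  # deliberately trigger the 500 handler

if __name__ == '__main__':
    app.run()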

Node.js

PHP

More Examples

Catastrophic infrastructure failure

params = {
    "title": "Catastrophic infrastructure failure",
    "more_information": {
        "for": "no information",
    },
    "browser_status": {
        "status": "error",
        "status_text": "Out of Memory",
    },
    "cloudflare_status": {
        "status": "error",
        "location": "Everywhere",
        "status_text": "Error",
    },
    "host_status": {
        "status": "error",
        "location": "example.com",
        "status_text": "On Fire",
    },
    "error_source": "cloudflare",
    "what_happened": "<p>There is a catastrophic failure.</p>",
    "what_can_i_do": "<p>Please try again in a few years.</p>",
}

Catastrophic infrastructure failure

Demo

Web server is working

params = {
    "title": "Web server is working",
    "error_code": 200,
    "more_information": {
        "hidden": True,
    },
    "browser_status": {
        "status": "ok",
        "status_text": "Seems Working",
    },
    "cloudflare_status": {
        "status": "ok",
        "status_text": "Often Working",
    },
    "host_status": {
        "status": "ok",
        "location": "example.com",
        "status_text": "Almost Working",
    },
    "error_source": "host",
    "what_happened": "<p>This site is still working. And it looks great.</p>",
    "what_can_i_do": "<p>Visit the site before it crashes someday.</p>",
}

Web server is working

Demo

FAQ

How to show real user IP / Cloudflare Ray ID / data center location in the error page so that it looks more realistic?

The Ray ID and user IP fields in the error page can be set via the ray_id and client_ip properties in the params argument passed to the render function. The real Cloudflare Ray ID and the data center location of the current request can be extracted from the Cf-Ray request header (e.g. Cf-Ray: 230b030023ae2822-SJC). A detailed description of this header can be found in the Cloudflare documentation .

To look up the city name of the data center corresponding to the three-letter code in the header, you can use a location list from here .

The demo server running on our website does handle these. Take a look at this file for reference.
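For example, in a Flask view or error handler, one way to pull those values out of the request looks like the sketch below (the colo-code table is a tiny placeholder for the full location list, and merging the result into the render parameters is left to your application):

from flask import request

# Tiny excerpt of a colo-code -> city lookup; the real location list is much longer.
COLO_CITIES = {'SJC': 'San Jose', 'LHR': 'London', 'FRA': 'Frankfurt'}

def cf_params_from_request():
    """Extract the Ray ID, data center location and client IP from Cloudflare headers."""
    cf_ray = request.headers.get('Cf-Ray', '')  # e.g. "230b030023ae2822-SJC"
    ray_id, _, colo = cf_ray.partition('-')
    params = {'client_ip': request.headers.get('CF-Connecting-IP', request.remote_addr)}
    if ray_id:
        params['ray_id'] = ray_id
    if colo:
        # Resolve the three-letter data center code to a city name where known.
        params['host_status'] = {'location': COLO_CITIES.get(colo, colo)}
    return params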

See also

Full Parameter Reference

{
    "html_title": "cloudflare.com | 500: Internal server error",
    "title": "Internal server error",
    "error_code": 500,
    "time": "2025-11-18 12:34:56 UTC",  // if not set, current UTC time is shown

    // Configuration for "Visit ... for more information" line
    "more_information": {
        "hidden": false,
        "text": "cloudflare.com", 
        "link": "https://www.cloudflare.com/",
        "for": "more information",
    },

    // Configuration for the Browser/Cloudflare/Host status
    "browser_status": {
        "status": "ok", // "ok" or "error"
        "location": "You",
        "name": "Browser",
        "status_text": "Working",
        "status_text_color": "#9bca3e",
    },
    "cloudflare_status": {
        "status": "error",
        "location": "Cloud",
        "name": "Cloudflare",
        "status_text": "Error",
        "status_text_color": "#bd2426",
    },
    "host_status": {
        "status": "ok",
        "location": "The Site",
        "name": "Host",
        "status_text": "Working",
        "status_text_color": "#9bca3e",
    },
    "error_source": "host", // Position of the error indicator, can be "browser", "cloudflare", or "host"

    "what_happened": "<p>There is an internal server error on Cloudflare's network.</p>",
    "what_can_i_do": "<p>Please try again in a few minutes.</p>",

    "ray_id": '0123456789abcdef',  // if not set, random hex string is shown
    "client_ip": '1.1.1.1',

    // Configuration for 'Performance & security by ...' in the footer
    "perf_sec_by": {
        "text": "Cloudflare",
        "link": "https://www.cloudflare.com/",
    },
}

The AI-Education Death Spiral a.k.a. Let the Kids Cheat

Hacker News
anandsanwal.me
2025-12-10 01:36:56
Comments...
Original Article

This essay first appeared in my newsletter. Sign up here if interested in Unf^cking Education .


Your kid didn’t write their essay last night.

ChatGPT did.

And that might be the most honest thing happening in school today.

They’re copying essays from AI, running them through “humanizing” tools, and handing in work they’ve barely read. They’re having AI listen to lectures so they don’t have to. They’re sneaking AI via their mobile phones into tests.

They’re using ChatGPT for everything from math homework to history essays to college applications.

And they should be.

To be clear, I’m not advocating for AI in real learning. AI is only useful right now as a stress test: it reveals how hollow adolescent work has become. If it pushes schools toward offering work with relevance, impact, and agency and away from hopeless busywork (“When will I ever use this?”), that is a win.


Because AI isn’t the problem.

It’s just a light revealing how fake and pointless school has become.

The Death Spiral Has Already Begun

Walk into any high school classroom. A majority of the work is written by AI.

Everyone knows. Most say nothing.

Teachers pretend to grade.

Students pretend to write.

It’s as much about learning as taking your shoes off at the airport is about security.

Teachers and professors acknowledge it is rampant, but there is little they can do as evidenced by this post.

The author of this post ended it with this humorous conclusion.

So yeah. ChatGPT is my best student now. It hands in perfect work, never complains, and never asks for an extension. And the worst part? I think I like it better.

And as highlighted above, this is “every single paper”, i.e., this isn’t a few bad apples.

Parents who found their daughter cheating on multiple assignments heard:

“Everyone is doing this” and that it’s the only way to stay competitive.

McCabe’s research confirms this: once cheating becomes normalized and the system loses legitimacy, defection becomes the dominant strategy.

This is the classic prisoner’s dilemma.

  • If everyone plays fair, all benefit.
  • But if others cheat and you don’t, you fall behind.
  • So even the “good” students feel forced to cheat just to stay even.

This, however, isn’t a moral collapse.

It’s a design failure.

The real revelation?

AI exposed that a lot of school work isn’t worth the effort.

Maria Montessori said it a century ago:

“The work must be something the child feels is worth doing.”

Schools forgot and flipped that.

They assign work and expect kids to value it merely because it was assigned.

The Predictable Crackdown

Some schools and teachers unhappy with the theater chose not to look the other way and responded exactly as you’d expect.

First came the guilt: “You’re only cheating yourself.”

When that inevitably didn’t work, they escalated to AI detectors that don’t work, forced handwritten essays, laptop bans, surveillance tools.

They made classrooms, places where you’re already told to sit still and do as you’re told, even more prison-like. Not surprisingly, this same strategy is being used at universities, as Princeton University professor D. Graham Burnett reveals in this response on the Hard Fork podcast:

We’re like the sheriffs. And so the concern is all my assignments are now useless. I can’t assign papers. Am I going to have to do a blue book exam?

Their strategies, as you can see, are almost all punitive.

As one lecturer inspiringly put it :

“Catch what you can, no mercy for the damned.”

And then they wonder why students check out even more .

Here’s what they never admit: AI didn’t create the problem. It just revealed it.

The Coming Collapse

Follow the money.

What happens when a 4.0 GPA means nothing because half the work was done by AI?

We’ve seen this before.

During COVID, when school went virtual, parents saw what was really going on.

The result?

Public school enrollment dropped by 1.3 million. States like Oregon and New York lost over 5% of their students.

And it will similarly accelerate when parents realize they’re paying (via taxes or tuition) for education theater and their students are actually learning very little.

Colleges will then quietly start ignoring GPAs .

Employers will stop trusting transcripts.

And when everyone acknowledges that the product is worthless, the economic foundation collapses.

What Survives the Stress Test?

AI is a filter.

It strips away everything that can be automated, leaving only what requires actual thinking: creativity, collaboration, real-world problem-solving.

Deci & Ryan’s research says people engage when they have autonomy, competence, and purpose.

School as we’ve constructed it for hundreds of years kills all three .

But some are adapting.

  • At High Tech High, students tackle real community problems.
  • At Forney ISD in Texas , students run actual businesses inside their school.
  • At the School of Entrepreneuring , students identify and solve real problems on behalf of others while working together.

Boaler’s research confirms this: when work is relevant and challenging, cheating drops dramatically.

Not because it’s harder, but because students actually want to do the work.

We need to move to education that prioritizes engagement (note: school need not be easy or fun; it requires productive struggle).

Let It Burn

AI cheating highlights that much of what passes for education today has no value.

So let AI burn down and reveal how inane this work is.

Let it break the model so we can finally build something better.

Because the students have already figured it out.

The next time a teacher complains about AI cheating, ask: If a machine can do this assignment perfectly, why are you giving it to this student? And then we can replace it with education and work that actually matters.

224× Compression of Llama-70B with Higher Accuracy (Paper and Code)

Hacker News
zenodo.org
2025-12-10 01:25:00
Comments...
Original Article

Published December 10, 2025 | Version v1

Preprint Open

  • 1. Anima Core Inc
  • 2. Shamim Institute of Soul Systems

Description

This paper introduces the first verified method to eliminate transformers from inference while preserving, and in many cases improving, downstream accuracy.

We show that a frozen 70-billion-parameter Llama-3.3-70B model can be replaced by a 256-dimensional meaning field extracted from seven internal activation layers. A lightweight compressor (AN1) reduces these fields by 224× with an average +1.81 percentage point gain across classification tasks, including +3.25 pp on low-resource RTE (R² = 0.98 inverse-scaling fit, p < 0.01). A 30M-parameter student then learns to regenerate these fields directly from raw text, enabling full transformer-free inference at 60× higher throughput with only 0.35 pp average accuracy loss.

The core insight is that task-aligned semantics in modern transformers occupy a remarkably low-rank manifold. Across layers we observe 72–99 percent of variance in the top one to three dimensions. Once this structure is extracted and learned, the transformer becomes unnecessary. It serves as a one-time sculptor of meaning rather than the permanent home of inference.

This work establishes Field Processing Units (FPUs) as a post-transformer compute primitive that replaces deep matrix multiplication with shallow field operations.

All results are averaged over five seeds with statistical significance reported. Ablations isolate the causal contributions of field supervision, geometric regularization, and anchor-layer selection.

This Zenodo release provides the complete scientific manuscript and the baseline reference implementation for the AN1 Core system. Proprietary optimizations (AN1-Turbo) have been removed to support independent verification and further research into post-transformer inference.


Rubio Deletes Calibri as the State Department's Official Typeface

Hacker News
www.nytimes.com
2025-12-10 00:59:05
Comments...
Original Article


Starbucks Workers Are Still Without a Labor Deal Four Years After Their First Union Win. Here’s Why

Portside
portside.org
2025-12-10 00:49:25
Starbucks Workers Are Still Without a Labor Deal Four Years After Their First Union Win. Here’s Why Greg Tue, 12/09/2025 - 19:49 ...
Original Article

August Code works at the first ever Starbucks location to unionize in 2021. But four years after that vote, he and his co-workers in Buffalo, New York, are still waiting on their union contract.

“I would have imagined we would have seen a contract a long time ago,” Code told CNN. “To think we don’t have a contract four years later, yeah, that’s upsetting. I didn’t think we’d be at this point.”

Tuesday is the anniversary of the first union win at Starbucks . The union-organizing campaign there has been one of the biggest successes in the American labor movement in the past few years.

Concerns over working conditions during the pandemic spurred younger workers, who are generally pro-union and make up a large share of Starbucks’ workforce, to unionize. About 560 Starbucks locations have voted for union representation since that first vote four years ago, according to the union Starbucks Workers United. (An additional 90 stores that organized have closed amid a slew of store closings).

But despite the momentum, there is still no labor contract, a key goal of union representation. Contracts can further workers’ voices and improve wages, benefits and other working conditions.

US labor laws can’t help new unions force companies to reach a deal. The laws only require employers to bargain in “good faith,” meaning there are basically no penalties if companies drag out negotiations for years.

Liz Shuler, president of AFL-CIO, told CNN that the lack of a contract at Starbucks after four years is a sign labor laws need changing.

“People want to feel they’ve taken this risk and done it for a reason, and that would be to have a contract,” Shuler said. “I think they’ll get there. But it’s going to take some time because these corporations are able to withstand this kind of effort.”

Similar to Starbucks, other recent high-profile union campaigns haven’t yet reached a first contract.

That includes Amazon warehouse workers in Staten Island, New York, who voted in 2022 to form the tech giant’s first union. And the United Auto Workers union last year won the right to represent workers at a Volkswagen plant in Chattanooga, Tennessee – the first shot at organizing the approximately 150,000 US auto workers employed at nonunion plants.

Both sides blame the other for lack of contract

Companies often show little willingness to meet the unions’ bargaining demands, even after their workers vote for representation.

Amazon doesn’t even recognize the victory at its unionized warehouse, continuing to challenge the results. Rank-and-file members have authorized a strike at the US Volkswagen plant, but no date has been set.

Starbucks regularly argues that its employees don’t need a union since it pays better wages and benefits than many other retailers. The union is seeking wage improvements, better staffing at stores and improved scheduling rules.

Its workers continue to win representation elections. But talks between the union and management have dragged on for so long that many workers who voted in early elections have already left the company.

The two sides appear far apart on any deal, with each blaming the other since mediated talks ended this past spring.

“This company responded in such a way from the onset that we knew it was going to be a fight,” said Michelle Eisen, one of the leaders of that initial union campaign in Buffalo. She has since left the company after 15 years to work for the union.

Starbucks, meanwhile, insists it wants to reach a contract with the union.

“For months, we were at the bargaining table, working in good faith with Workers United and delegates from across the country to reach agreements that make sense for partners and for the long-term success of Starbucks,” Sara Kelly, a top Starbucks executive, told employees in a memo last month.

The union is waging an open-ended strike at about 150 stores that started on November 13 — also known as “Red Cup” day, one of Starbucks’ biggest promotional days every year.

“I truly believe this is the tipping point,” Eisen said. “I’ve never seen workers as fired up as they are right now.”

Starbucks said that the strike did not affect sales that day and that the stores facing strikes are a small fraction of the 10,000 company-owned US stores. Less than 5% of Starbucks’ 240,000 front-line employees are union members.

But a union fight is another headache for Starbucks coming off years of declining sales and following hundreds of store closures in September. North American sales fell 2% over the 12 months ending in late September and would have fallen twice that much if not for increased prices. US tariffs have also boosted the price of coffee, which retails for nearly 19% more than last year, according to the latest government data.

Difficult first contract is the norm

Failing to reach a quick first contract is not unique to Starbucks, or Amazon or Volkswagen.

Only 37% of newly formed unions reach an initial contract within a year, and 48% reach a deal within 18 months, according to ongoing research by Johnnie Kallas, assistant professor of labor studies at the University of Illinois.

The American labor movement is seeking legislation that would help unions win that first contract more quickly.

Senator Josh Hawley, a Republican from Missouri, introduced legislation in March to impose binding arbitration if a newly formed union and the company can’t reach a contract within months.

“Workers are often prevented from enjoying the benefits of the union they voted to form when mega-corporations drag their feet, slow-walk contract negotiations, and try to erode support for the union,” Hawley said in a statement in March.

The bill has widespread Democratic support as well as a few other Republican co-sponsors. But the legislation has so far gone nowhere.

Despite the lack of legislation, Shuler voiced confidence the Starbucks union will eventually get the contract, especially because of the commitment of the union’s younger membership.

“I feel like they’re in it for the long haul,” she said.

Some of the union activists at Starbucks said the lack of a contract has made it easier to organize. That’s because it demonstrates the need for a union to improve conditions.

“It hasn’t slowed down our organizing efforts at all,” said Diego Franco, a striking union barista from Des Plaines, Illinois, and a member of the union’s bargaining committee.

Franco also expressed confidence in a win.

“Eventually, the company is going to cave and we’re going to win the strong contract we’ve been fighting for – whether I’m still around or not,” he said.

Fear of the Walking Zig: The Security Audit Gap

Lobsters
generativeai.pub
2025-12-10 00:41:26
Comments...

Get on the Job and Organize: A Q&A With Jaz Brisack

Portside
portside.org
2025-12-10 00:39:14
Get on the Job and Organize: A Q&A With Jaz Brisack Judy Tue, 12/09/2025 - 19:39 ...
Original Article

Jaz Brisack helped lead the unionization of a Starbucks store in downtown Buffalo, New York | Malik Rainey / The New York Times

After years and years of declining union density in America, the beginning of the 2020s felt like a sea change. Upstart labor campaigns notched huge wins at Amazon, public support for unions reached new heights, and new organizing election petitions with the National Labor Relations Board soared.

Halfway through the decade though, the surge in labor organizing has not managed to slow the decline in Americans represented by a union. Why?

Pulling from experiences at the heart of union campaigns at Nissan, Tesla, and Starbucks, labor organizer Jaz Brisack details how the deck of American labor law is stacked in favor of employers. But that isn’t the only thing holding the labor movement back — Brisack’s experiences also show how major labor unions can derail campaigns with onerous bureaucracy that restricts worker leadership.

The bulk of Get on the Job and Organize: Standing Up for a Better Workplace and a Better World, available here, focuses on Brisack’s time as co-founder of Starbucks Workers United and the campaign to organize cafes born in Buffalo, New York.

Brisack started at Starbucks as a “salt” — someone who gets a job in a workplace with the intent of unionizing it — and continues to train people eager to pursue that path at the Inside Organizer School . But the effort to unionize Starbucks was a salt and pepper campaign, as organic leaders soon emerged among Brisack’s colleagues.

Inequality.org sat down with Brisack earlier this month to discuss their book and the lessons the labor movement can draw from worker-led campaigns.

This interview has been edited for length and clarity.

Chris Mills Rodrigo: One theme across the campaigns you discuss in this book is the importance of winning the “right to organize” before anything else. Could you explain why that’s a prerequisite to workplace issues? How can bosses take advantage of more narrow issue campaigns?

Jaz Brisack: The right to organize almost sounds like a platitude sometimes, but it’s also the very core of what we’re fighting for. I think it sounds like a platitude because we often have unions and politicians giving lip service to the right to organize without actually committing to the fight of what it means to win the right to organize. Throughout labor history a huge piece of every fight for greater labor rights was the right to have democracy in the workplace, to have an independent organization where workers can advocate for themselves. That’s still what’s at the heart of the right to organize on every campaign, whether it’s Nissan, Starbucks or Tesla.

Companies have unilateral control if workers don’t have a union and companies want to maintain that control, so unions basically have to make it more difficult, more painful, more costly for a company to continue to insist on crushing the union rather than deal with sharing power. This also gets into a psychological question of how the management is thinking about this. Some of them are thinking about it as a business equation, but some of them — like Howard Schultz at Starbucks — are thinking about it as a referendum on their own leadership in a very personal way.

Companies will do just about anything rather than give up unilateral control. We’ve gotten the question of “what would you like to improve with a union?” on every campaign and our answer was always we want a voice on the job and we’ll get into this more at the bargaining table. Sometimes you have to talk with co-workers about things you want to change or tell the press about some of the conditions in the workplace. But I think companies will try to find out what workers would like to have changed, whether that’s a bad manager, whether that’s pay issues, whatever it is, and, often, will make any kinds of improvements that they can, short of actually giving workers an independent democratic body — that is, a union.

CMR: What do you think is behind that fear of giving up unilateral control?

JB: It’s really about the desire to control workers. Without a union, management has the final say on everything, and workers have basically no rights to their jobs or to how they want to work. Bosses can fire people for just about any reason, except for a protected reason. But even then, that’s a very hard thing to prove, and management can easily claim it’s for something else. This desire to have full power over workers and over the company is extremely motivating to corporations.

I think there’s secondary fears about whether the union will allow for flexibility, will the company be as profitable, etc. But often companies will spend more to crush the union than they would spend to actually recognize and negotiate a first contract. And so I think this question of control is the piece that’s really at the heart of the matter.

CMR: Does the current state of the NLRB — which was already rife with delays and heavily tilted to employers before the second Trump term — change the calculus around the right to organize or how you would approach a new union campaign?

JB: I think barely, if at all. Maybe it’s made me a little bit more open to card check , but I think that’s almost a semantics question. We’ve asked all of these companies, from Nissan to Starbucks to Ben and Jerry’s, etc., to sign the Fair Election Principles. The reason for doing so is that having an election is kind of psychologically considered even more of a gold standard, including by workers. The NLRB has historically been much better at administering elections than it has been at enforcing any other part of labor law. So having an election with neutrality, or with equal time for a union campaign, is much preferable to having a card check scenario where the company is fighting.

But I think the Starbucks campaign shows the limitations of even a “favorable” NLRB. Starbucks broke the law hundreds of times in each city, thousands of times across the country. I was fired in 2022; workers across the country are still awaiting reinstatement, awaiting back pay. Starbucks basically had no incentive not to break the law, and in fact, had every incentive to break the law and deal with the consequences later. Even at the best of times, the NLRB isn’t really sufficient to protect the right to organize. The laws are very weak. There’s no penalties, there’s only remedies.

I think winning the right to organize is not something that we’re going to get through the law. It’s something that we have to get through the court of public opinion, through consumer pressure. Exactly what the “hammer” is depends on the company and on their business model, etc. At Starbucks, we were pretty convinced it was a boycott and I think that’s just more true than ever, as the NLRB is either undermined or actively hostile.

CMR: That’s a perfect segue, because I wanted to talk about hammers next and the importance of having them to compel companies to the bargaining table. What about American labor unions makes them so reticent to even threaten the use of hammers?

JB: Million dollar question, if we could fix this one we would have a very different state of union density in the US. I think partly it’s that it’s very hard to actually commit to. It’s easy to call for a boycott on social media, and sometimes you get lucky with those like the solidarity with Palestine boycott of Starbucks, which I think exceeded our wildest expectations. But to really do a consumer boycott of a company you actually need to have people outside of stores, picket captains, resources. I think the Teamsters at Chipotle could have easily brought Chipotle down with a real consumer campaign. The UAW at Nissan — when Richard and his crew were testing the impact of student and church group pickets outside of Nissan dealerships the results were very encouraging, people were not buying cars once they knew what was going on. That could have potentially won us the right to organize at Nissan.

It’s partly a resource question and then partly a strategy question with Starbucks. Starbucks has 10,000 stores, you would have needed a presence at a huge number of those stores to ensure and force a financial impact.

Workers United is affiliated with SEIU, and there were different schools of thought within Workers United and SEIU about how you should even go about organizing: through NLRB elections and contracts versus a much more kind of legislative reform, wage based Fight for 15 style model. SEIU has tried to jump through hoops to reconcile the Starbucks campaign with their approach at Waffle House or McDonalds where they refuse to file for elections and are instead doing days of action and pressure campaigns. I think there’s also a fear of the very grassroots nature of these union campaigns where there’s been struggles over how much control workers would actually have of the campaigns versus decision makers within the union.

CMR: Unionized Starbucks workers went on strike starting November 13th, Red Cup Day, and more have joined since. How did we get to this point?

JB: This is really an extension of one of the Starbucks campaign strategy camp’s tactics of having national strikes on a consistent basis. The union and the company probably haven’t been that far off on a deal. We knew a while back that they really only had an impasse on economic issues. Everything else was mostly done. I think the real question is: could there be a deal on a contract that would actually win the right to organize at all of the stores versus kind of limiting the union to a minority? And I think that remains to be seen.

The campaign is still remarkably resilient. A lot of the workers who are on strike now are folks who came into the movement later and who are amazing leaders. It’s really impressive how much people have been able to withstand.

With this strike there’s been sort of a tentative call for a boycott with figures like Zohran saying don’t go to Starbucks. We should have had that energy in 2022 when Starbucks fired the Memphis Seven . Now there’s been sort of on-again, off-again momentum. Starbucks is not in the public eye the way we were in 2022 and the union’s hesitation around endorsing the Palestine boycott was definitely a cause of some lost momentum. But I think, better late than never, hopefully there is a contract, and then things can always improve from there.

CMR: What was the internal discussion like over calling for a boycott back then?

JB: In retrospect we should have just done it. The core organizing committee, which was still largely a Buffalo group but was expanding to other parts of the country, were the ones writing all the press releases, doing social media, and kind of controlling the public narrative. I think the decision making at that point was a bit muddy, SEIU wasn’t really in the picture yet, it was Workers United leadership and they were very hesitant that it’d be hard to enforce a boycott, that it might not work.

There was also a school of thought that striking was our only form of worker power and that boycotting was sort of a cop-out or not as militant of a strategy as striking over these issues. And then I think there was also a fear that it would hurt the organizing effort at other stores, which I would argue it did anyway, but we didn’t actually have the leverage that a boycott would have given us.

CMR: I was very fascinated with the internal conflict you seem to have had over organizing at Starbucks. As a campaign framing it makes sense to say you want the company to be better, even though in the back of your mind you would prefer there not to be a corporate behemoth dominating local cafes. How do you work through this tension?

JB: I had to wrestle with this a lot. Basically the way I approach it is: we’re unlikely to put Starbucks out of business. A couple of times it actually did seem more likely that Starbucks might go out of business rather than respect the right to organize. But I don’t think there’s any world in which Starbucks actually goes under. Like Walmart’s market share is now being threatened by Amazon, but it’s not like Walmart is going away. Starbucks may lose some of its ubiquity, but it is still too big of a player to fully get rid of.

My personal way of reconciling the positive rhetoric was just focusing on things I really did like about the job. I was an opener, the camaraderie with my coworkers was really great, the kind of people who are attracted to working at Starbucks are really great. I think it’s kind of true with baristas anywhere, but certainly among Starbucks workers there was just an immediate understanding and an immediate jargon that was universally shared and made for very easy, immediate bonding.

And then a lot of customers aren’t great, but there were fun parts of the job. I actually really enjoy coffee. Starbucks, unfortunately, doesn’t really allow you to do that much of the craft part. But we would find ways to learn about coffee and make good coffee by bringing in beans ourselves. So I was able to find ways not to lie when I was talking about loving my job.

We can’t fight all of the battles all at once. We have to find stepping stones to getting where we want. And the first step has to be really changing union density. One worker asked me very early on, I can’t remember if this is in the book, would Starbucks take us seriously if we said we loved the company instead of, you know, we want to overthrow capitalism. I was like, don’t worry, they will see the word union and they will understand what that means, because that’s what actually is threatening to them.

CMR: In the same vein, as a kind of wrap up question, what would you hope that other unions can learn from the Starbucks campaign, which remains one of the most exciting in recent memory?

JB: I would say it underscores the importance of calling the question on the right to organize very early. We were talking about doing this before we even voted in the first elections, and we decided to wait long enough to have a concrete election victory that we could then point to and say the workers have spoken and they want a union. I wouldn’t wait much longer than that. You don’t gain a lot of momentum over a longer period of time, and we have to get better as a labor movement about really acting as one big union. We had unions that were offering to adopt stores and take on picketing at various locations, and there was never really a willingness to even ask them to put that into practice.

The other main piece is that unions need to be less worried about control and more worried about throwing things at the wall and seeing what sticks. We are dying as a labor movement. We have existential threats from the government, from corporations, from all of the typical factors, but it all seems even more ramped up these days. So we need to be less worried about whether Tesla workers are organizing jurisdictionally with the right union, or whether Starbucks workers might somehow say the wrong thing at the bargaining table if they’re allowed to bargain at hundreds of stores simultaneously, or whether Chipotle workers should be allowed to do a national pressure campaign.

Let workers take autonomy in their own campaigns and see what works.

===

10 Years of Let's Encrypt

Simon Willison
simonwillison.net
2025-12-10 00:34:15
10 Years of Let's Encrypt Internet Security Research Group co-founder and Executive Director Josh Aas: On September 14, 2015, our first publicly-trusted certificate went live. [...] Today, Let’s Encrypt is the largest certificate authority in the world in terms of certificates issued, the ACME...
Original Article

10 Years of Let's Encrypt (via) Internet Security Research Group co-founder and Executive Director Josh Aas:

On September 14, 2015, our first publicly-trusted certificate went live. [...] Today, Let’s Encrypt is the largest certificate authority in the world in terms of certificates issued, the ACME protocol we helped create and standardize is integrated throughout the server ecosystem, and we’ve become a household name among system administrators. We’re closing in on protecting one billion web sites.

Their growth rate and numbers are wild:

In March 2016, we issued our one millionth certificate. Just two years later, in September 2018, we were issuing a million certificates every day. In 2020 we reached a billion total certificates issued and as of late 2025 we’re frequently issuing ten million certificates per day.

According to their stats the amount of Firefox traffic protected by HTTPS doubled from 39% at the start of 2016 to ~80% today. I think it's difficult to over-estimate the impact Let's Encrypt has had on the security of the web.

Rubio stages font coup: Times New Roman ousts Calibri

Hacker News
www.reuters.com
2025-12-10 00:08:34
Comments...

Six Myths About Rural America: How Conventional Wisdom Gets It Wrong

Portside
portside.org
2025-12-10 00:02:14
Six Myths About Rural America: How Conventional Wisdom Gets It Wrong Judy Tue, 12/09/2025 - 19:02 ...
Original Article

Dusk in downtown Lumberton, county seat in Robeson County, N.C., the most diverse rural county in America. | AP Photo/David Goldman

Roughly 1 in 5 Americans live in rural areas – places the federal government defines based on small populations and low housing density.

Yet many people understand rural America through stereotypes. Media and political conversations often use words or terms such as “fading,” “white,” “farming,” “traditional” and “politically uniform” to describe rural communities.

In reality, rural communities are far more varied. Getting these facts right matters because public debates, policies and resources – including money for programs – often rely on these assumptions, and misunderstandings can leave real needs neglected.

We are rural demographers at Louisiana State University and Syracuse University who study the causes and consequences of well-being in rural America. Here we outline six myths about rural America – a few among many – highlighted in our recent book “ Rural and Small-Town America: Context, Composition, and Complexities .”

Myth 1: Rural America is disappearing due to depopulation

Many people think rural America is emptying out. The story is more complicated. It’s true that from 2010 to 2020 most rural counties lost population. But about one-third grew, especially those near cities or those with lakes, mountains and other natural attractions. And there have been times, like in the 1970s and 1990s, when rural populations grew faster than cities – periods called “rural rebounds.”

An important thing to know about rural population change is that the places defined as “rural” change over time. When a rural town grows enough, the U.S. Office of Management and Budget reclassifies it as “urban.” In other words, rural America isn’t disappearing – it’s changing and sometimes urbanizing.

Myth 2: Most rural Americans live on farms

Farming is still important in many rural places, but it’s no longer the way most rural Americans make a living. Today, roughly 6% of rural jobs are in agriculture. And most farm families also have members who work off-farm jobs, often for access to health insurance and retirement benefits.

A bigger source of employment in rural America is manufacturing. In fact, manufacturing plays a larger role as a share of jobs and earnings in rural areas than in cities. That also means that deindustrialization – steady job losses in manufacturing over the decades – has been especially painful in rural America. Unlike large cities with lots of employers, rural communities rely on just a few. When a rural plant or factory closes, the local impacts are often devastating.

The largest share of rural jobs today is in service-sector work, such as retail, food service, home health care and hospitality. These jobs often pay low wages, offer few benefits and have unstable hours, making it harder for many rural families to stay financially secure.

Myth 3: Only white people live in rural America

People often picture rural America as mostly white, but that’s not the full story. About 1 in 4 rural residents are nonwhite. Hispanic and Black people make up the largest shares, and Indigenous people have a greater portion of their population living in rural areas than any other racial group.

Rural America is also getting more racially and ethnically diverse every year. Young people are leading that change: About 1 in 3 rural children are nonwhite. The future of rural America is racially diverse, even if popular images don’t always show it.

Myth 4: Rural America is healthier than urban America

Many people imagine rural life as healthier than city life. But the opposite is true. People in rural areas die younger and at higher rates than people in cities. Scholars call this the “rural mortality penalty,” and it has been widening for years. The COVID-19 pandemic made the gap even larger due to higher death rates in rural communities.

This isn’t just because rural areas have more older people. Rural working-age people, ages 25 to 64, are dying younger than their urban peers, and the gap is growing. This trend is being driven by nearly all major causes of death. Rural residents have higher rates of early death from cancers, heart disease, COVID-19, motor vehicle crashes, suicide, alcohol misuse, diabetes, stroke and pregnancy-related complications.

Myth 5: Rural families are more traditional than urban families

Images of rural life often evoke households in which married couples are raising children in traditional family structures. Historically, rural children were more likely to live with married parents. But that’s no longer the case.

Today, rural children are less likely than urban children to live with married parents and are more likely to live with cohabiting unmarried parents or in the care of grandparents or other relatives. Partly as a result, rural child poverty rates are higher than urban rates, and many rural families rely on safety-net supports such as the food aid program SNAP. Rural families are diverse, and many are economically vulnerable.

Myth 6: A new ‘rural revolt’ gave Donald Trump his presidential victories

Many rural voters have supported Donald Trump, but this didn’t happen overnight.

For much of the 20th century, Democrats drew major support from rural areas due to the party’s alignment with the working class and 100 years of single-party rule in the South spanning Reconstruction to the civil rights era.

However, social class and regional flips in voting patterns have meant rural voters have been shifting toward Republicans for nearly 50 years. The last time rural and urban residents voted within 1 percentage point of each other was in 1976, when Georgia peanut farmer and former governor Jimmy Carter was elected.

The partisan gap between rural and urban voters averaged 3 percentage points in the 1980s and 1990s, before growing to 10 percentage points in the 2000s and 20 percentage points in recent cycles. So, Trump’s support in rural America was not a new “revolt” but part of a long-term trend.

And in 2024, the key geographic story wasn’t rural voters at all – it was the sharp drop in turnout in big cities. Both candidates got fewer urban votes than in 2020, with Kamala Harris capturing over 10 million fewer votes in major and medium-sized cities than Joe Biden had four years earlier.

Share of votes for the Republican presidential candidate in rural and urban counties, 1976-2024. Rural counties are nonmetropolitan and urban counties are metropolitan based on 2013 definitions. Excludes Alaska because it reports election results for election districts rather than counties.

===

Tim Slack Professor of Sociology, Louisiana State University

===

Devstral 2

Simon Willison
simonwillison.net
2025-12-09 23:58:27
Devstral 2 Two new models from Mistral today: Devstral 2 and Devstral Small 2 - both focused on powering coding agents such as Mistral's newly released Mistral Vibe which I wrote about earlier today. Devstral 2: SOTA open model for code agents with a fraction of the parameters of its competitors a...
Original Article

Devstral 2. Two new models from Mistral today: Devstral 2 and Devstral Small 2 - both focused on powering coding agents such as Mistral's newly released Mistral Vibe which I wrote about earlier today.

  • Devstral 2: SOTA open model for code agents with a fraction of the parameters of its competitors and achieving 72.2% on SWE-bench Verified.
  • Up to 7x more cost-efficient than Claude Sonnet at real-world tasks.

Devstral 2 is a 123B model released under a janky license - it's "modified MIT" where the modification is:

You are not authorized to exercise any rights under this license if the global consolidated monthly revenue of your company (or that of your employer) exceeds $20 million (or its equivalent in another currency) for the preceding month. This restriction in (b) applies to the Model and any derivatives, modifications, or combined works based on it, whether provided by Mistral AI or by a third party. [...]

Devstral Small 2 is under a proper Apache 2 license with no weird strings attached. It's a 24B model which is 51.6GB on Hugging Face and should quantize to significantly less.
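
To put "significantly less" in rough numbers, here's a quick back-of-envelope sketch (my own estimate, not figures from Mistral): the published 51.6GB works out to roughly 2 bytes per parameter (bf16 weights plus some overhead), and common quantizations scale roughly linearly with bits per weight:

# Rough on-disk size estimates for a 24B-parameter model at different precisions.
# Ignores quantization metadata, embedding overhead, etc. - ballpark only.
params = 24e9

def approx_size_gb(bits_per_weight):
    return params * bits_per_weight / 8 / 1e9

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: ~{approx_size_gb(bits):.0f} GB")

# bf16: ~48 GB, int8: ~24 GB, 4-bit: ~12 GB

So a 4-bit build would land somewhere around a dozen gigabytes, which would put it within reach of a lot of local hardware.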

I tried out the larger model via my llm-mistral plugin like this:

llm install llm-mistral
llm mistral refresh
llm -m mistral/devstral-2512 "Generate an SVG of a pelican riding a bicycle"

Bicycle looks a bit like a cybertruck

For a ~120B model that one is pretty good!

Here's the same prompt with -m mistral/labs-devstral-small-2512 for the API hosted version of Devstral Small 2:

A small white pelican on what looks more like a child's cart.

Again, a decent result given the small parameter size. For comparison, here's what I got for the 24B Mistral Small 3.2 earlier this year.

The 2024 Free Software Awards winners

Linux Weekly News
lwn.net
2025-12-09 23:55:40
The Free Software Foundation has announced the recipients of its 2024 (even though 2025 is almost over) Free Software Awards. Andy Wingo won the award for the advancement of free software, Alx Sa is the outstanding new free-software contributor, and Govdirectory takes the award for projects of soci...
Original Article

[Posted December 9, 2025 by corbet]

The Free Software Foundation has announced the recipients of its 2024 (even though 2025 is almost over) Free Software Awards. Andy Wingo won the award for the advancement of free software, Alx Sa is the outstanding new free-software contributor, and Govdirectory takes the award for projects of social benefit.


US International Trade Administration Shaped EU Censorship Against US Companies

Hacker News
foundationforfreedomonline.com
2025-12-09 23:53:20
Comments...
Original Article

SUMMARY

  • X became the first American platform to be fined under the EU’s Digital Services Act, receiving a €120 million penalty after allegedly refusing to open up its data to “disinformation researchers.”
  • Disinformation researchers are critical to online censorship – they are the ones who compile databases of disfavored speech and the advertisers that fund it.
  • Without access to online platforms’ data, the international censorship machine is blind.
  • The EU’s Digital Services Act and its provisions mandating data access for researchers emerged with the full support and cooperation of the US government under the previous administration.
  • 23 US-funded “counter-disinformation” organizations are involved in the EU’s censorship regime, representing $15,444,695 in US taxpayer funding. Many of these organizations will receive access to X’s data if EU bureaucrats successfully pressure the platform.
  • Documents reviewed by FFO also expose the central role of the US Trade Representative and the US International Trade Administration at the Department of Commerce.
  • Both collaborated with the EU under the previous administration, through the US-EU Trade and Technology Council, which developed a shared list of policy priorities that were later enshrined in the Digital Services Act.
  • These include the DSA’s provisions on data access for researchers that are now being used to target X.

The European Commission announced on December 4 that it will fine Elon Musk’s X €120 million (approx. $140 million) for non-compliance with the Digital Services Act (DSA), the EU’s draconian online censorship regime. The Commission has given X 60 days to provide it with a compliance plan, at the risk of “periodic penalty payments.”

X was the first platform to be investigated under the DSA. Immediately upon the censorship law going into force last year, the EU’s ruling commission of unelected bureaucrats used its new powers to investigate Musk’s platform, after months of saber-rattling from European officials that began shortly after Musk’s takeover of the company and concurrent promise to roll back its censorship regime.

The €120 million fine is not the end of the matter. If X ignores the Commission’s demands, the EU can impose periodic penalties up to 5% of a company’s average daily worldwide turnover for each day of continued non-compliance.

While this censorship attack appears to be coming solely from Europe, the real picture is more complicated. The Digital Services Act did not emerge in a vacuum – it was the product of a transatlantic US-EU censorship partnership that reached its apex under the Biden administration.

This partnership, run out of the International Trade Administration at the Department of Commerce, and the office of the U.S. Trade Representative at the White House , developed a shared US-EU strategy on containing “harmful” online content.

This strategy included forcing tech platforms to open themselves up to “disinformation researchers” who can identify both disfavored online speech and online advertisers for potential boycott operations. Failure to allow these (often state-funded) “researchers” free access to their data is precisely why the EU has fined X.

In more ways than one, the European Commission is acting as a government-in-exile for the US federal censorship state that enjoyed the support of the previous administration, only to be defunded and expunged by the current one .

Transatlantic Censorship: The US-EU Trade and Technology Council

The roots of the EU’s censorship campaign against X can be found in the US-EU Trade and Technology Council, a four-year liaison between the architects of censorship on two continents that began in the first year of the Biden administration.

In its inaugural joint statement , the US and EU announced 10 “working groups” devoted to a wide range of digital policy. One of these, Working Group 5, specifically sought to develop “shared approaches” to censoring disfavored online content.

From the inaugural statement:

Working Group 5 – Data Governance and Technology Platforms: The Data Governance and Technology Platforms working group is tasked to exchange information on our respective approaches to data governance and technology platform governance, seeking consistency and interoperability where feasible. We intend to exchange information and views regarding current and future regulations in both the United States and European Union with a goal of effectively addressing shared concerns, while respecting the full regulatory autonomy of the United States and European Union. We have identified common issues of concern around: illegal and harmful content and their algorithmic amplification, transparency, and access to platforms’ data for researchers as well as the democratic responsibility of online intermediaries. We have also identified a shared interest in using voluntary and multi-stakeholder initiatives to complement regulatory approaches in some areas. We are committed to transatlantic cooperation regarding platform policies that focus on disinformation, product safety, counterfeit products, and other harmful content. We plan to engage with platform companies to improve researchers’ access to data generated by platforms, in order to better understand and be able to address systemic risks linked to how content spreads online. We also plan to engage in a discussion on effective measures to appropriately address the power of online platforms and ensure effective competition and contestable markets. The working group is also tasked to discuss, alongside other working groups, common approaches on the role of cloud infrastructure and services.

In addition to stating outright that the previous administration “shared concerns” with the European Union about the spread of disfavored online speech, the inaugural statement specifically highlights the importance of making sure “researchers” can access platforms’ data.

This is a critical point. “Disinformation researchers” are the eyes and ears of the global censorship regime. They are the ones who compile lists of disfavored users, posts, and online narratives that are then passed on to content moderation departments for censorship.

Without access to data from social media platforms at scale, disinformation researchers – and the global censorship machine that relies on them – are blind.

Guaranteeing this access was so important to the US-EU Trade and Technology Council that in each of the years it published reports, it was mentioned as a priority.

In 2022, the US-EU body reiterated that “researchers” are essential to understanding online risks, “particularly those related to illegal content and harmful content,” and expressed concern that data access was dependent on “voluntary” mechanisms established by tech platforms:

In 2023, the US-EU council gave the issue of data access for researchers equal billing to the protection of children as a policy priority:

In the same publication, the US and EU expressed dissatisfaction with the fact that tech platforms were not bound by law to open up their data to the “disinformation researchers” of the censorship industry.

The joint statement specifically connects this to priorities like “information integrity” and “election integrity” – the pretexts used to censor virtually every prominent Trump supporter ahead of the 2020 election, as well as President Trump himself, in 2021.

The statement also said that data access for researchers was critical to analyzing “disproportionate impacts on vulnerable, marginalized, or underrepresented communities.” This is the hate speech pretext — another common bureaucratic justification to shut down political speech online. And the United States under the previous administration, despite the First Amendment, gave its full support to the EU’s approach.

And finally, in 2024 – an entire standalone report devoted to the topic, titled “Mechanisms for Researcher Access to Online Platform Data.” The report compared the various methods of data access from the platforms, and noted (with US approval) that the Digital Services Act mandated access to both data and ad repositories.

This is a critical point, because DSA Articles 40.12 and 39 are exactly the provisions that the European Commission has accused X of violating. The previous administration directly supported the same provisions that led to a €120 million fine against an American company.

From the European Commission’s announcement of the €120M fine against X:

The current administration may have ended domestic government support for censorship with Executive Order 14149 , but European bureaucrats are still dutifully carrying out the priorities of the transatlantic censorship regime the order sought to abolish.

The “Civil Society” Swarm

The direct censorship collaboration between the US and EU via the previous administration’s Trade and Technology Council exposes a deep level of transatlantic coordination in the establishment of the DSA.

But there are also a great deal of indirect links. As FFO previously revealed , 23 US-funded organizations – a mixture of NGOs, university research departments, and private companies – are involved in the EU’s censorship regime.

Some were signatories to the EU’s code of practice on disinformation, while others are directly enforcing the DSA through participation in the EU’s network of “digital observatories,” which monitor online speech at scale. These are the same digital observatories that the EU hopes to force X to open itself up to.

The list of US-funded participants in the “digital observatories” is as follows:

Newsguard

The infamous private blacklisting service received $750,000 from the Department of Defense in 2021 . Its close connections to the old foreign policy blob don’t end there — until December of last year, its public list of advisors (now scrubbed from the website) included former NSA and CIA director Michael Hayden, former NATO head Anders Fogh Rasmussen, former DHS secretary Tom Ridge, and former State Department undersecretary Richard Stengel.

Bellingcat

Widely praised by the US intelligence community, Bellingcat is a journalism and “open source intelligence” outfit that aims to provide actionable intelligence in a public way. The organization has received over $115,000 in funding from NED.

The University of Tartu, Estonia

Received a $400,000 award from the US Department of State for an “advanced study program in combating disinformation to improve democratic resilience.”

Vytautas Magnus University, Lithuania

Received $10,250 from the U.S. Department of State to host a “global media and information literacy week.”

Funky Citizens, Romania

A Romanian “anti-disinformation” and “civic fitness” nonprofit that received $161,822 across five separate grants from the State Department , including to “strengthen the communication strategies of Romanian NGOs ahead of the 2024 election.” The results of that election were nullified by Romanian courts – apparently with EU support, as boasted by former EU commissioner Thierry Breton . The pretext for the nullification, which prevented the accession of a right-wing populist government, was an alleged Russian social media campaign.

Fundatia Centrul / Centrul Pentru Jurnalism Independent, Moldova

This journalism NGO in Moldova received $918,709 from the State Department across seven grants , more than $500,000 of which was concentrated in two grants in 2022 and 2023. These include grants to fund local journalists and  “social media monitoring.”

Seesame PR, Slovakia

This Slovakia-based PR firm was paid $149,924 by the State Department across two grants , including $125,000 for a PR campaign to “strengthen trust in freedom and democracy” in 2019.

Vrije University, Belgium

Vrije has received over $1.5 million in grants and awards from the U.S. government , including the US Department of Defense, the US Department of State, and the Environmental Protection Agency. While most of this funding is for projects unrelated to censorship, one $50,000 grant from the State Department was for “empowering youth” in Belgium, Finland, and Turkey to “counter disinformation.”

Cyprus University of Technology

The Cyprus University of Technology has received $316,765 from the State Department , including a number of grants for “misinformation” and “media literacy” research.

EU DisinfoLab, Belgium

Purpose-built for combating “disinformation,” this NGO received $15,000 from the State Department to implement its project in Latvia.

Verificat, Spain

A “fact checker” aimed at combating disinformation in Spain’s Catalonia region, Verificat received $11,000 from the State Department to host a workshop on disinformation.

NewsWhip

A social media analytics company, NewsWhip received part of a $317,268 award (that could pay up to $866,919 by its end date in 2028) from the State Department for a variety of electronic services.

Austria Presse Agentur (APA)

The national news agency of Austria, APA has received $305,874 in subscription revenues from the State Department since 2016.

Agence France Presse (AFP)

Based in France, AFP is the world’s oldest news agency, with over $300 million in annual revenues and the 27th-most visited news site in the world. AFP received $9,914,902 from the U.S. government , mainly in the form of subscriptions. The bulk of this ($9.14m) came from the U.S. Agency for Global Media. It also received $351,592 from the Department of Defense, $150,808 from the Department of State, and $279,255 from USAID. Of all the organizations involved in the EDMO disinformation-monitoring hubs, AFP has the most involvement, acting as an observer in eight of the fourteen “digital media observatories.”

The Open Society European Policy Institute

As citizen journalists on X have uncovered, George Soros’ influence operation in Europe, the Open Society European Policy Institute, has held multiple meetings with EU officials to discuss “disinformation,” including meetings that directly address the Digital Services Act.

From the European Commission’s public register of meetings:

The deep, decades-long collaboration between George Soros and the US foreign policy state is well-documented , including direct coordination with USAID in the 1990s to train emerging political leaders in central and eastern Europe.

The picture is clear: the European Union is not acting alone, and the Digital Services Act is not a purely EU creation. Until the current administration, it had the full support of the permanent bureaucracy in Washington DC, as well as the global network of US taxpayer funded “civil society” organizations that now hope to use the DSA as a battering ram to gain access to X’s data.

The previous administration may be gone, but its censorship machine lives on in the EU.

Under the hood of Canada Spends with Brendan Samek

Simon Willison
simonwillison.net
2025-12-09 23:52:05
I talked to Brendan Samek about Canada Spends, a project from Build Canada that makes Canadian government financial data accessible and explorable using a combination of Datasette, a neat custom frontend, Ruby ingestion scripts, sqlite-utils and pieces of LLM-powered PDF extraction. Here's the video...
Original Article

9th December 2025

I talked to Brendan Samek about Canada Spends, a project from Build Canada that makes Canadian government financial data accessible and explorable using a combination of Datasette, a neat custom frontend, Ruby ingestion scripts, sqlite-utils and pieces of LLM-powered PDF extraction.

Here’s the video on YouTube.

Sections within that video:

  • 02:57 Data sources and the PDF problem
  • 05:51 Crowdsourcing financial data across Canada
  • 07:27 Datasette demo: Search and facets
  • 12:33 Behind the scenes: Ingestion code
  • 17:24 Data quality horror stories
  • 20:46 Using Gemini to extract PDF data
  • 25:24 Why SQLite is perfect for data distribution

Build Canada and Canada Spends

Build Canada is a volunteer-driven non-profit that launched in February 2025—here’s some background information on the organization, which has a strong pro-entrepreneurship and pro-technology angle.

Canada Spends is their project to make Canadian government financial data more accessible and explorable. It includes a tax sources and sinks visualizer and a searchable database of government contracts, plus a collection of tools covering financial data from different levels of government.

Datasette for data exploration

The project maintains a Datasette instance at api.canadasbilding.com containing the data they have gathered and processed from multiple data sources—currently more than 2 million rows plus a combined search index across a denormalized copy of that data.

Screenshot of the Datasette UI for the canada-spends database, listing each table (with its columns) and row count:

  • aggregated-contracts-under-10k: 487 rows
  • cihr_grants: 53,420 rows
  • contracts-over-10k: 1,172,575 rows
  • global_affairs_grants: 2,378 rows
  • nserc_grants: 701,310 rows
  • sshrc_grants: 213,085 rows
  • transfers: 357,797 rows

Download SQLite DB: canada-spends.db (2.4 GB). Powered by Datasette · Queries took 24.733ms
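
Since Datasette publishes a JSON API alongside its HTML pages, the same data can be pulled programmatically. Here's a minimal sketch of what that might look like, assuming the instance exposes Datasette's standard .json endpoints for the canada-spends database and the contracts-over-10k table above (the column names vendor_name and contract_value come from that table's schema; I haven't verified this exact URL shape against the live deployment):

import json
from urllib.request import urlopen

# Fetch the first five contract rows as JSON objects.
# _shape=objects asks Datasette to return each row as a dict keyed by column name.
url = (
    "https://api.canadasbilding.com/canada-spends/contracts-over-10k.json"
    "?_size=5&_shape=objects"
)

with urlopen(url) as response:
    data = json.load(response)

for row in data["rows"]:
    print(row["vendor_name"], row["contract_value"])

If full-text search is configured for the table, a _search= parameter can be added to the same URL to query the combined search index mentioned above.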

Processing PDFs

The highest quality government financial data comes from the audited financial statements that every Canadian government department is required to publish. As is so often the case with government data, these are usually published as PDFs.

Brendan has been using Gemini to help extract data from those PDFs. Since this is accounting data the numbers can be summed and cross-checked to help validate the LLM didn’t make any obvious mistakes.
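
That cross-check is simple to automate. The snippet below is my own illustration of the idea described above (not code from the project): sum the line items the model extracted and compare them against the total the statement itself reports, flagging any mismatch for manual review.

def validate_statement(line_items, reported_total, tolerance=0.5):
    """Return (ok, extracted_sum) for a set of extracted accounting line items.

    line_items: list of (label, amount) pairs pulled from the PDF by the LLM.
    reported_total: the total figure printed on the statement itself.
    tolerance: allowed rounding slack, in the same units as the amounts.
    """
    extracted_sum = sum(amount for _, amount in line_items)
    return abs(extracted_sum - reported_total) <= tolerance, extracted_sum

# Hypothetical example: three extracted line items and the statement's own total.
items = [
    ("Salaries and benefits", 1_240_000),
    ("Transfers", 830_500),
    ("Amortization", 92_750),
]
ok, total = validate_statement(items, reported_total=2_163_250)
if not ok:
    print(f"Mismatch: extracted {total:,.0f}; send this document back for manual review")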

Further reading

Age Verification Is Coming For the Internet. We Built You a Resource Hub to Fight Back.

Electronic Frontier Foundation
www.eff.org
2025-12-09 23:48:55
Age verification laws are proliferating fast across the United States and around the world, creating a dangerous and confusing tangle of rules about what we’re all allowed to see and do online. Though these mandates claim to protect children, in practice they create harmful censorship and surveillan...
Original Article

Age verification laws are proliferating fast across the United States and around the world, creating a dangerous and confusing tangle of rules about what we’re all allowed to see and do online. Though these mandates claim to protect children, in practice they create harmful censorship and surveillance regimes that put everyone—adults and young people alike—at risk.

The term “age verification” is colloquially used to describe a wide range of age assurance technologies, from age verification systems that force you to upload government ID, to age estimation tools that scan your face, to systems that infer your age by making you share personal data. While different laws call for different methods, one thing remains constant: every method out there collects your sensitive, personal information and creates barriers to accessing the internet. We refer to all of these requirements as age verification, age assurance, or age-gating.

If you’re feeling overwhelmed by this onslaught of laws and the invasive technologies behind them, you’re not alone. It’s a lot. But understanding how these mandates work and who they harm is critical to keeping yourself and your loved ones safe online. Age verification is lurking around every corner these days, so we must fight back to protect the internet that we know and love.

That’s why today, we’re launching EFF’s Age Verification Resource Hub (EFF.org/Age): a one-stop shop to understand what these laws actually do, what’s at stake, why EFF opposes all forms of age verification, how to protect yourself, and how to join the fight for a free, open, private, and yes—safe—internet.

Why Age Verification Mandates Are a Problem

In the U.S., more than half of all states have now passed laws imposing age-verification requirements on online platforms. Congress is considering even more at the federal level, with a recent House hearing weighing nineteen distinct proposals relating to young people’s online safety—some sweeping, some contradictory, and each one more drastic and draconian than the last.

We all want young people to be safe online. However, age verification is not the silver bullet that lawmakers want you to think it is.

The rest of the world is moving in the same direction. We saw the UK’s Online Safety Act go into effect this summer, Australia’s new law barring access to social media for anyone under 16 goes live today, and a slew of other countries are currently considering similar restrictions.

We all want young people to be safe online. However, age verification is not the silver bullet that lawmakers want you to think it is. In fact, age-gating mandates will do more harm than good especially for the young people they claim to protect. They undermine the fundamental speech rights of adults and young people alike; create new barriers to accessing vibrant, lawful, even life-saving content; and needlessly jeopardize all internet users’ privacy, anonymity, and security.

If legislators want to meaningfully improve online safety, they should pass a strong, comprehensive federal privacy law instead of building new systems of surveillance, censorship, and exclusion.

What’s Inside the Resource Hub

Our new hub is built to answer the questions we hear from users every day, such as:

  • How do age verification laws actually work?
  • What’s the difference between age verification, age estimation, age assurance, and all the other confusing technical terms I’m hearing?
  • What’s at stake for me, and who else is harmed by these systems?
  • How can I keep myself, my family, and my community safe as these laws continue to roll out?
  • What can I do to fight back?
  • And if not age verification, what else can we do to protect the online safety of our young people?

Head over to EFF.org/Age to explore our explainers, user-friendly guides, technical breakdowns, and advocacy tools—all indexed in the sidebar for easy browsing. And today is just the start, so keep checking back over the next several weeks as we continue to build out the site with new resources and answers to more of your questions on all things age verification.

Join Us: Reddit AMA & EFFecting Change Livestream Events

To celebrate the launch of EFF.org/Age, and to hear directly from you how we can be most helpful in this fight, we’re hosting two exciting events:

1. Reddit AMA on r/privacy

Next week, our team of EFF activists, technologists, and lawyers will be hanging out over on Reddit’s r/privacy subreddit to directly answer your questions on all things age verification. We’re looking forward to connecting with you and hearing how we can help you navigate these changing tides, so come on over to r/privacy anytime between Monday (12/15) at 12pm PT and Wednesday (12/17) at 5pm PT, and ask us anything!

2. EFFecting Change Livestream Panel: “The Human Cost of Online Age Verification”

Then, on January 15th at 12pm PT, we’re hosting a livestream panel featuring Cynthia Conti-Cook, Director of Research and Policy at the Collaborative Research Center for Resilience; a representative of Gen Z for Change; EFF Director of Engineering Alexis Hancock; and EFF Associate Director of State Affairs Rindala Alajaji. We’ll break down how these laws work, who they exclude, and how these mandates threaten privacy and free expression for people of all ages. Join us by RSVPing at https://livestream.eff.org/.

A Resource to Empower Users

Age-verification mandates are reshaping the internet in ways that are invasive, dangerous, and deeply unnecessary. But users are not powerless! We can challenge these laws, protect our digital rights, and build a safer digital world for all internet users, no matter their ages. Our new resource hub is here to help—so explore, share, and join us in the fight for a better internet.

MS-13 and Trump Backed the Same Presidential Candidate in Honduras

Intercept
theintercept.com
2025-12-09 23:44:21
MS-13 gang members told Hondurans to vote for the Trump-backed right-wing candidate or “we’ll kill you and your whole fucking family.” The post MS-13 and Trump Backed the Same Presidential Candidate in Honduras appeared first on The Intercept....
Original Article

Gangsters from MS-13, a Trump-designated Foreign Terrorist Organization, intimidated Hondurans not to vote for the left-leaning presidential candidate, 10 eyewitness sources told The Intercept, in most cases urging them to instead cast their ballots in last Sunday’s election for the right-wing National Party candidate — the same candidate endorsed by U.S. President Donald Trump.

Ten residents from four working-class neighborhoods controlled by MS-13, including volunteer election workers and local journalists, told The Intercept they saw firsthand gang members giving residents an ultimatum to vote for the Trump-endorsed conservative candidate or face consequences. Six other sources with knowledge of the intimidation — including government officials, human rights investigators, and people with direct personal contact with gangs — corroborated their testimony. Gang members drove voters to the polls in MS-13-controlled mototaxi businesses, three sources said, and threatened to kill street-level activists for the left-leaning Liberty and Refoundation, or LIBRE, party if they were seen bringing supporters to the polls. Two witnesses told The Intercept they saw members of MS-13 checking people’s ballots inside polling sites, as did a caller to the national emergency help line.

“A lot of people for LIBRE didn’t go to vote because the gangsters had threatened to kill them,” a resident of San Pedro Sula, the second-largest city in Honduras, told The Intercept. Mareros, as the gang members are known, intimidated voters into casting their ballots for Nasry “Tito” Asfura, known as Papi a la Órden or “Daddy at your service.” Multiple residents of San Pedro Sula alleged they were also directed to vote for a mayoral candidate from the centrist Liberal Party.

Miroslava Cerpas, the leader of the Honduran national emergency call system, provided The Intercept with four audio files of 911 calls in which callers reported that gang members had threatened to murder residents if they voted for LIBRE. A lead investigator for an internationally recognized Honduran human rights NGO, who spoke anonymously with The Intercept to disclose sensitive information from a soon-to-be published report on the election, said they are investigating gang intimidation in Tegucigalpa and the Sula Valley “based on direct contact with victims of threats by gangs.”

“If you don’t follow the order, we’re going to kill your families, even your dogs. We don’t want absolutely anyone to vote for LIBRE.”

“People linked to MS-13 were working to take people to the voting stations to vote for Asfura, telling them if they didn’t vote, there would be consequences,” the investigator told The Intercept. They said they received six complaints from three colonias in the capital of Tegucigalpa and three in the Sula Valley, where voters said members of MS-13 had threatened to kill those who openly voted for the ruling left LIBRE party or brought party representatives to the polls. The three people in the Sula Valley, the investigator said, received an audio file on WhatsApp in which a voice warns that those who vote for LIBRE “have three days to leave the area,” and “If you don’t follow the order, we’re going to kill your families, even your dogs. We don’t want absolutely anyone to vote for LIBRE. We’re going to be sending people to monitor who is going to vote and who followed the order. Whoever tries to challenge the order, you know what will happen.”

The MS-13 interference took place as the U.S. president, who has obsessed over the gang since his first term, extended an interventionist hand over the elections. On November 28, Trump threatened to cut off aid to Honduras if voters didn’t elect Asfura while simultaneously announcing a pardon for Asfura’s ally and fellow party member Juan Orlando Hernández, the former president of Honduras convicted in the U.S. on drug trafficking and weapons charges last year.

“If Tito Asfura wins for President of Honduras, because the United States has so much confidence in him, his Policies, and what he will do for the Great People of Honduras, we will be very supportive,” Trump wrote on Truth Social. “If he doesn’t win, the United States will not be throwing good money after bad, because a wrong Leader can only bring catastrophic results to a country, no matter which country it is.”

The election remains undecided over a week after the fact: Asfura holds a narrow lead over centrist Liberal Party candidate Salvador Nasralla, while Rixi Moncada, the LIBRE party candidate, remains in a distant third. As people await the final results, one San Pedro Sula resident said, “there’s been a tense calm.”

It’s unlikely the MS-13 interference led to LIBRE’s loss, since the ruling party had already suffered a significant drop in popularity after a lack of change, continued violence, and corruption scandals under four years of President Xiomara Castro. But the LIBRE government pointed to a raft of other electoral irregularities, and a preliminary European Union electoral mission report recognized that the election was carried out amid “intimidation, defamation campaigns, institutional weakness, and disinformation,” though it ignored LIBRE’s accusations of “fraud.” The Honduran attorney general announced their own investigation into irregularities in the election last week, and on Monday, two representatives for the National Electoral Council informed Hondurans that the electronic voting system wasn’t updated for over 48 hours over the weekend, while results are still being finalized.

“There is clear and resounding evidence that this electoral process was coerced by organized crime groups,” said Cerpas, who is a member of the LIBRE party, “pushing the people to vote for Nasry Asfura and intimidating anyone who wanted to vote for Rixi Moncada.”

“There is clear and resounding evidence that this electoral process was coerced by organized crime groups.”

Gerardo Torres, the vice chancellor of foreign relations for the LIBRE government, told The Intercept via phone that manipulation of elections by maras is a well-established practice — but that the timing of the threats was alarming given Trump’s simultaneous pardoning of Hernández and endorsement of Asfura. “When, a day before the elections, the president of the United States announces the liberation of Hernández, and then automatically there is a surge in activity and intimidation by MS-13,” Torres said, it suggests that the gang members see the return of the former president as “an opportunity to change their situation and launch a coordinated offensive.”

“It would seem like the U.S. is favoring, for ideological reasons, a narco-state to prevent the left from returning to power,” he said.

The White House, Asfura, and the National Party did not respond to The Intercept’s requests for comment.

All witnesses who alleged election interference have been granted anonymity to protect them from targeting by MS-13.

“They Control These Colonias”

Bumping over potholed dirt roads on the outskirts of San Pedro Sula the day before the presidential election, a motorcycle taxi driver informed their passenger of MS-13’s latest ultimatum: The mototaxis “were strictly prohibited from bringing people from LIBRE to the voting stations on election day,” recalled the passenger. “Only people for the National Party or the Liberal Party — but for LIBRE, no one, no one, not even flags were allowed.”

Gangs like MS-13 “control the whole area of Cortés,” the passenger said, referring to their home department. “Total subjugation.”

The gang members closely monitor the movements of those within their territories, in many cases by co-opting or controlling mototaxi services to keep track of who comes and goes. Three other sources in San Pedro Sula and one in Tegucigalpa confirmed MS-13’s co-optation of mototaxis in the area; another source with direct, yearslong contact with gang members on the north coast of Honduras confirmed that MS-13 was pushing residents in their territories of San Pedro Sula to vote for Asfura by the same means. When members of MS-13 passed through Cortés warning that those who voted for LIBRE “had three days to leave,” the mototaxi passenger said, residents surrounded by years of killings, massacres, and disappearances by the gang knew what might await them if they defied.

MS-13 was formed in the 1980s in Los Angeles, California, among refugees of the Salvadoran civil war who the George H.W. Bush administration then deported en masse to Central America. In the ’90s, local gangs of displaced urban Hondurans morphed with the Salvadoran franchise. Over the years, the Mara Salvatrucha, which MS stands for, evolved into a sophisticated criminal enterprise: first as street-level drug dealers, then extortionists, assassins for hire, and cocaine transporters who have been documented working in league with high-level traffickers and state officials for at least two decades.

If Honduras has been a home turf of gangs, the country is also an anchor for U.S. power in the region, hosting the second-largest U.S. military base in Latin America and a laboratory for radical experiments in libertarian far-right “private cities.” In 2009, the Honduran military carried out a coup under the passive watch of U.S. authorities, ousting then-President Manuel Zelaya, a centrist and husband of current President Xiomara Castro. The homicide rate skyrocketed, turning the country into the world’s most violent, per U.S. State Department rankings, by the 2010s.

The chaos gave rise to ex-president Hernández, whom U.S. prosecutors later accused of turning Honduras into a “cocaine superhighway” as he directed the country’s military, police, and judiciary to protect drug traffickers. Last week, Hernández was released from a West Virginia prison after a pardon from Trump, and on Monday, the Honduran attorney general announced an international warrant for his arrest.

“Gangsters were going from house to house to tell people to vote for Papi.”

As Honduran voters processed the latest cycle of U.S. influence over their politics, the more immediate menace at the polls extended to the local level. “Gangsters were going from house to house to tell people to vote for Papi [Asfura] and el Pollo,” said a San Pedro Sula resident who volunteered at a voting booth on election day, referring to the city’s mayor, Roberto Contreras of the Liberal Party. Two other sources in the city, and one government source in Tegucigalpa, also said gang members were backing Contreras.

“The team of Mayor Roberto Contreras categorically rejects any insinuation of pacts with criminal structures,” said a representative for the mayor in a statement to The Intercept. “Any narrative that tries to tie [support for Contreras] with Maras or gangs lacks base, and looks to distract attention from the principal message: the population went to vote freely, without pressure and with the hope of a better future.”

Gang intimidation of voters isn’t new in Honduras, where, within territories zealously guarded and warred over by heavily armed gangs, even the threat for residents to vote for certain candidates is enough to steer an election in their district. “Remember that they control these colonias,” said one of the San Pedro Sula residents. “And given the fact that they have a lot of presence, they tell the people that they’re going to vote for so-and-so, and the majority follow the orders.”

The human rights lawyer Victor Fernández, who ran for mayor of San Pedro Sula as an independent candidate but lost in the March primaries, said he and his supporters also experienced intimidation from MS-13 during his primary campaign. After his own race was over, he said he continued to see indications of gang intervention in the presidential campaign for months leading up to election day.

“Both before and during the elections on November 30, gangsters operating here in the Sula Valley exercised their pressure over the election,” he said, explaining this conclusion was drawn from “recurring” testimonies with residents of multiple neighborhoods. “The great violent proposal that people have confirmed is that gang members told them they couldn’t go vote for LIBRE, and that whoever did so would have to confront [the gang] structure.”

“Vamos a votar por Papi a la Órden”

Minutes after submitting a highly publicized complaint to the Public Ministry on Monday, Cerpas, of the National Emergency call system, told The Intercept that her office received 892 verified complaints of electoral violations on election day. “In those calls,” she said, “there was a significant group of reports regarding intimidation and threats by criminal groups.”

Four audio recordings of residents calling the emergency hotline, which Cerpas shared with The Intercept, reflect the wider accusation that mareros used murderous intimidation tactics to prevent people from voting for LIBRE and vote, instead, for Asfura.

In one of the files, a woman calling from Tegucigalpa tells the operator that members of MS-13 had “threatened to kill” anyone who voted for LIBRE while posing as election observers at the voting center. “They’re outside the voting center, they’re outside and inside,” she says, referring to members of MS-13, her voice trembling. “I entered, and they told me, ‘If you vote for LIBRE, we’ll kill you and your whole fucking family.’”

For days before the election, a resident from a rural region of the country, whose time in a maximum-security prison called La Tolva put him in yearslong proximity to gang members, had received messages from friends and family members living in Tegucigalpa and San Pedro Sula. They all reported a variation of the same story: Gang members on mototaxis informing everyone in their colonias, “ Vamos a votar por Papi a la Órden .” (“We’re going to vote for” Asfura.)

A former mid-level bureaucrat for the LIBRE government told The Intercept that, during the lead-up to the election, “LIBRE activists who promoted the vote … were intimidated by members of gangs so that they would cease pushing for the vote for LIBRE.” The former official didn’t specify the gangs, though they said the intimidation took place in three separate neighborhoods.

“All day, the muchachos [gang members] were going around and taking photos of the coordinators,” read messages from local organizers shared with The Intercept. The gang members “said that they needed to close themselves in their houses.”

Testimony at Hernández’s trial indicated that members of MS-13 were subcontracted as early as 2004 through the corrupt, U.S.-allied police commander Juan Carlos “El Tigre” Bonilla to provide security for caravans of cocaine alongside soldiers. Evidence presented in the trial of Midence Oquelí Martínez Turcios, a former Honduran soldier and longtime congressional deputy for the Liberal Party who was convicted of drug trafficking charges last week, revealed that he trained sicarios for MS-13 to carry out high-level assassinations on behalf of the drug trafficking clan known as the Cachiros. Testifying at Hernández’s 2024 trial, the imprisoned Cachiros leader claimed to have paid $250,000 in protection money to the former president.

Trump wiped away Hernández’s conviction, calling it political theater, but he sees MS-13’s sicarios in a different light. To Trump, the gangsters are human “animals,” their gang a “menace” that “violated our borders” in an “infestation” — justifying militarized crackdowns on caravans of Hondurans fleeing violence under Hernández and the categorization of the gang as a foreign terrorist organization. Announcing the designation in February, a White House press release reads: “MS-13 uses public displays of violence to obtain and control territory and manipulate the electoral process in El Salvador.”

“We used to think this was just to influence the mayors, not the presidency.”

“It’s known that MS-13 will do vote buying,” the investigator examining voter intimidation said. “This is a recurring practice. But we used to think this was just to influence the mayors, not the presidency.”

In El Salvador, gangs like MS-13 have intervened in favor of another Trump ally, Nayib Bukele, whose government has been embroiled in scandal over alleged collusion with MS-13 and other gangs — meaning that the election in Honduras wasn’t the first time that the same candidate Trump endorsed was promoted by a gang he now designates a terrorist organization.

For Cerpas, the coincidence of that voter intimidation with Hernández’s release is cause for alarm. “The people in Honduras are afraid,” she said, “because organized crime has been emboldened by the pardon of Juan Orlando Hernández.”

Postmortem: Intermittent Failure in SimKube CI Runners

Lobsters
blog.appliedcomputing.io
2025-12-09 23:34:55
Comments...
Original Article
We’re very sorry if your SimKube CI pipeline looked like this at some point in the last week or so. Really, honest.

On Wednesday, November 26, 2025, while testing changes to ACRL’s SimKube CI Runner 1 , an ACRL employee discovered an intermittent failure in the runner. This failure caused approximately 50% of the simulations scheduled on the runner to fail, resulting in failed actions in users’ CI pipelines, which prevented new deploys of mission-critical code. We at ACRL take our responsibility as the world’s leading provider of Kubernetes simulation analysis very seriously, and we understand the severe impact this incident had on users of our CI runner. We deeply apologize for this incident, and are committed to taking whatever actions necessary to restore trust with our customers. In the remainder of this post we will outline the timeline of this incident, a detailed analysis of the underlying causes, and the remediation steps we have taken to prevent a recurrence of this incident.

The aforementioned ACRL employee discovered the issue late Wednesday afternoon on the 26th. However, because the following day was Thanksgiving, the investigation was postponed until the following week under the hypothesis that it was likely a transient error, it’d probably go away if we didn’t look at it too hard, and we had a lot of Thanksgiving food to eat.

On the following Monday (December 1st), during our regularly-scheduled company all-hands, we re-triggered the CI pipeline once and it succeeded, whereupon we decided the problem had fixed itself. It wasn’t until Thursday, December 4th, when the incident re-occurred that we decided to bother spending some time investigating. We then spent most of the afternoon troubleshooting until we found the inciting factors 2 and identified a series of remediations. Those fixes were published at some point later on, when we got around to it.

SimKube is ACRL’s simulation environment for Kubernetes. It is designed to allow organizations to study changes in their production Kubernetes clusters in a safe and isolated environment. One way of using SimKube is as a dedicated step in a CI pipeline; this would enable users to check for regressions or bugs in their Kubernetes code before it is deployed.

The SimKube CI runner is published 3 as an Amazon Machine Image (AMI) 4 , which contains a complete SimKube environment. The runner can replay trace files contained in the codebase, and will check the outcome of the simulation to see if it’s Succeeded or Failed . The symptoms of this incident were that periodically, a simulation would report as “failed” after completing its entire run. The SimKube driver pod (the component responsible for running the events in the trace file) would report the following error, along with a stack trace and a panic:

timed out deleting simulation root sk-test-sim-driver-sn295-root

The “simulation root” is a Kubernetes custom resource which acts as a “hook” to hang all the other simulation objects off of. The simulation root exists to make for a one-step clean-up procedure: because of Kubernetes garbage collection, when the root is deleted, all objects owned by the simulation root will also be deleted.
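
In Kubernetes terms, “owned by” means each child object carries an ownerReferences entry pointing at the simulation root, and the garbage collector removes the child once its owner is gone. A minimal sketch of what such a child might look like (the API group, kind, namespace, and UID here are illustrative, not SimKube’s actual resource definitions):

# Hypothetical object created during a simulation, owned by the simulation root.
# When the root is deleted, the garbage collector deletes this object too.
apiVersion: v1
kind: ConfigMap
metadata:
  name: sk-example-child
  namespace: simkube                      # illustrative namespace
  ownerReferences:
    - apiVersion: simkube.io/v1           # illustrative group/version for the root CRD
      kind: SimulationRoot
      name: sk-test-sim-driver-sn295-root # the root named in the error above
      uid: 00000000-0000-0000-0000-000000000000  # must match the root's real UID
      blockOwnerDeletion: true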

The first step we took in our investigation was to study the trace file running in the simulation. This trace file (also available as an example trace in the SimKube repo) creates a single CronJob, lets it run for three minutes, and then deletes the CronJob. The CronJob is configured to create a new pod every minute, and the pod sleeps for 30 seconds before terminating. This trace file is used to test the pod lifecycle management features of SimKube.
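
For reference, a CronJob matching that description would look roughly like the following; the names echo the hello-simkube pods seen in the logs below, but the manifest itself is our illustration, not the actual trace contents:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-simkube
  namespace: virtual-default      # the simulated namespace seen in the logs
spec:
  schedule: "* * * * *"           # one new pod every minute
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: sleep
              image: busybox
              command: ["sleep", "30"]   # each pod finishes after 30 seconds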

We investigated the log files from all the relevant controllers, including the SimKube driver pod, the Kubernetes controller manager, and the Kubernetes API server. The results were, to use the technical terminology, extremely f*$&ing weird. The SimKube driver pod had dozens of log lines which looked like the following:

INFO mutate_pod: mutating pod (hash=10855072724872030168, seq=66) pod.namespaced_name="virtual-default/hello-simkube-29414550-tcr49"
INFO mutate_pod: first time seeing pod, adding tracking annotations pod.namespaced_name="virtual-default/hello-simkube-29414550-tcr49"

What do these lines mean? Well, the SimKube driver registers itself as a mutating webhook so that it can redirect simulated pods to the fake nodes and apply other labels and annotations to them. The hello-simkube pod is the one that’s owned by the simulated CronJob. What’s curious about these log lines is that they repeat over, and over, and over again, even after the CronJob object itself has been deleted! At first we thought this meant that the CronJob hadn’t actually been deleted, but after some further study we realized that the pod name was the same for every single one of these log entries: in other words, the SimKube mutating webhook is trying to mutate the same pod for 10 minutes, well after the simulation was over and everything (supposedly) had been deleted.

The next clue came from the Kubernetes controller manager logs:

 "syncing orphan pod failed" err=<
        Pod "hello-simkube-29414550-tcr49" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations), `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)
        @@ -140,7 +140,9 @@
          "TerminationGracePeriodSeconds": 30,
          "ActiveDeadlineSeconds": null,
          "DNSPolicy": "ClusterFirst",
        - "NodeSelector": null,
        + "NodeSelector": {
        +  "type": "virtual"
        + },
          "ServiceAccountName": "default",
          "AutomountServiceAccountToken": null,
          "NodeName": "cluster-worker",
 > logger="job-controller" pod="virtual-default/hello-simkube-29414550-tcr49"

This is a standard error that gets returned when something (a user, a controller, etc) tries to update a read-only field. In this case, it’s showing that something is trying to update the pod’s node selector after the pod has already been created, which is not allowed. There are two curious things to note in this log entry: first, the timestamp is after SimKube has deleted the CronJob, and it states that the pod has been orphaned, which means it’s not owned by anything. In other words, the CronJob really was deleted! Secondly, we got lucky in that some of the additional context shows that the pod has been scheduled to a node, that is, cluster-worker. This is not one of our simulated nodes! This is a real node! That shouldn’t happen.

The last clue came from the API server logs, where we discovered that the SimKube driver mutating webhook had been configured to fail open 5 . This means that, if the webhook fails (for whatever reason), the pod object will be allowed through anyways. Specifically, we saw that the webhook was failing because of a certificate error.
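
Fail-open versus fail-closed is controlled by the failurePolicy field on the webhook registration. A minimal sketch of the relevant configuration (the names, service, and path are hypothetical, not SimKube’s actual manifest):

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: sk-driver-webhook              # hypothetical name
webhooks:
  - name: mutate-pods.simkube.example  # hypothetical webhook name
    failurePolicy: Ignore              # "fail open": admit the request if the webhook errors out
    clientConfig:
      service:
        name: sk-driver                # hypothetical service fronting the driver pod
        namespace: simkube
        path: /mutate
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]   # intercepting UPDATE matters later in this story
        resources: ["pods"]
    admissionReviewVersions: ["v1"]
    sideEffects: None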

The certificate error immediately cast suspicion on cert-manager, which is the component that manages all of the TLS certificates for SimKube. Cert-manager is quite a complex bit of machinery, but is nevertheless required because mutating webhooks must communicate over TLS, which means they need certificates. In SimKube, we create a self-signed certificate issuer for this purpose. Cert-manager is actually a very robust tool, and has the really nice feature that it can auto-inject certificates into your webhook configuration if you apply the cert-manager.io/inject-ca-from annotation, which we do in SimKube. Investigating the cert-manager logs, everything seemed like it was working as designed at first, until we inspected the timestamps more closely. Then these two lines stood out:

I1204 18:29:07.814009 attempting to acquire leader lease kube-system/cert-manager-cainjector-leader-election...
I1204 18:30:11.466829 successfully acquired lease kube-system/cert-manager-cainjector-leader-election

By default, cert-manager, like many other components in Kubernetes, operates in a semi-HA fashion. There is one “leader” pod and a number of hot standby pods. That way, if the leader pod crashes or gets evicted, one of the standby pods can immediately take over. Kubernetes provides a distributed locking mechanism to ensure that only one pod can be the leader at a time. Until the lease is acquired, the cert-manager pod can’t do any work. What’s interesting to note here is that it took almost a minute to acquire the lease; and moreover, the simulation start time on the runner was 18:29:41, which means that the first CronJob pod, created at 18:30:00, was created before the cert-manager injector could provide the SimKube mutating webhook with its certificate.

So that’s one mystery answered: if the webhook didn’t have a certificate, it can’t apply the proper node selector, and because it fails open, the pod gets scheduled onto a real Kubernetes node instead of the intended fake node. But why and how does this pod become orphaned and stick around in the cluster until the SimKube driver times out?

Now that we knew the mechanism for the failure, it was easy to develop a local reproduction: delete the cert-manager injector pod from the cluster, start a simulation, and then after the first CronJob pod was created, recreate the cert-manager injector pod. This simulates 6 the effect of the injector waiting for the lease. In fact, the first time we did this, we didn’t recreate the injector pod until after the simulated-cronjob-sleep-pod-that-got-scheduled-on-a-real-node-by-mistake 7 had finished, and in this case it was correctly cleaned up and the simulation finished as normal.

Repeating the test locally, we observed that the critical failure only occurs if the cert-manager injector pod comes up while the CronJob pod is running . Since we had a reliable way to reproduce the error, we decided to take a quick peek at the kubelet logs and saw this log line repeated over and over again:

"Failed to update status for pod" err="failed to patch status
...
<long status update message>
...
for pod \"virtual-default\"/\"hello-simkube-29414879-r22m5\":
pods \"hello-simkube-29414879-r22m5\" is forbidden: node \"karpenter-worker\" cannot update labels through pod status"

Aha! This is the last piece of the puzzle: kubelet is trying to update the status of the pod to say that it’s finished running, but it can’t. The error message is slightly weird: it’s saying that kubelet is sending a modification to the pod labels to the pod status endpoint, which is forbidden because pod labels aren’t part of the pod status. What’s strange about this is, if you look at the actual update kubelet is sending, there are no label updates.

I suspect those of you who’ve written admission webhooks are nodding along by now. The flow of data looks like this:

kubelet status update -> API server -> SimKube mutating webhook -> 
API server -> kubelet

In other words: because the SimKube mutating webhook was subscribed to both CREATE and UPDATE events 8 , it intercepted the kubelet’s status update, said “hey, this pod doesn’t have any of the right simulation labels or the proper node-selector on it, lemme add those!” The Kubernetes API server received the modification and said (in the logs) “Hey, you can’t add a node selector on an UPDATE!”, and said (to kubelet) “Hey, you can’t add a label from the /status endpoint!”, and said (to the mutating webhook) nothing 9 . Kubelet continued to retry the status update for the pod every 10 seconds until the simulation driver terminated.

Wait, but why did everything clean up after the simulation crashed? Well, once the simulation driver pod terminated, there was no longer a mutating webhook in place to add labels to the pods based on a status update, so the update went through, Kubernetes realized the pod had completed, and it deleted it to finish its cleanup.

After conducting this detailed analysis, ACRL engineers identified the following remediation steps:

  1. Stop running cert-manager in HA mode, because our one-replica cert-manager injector pod definitely doesn’t need to be spending up to one (1) minute trying to claim a lock that nobody else is holding.

  2. Configure the SimKube driver mutating webhook to fail closed: we basically never want a pod that is designated for a simulated node to get scheduled on a real node, because that could cause all kinds of issues.

  3. Configure the SimKube driver mutating webhook to only listen to pod CREATE events, not UPDATE events. Once the simulated pod is running, the driver never makes any further changes, so there’s no reason to listen for updates. (A configuration sketch covering this item and the previous one follows the list.)

  4. Modify the SimKube simulation controller to wait for the driver pod to receive its certificate before continuing with simulation setup.

  5. Improve our logging and metrics monitoring infrastructure so that it’s easier to identify and troubleshoot these issues in the future.
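
Remediations 2 and 3 are both changes to that same webhook registration. Reusing the hypothetical names from the earlier sketch, the result would look roughly like:

webhooks:
  - name: mutate-pods.simkube.example
    failurePolicy: Fail          # fail closed: never let a simulated pod land on a real node
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]   # stop intercepting status updates entirely
        resources: ["pods"]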

As is common with incidents of this nature and scale, there was no single point of failure that caused the issue; had any one of these remediations been in place, the incident would not have occurred. To prevent future recurrence of this issue, and to enable defense in depth, we will prioritize getting these fixes in place at some point in the future when we feel like getting around to it.

ACRL cares strongly about the experience of the zero customers who are using this SimKube CI Runner action. We deeply apologize for the impact that our failure had on your CI pipelines and deploy process, and will be issuing refunds to all zero of the customers who tried to use our runner image during the period of this outage. Please feel free to contact our support team if you have any further questions or concerns about this outage, and rest assured we will strive to do better next time.

~drmorr


Microsoft Patch Tuesday, December 2025 Edition

Krebs
krebsonsecurity.com
2025-12-09 23:18:29
Microsoft today pushed updates to fix at least 56 security flaws in its Windows operating systems and supported software. This final Patch Tuesday of 2025 tackles one zero-day bug that is already being exploited, as well as two publicly disclosed vulnerabilities....
Original Article

Microsoft today pushed updates to fix at least 56 security flaws in its Windows operating systems and supported software. This final Patch Tuesday of 2025 tackles one zero-day bug that is already being exploited, as well as two publicly disclosed vulnerabilities.

Despite releasing a lower-than-normal number of security updates these past few months, Microsoft patched a whopping 1,129 vulnerabilities in 2025, an 11.9% increase from 2024. According to Satnam Narang at Tenable, this year marks the second consecutive year that Microsoft patched over one thousand vulnerabilities, and the third time it has done so since its inception.

The zero-day flaw patched today is CVE-2025-62221, a privilege escalation vulnerability affecting Windows 10 and later editions. The weakness resides in a component called the “Windows Cloud Files Mini Filter Driver” — a system driver that enables cloud applications to access file system functionalities.

“This is particularly concerning, as the mini filter is integral to services like OneDrive, Google Drive, and iCloud, and remains a core Windows component, even if none of those apps were installed,” said Adam Barnett, lead software engineer at Rapid7.

Only three of the flaws patched today earned Microsoft’s most-dire “critical” rating: Both CVE-2025-62554 and CVE-2025-62557 involve Microsoft Office, and both can be exploited merely by viewing a booby-trapped email message in the Preview Pane. Another critical bug — CVE-2025-62562 — involves Microsoft Outlook, although Redmond says the Preview Pane is not an attack vector with this one.

But according to Microsoft, the vulnerabilities most likely to be exploited from this month’s patch batch are other (non-critical) privilege escalation bugs, including:

CVE-2025-62458 — Win32k
CVE-2025-62470 — Windows Common Log File System Driver
CVE-2025-62472 — Windows Remote Access Connection Manager
CVE-2025-59516 — Windows Storage VSP Driver
CVE-2025-59517 — Windows Storage VSP Driver

Kev Breen, senior director of threat research at Immersive, said privilege escalation flaws are observed in almost every incident involving host compromises.

“We don’t know why Microsoft has marked these specifically as more likely, but the majority of these components have historically been exploited in the wild or have enough technical detail on previous CVEs that it would be easier for threat actors to weaponize these,” Breen said. “Either way, while not actively being exploited, these should be patched sooner rather than later.”

One of the more interesting vulnerabilities patched this month is CVE-2025-64671, a remote code execution flaw in the GitHub Copilot plugin for JetBrains, the AI-based coding assistant used by Microsoft and GitHub. Breen said this flaw would allow attackers to execute arbitrary code by tricking the large language model (LLM) into running commands that bypass the guardrails and add malicious instructions in the user’s “auto-approve” settings.

CVE-2025-64671 is part of a broader, more systemic security crisis that security researcher Ari Marzuk has branded IDEsaster (IDE stands for “integrated development environment”), which encompasses more than 30 separate vulnerabilities reported in nearly a dozen market-leading AI coding platforms, including Cursor, Windsurf, Gemini CLI, and Claude Code.

The other publicly disclosed vulnerability patched today is CVE-2025-54100, a remote code execution bug in Windows PowerShell on Windows Server 2008 and later that allows an unauthenticated attacker to run code in the security context of the user.

For anyone seeking a more granular breakdown of the security updates Microsoft pushed today, check out the roundup at the SANS Internet Storm Center. As always, please leave a note in the comments if you experience problems applying any of this month’s Windows patches.

Show HN: Gemini 3 imagines Hacker News as a HyperCard stack in 1994

Hacker News
hyper-card-hacker-news.vercel.app
2025-12-09 23:04:02
Comments...

Pete Hegseth Says the Pentagon's New Chatbot Will Make America 'More Lethal'

404 Media
www.404media.co
2025-12-09 23:00:45
The Department of War aims to put Google Gemini 'directly into the hands of every American warrior.'...
Original Article

Secretary of War Pete Hegseth announced the rollout of GenAI.mil today in a video posted to X. To hear Hegseth tell it, the website is “the future of American warfare.” In practice, based on what we know so far from press releases and Hegseth’s posturing, GenAI.mil appears to be a custom chatbot interface for Google Gemini that can handle some forms of sensitive—but not classified—data.

Hegseth’s announcement was full of bold pronouncements about the future of killing people. These kinds of pronouncements are typical of the second Trump administration which has said it believes the rush to “win” AI is an existential threat on par with the invention of nuclear weapons during World War II.

Hegseth, however, did not talk about weapons in his announcement. He talked about spreadsheets and videos. “At the click of a button, AI models on GenAI can be used to conduct deep research, format documents, and even analyze video or imagery at unprecedented speed,” Hegseth said in the video on X. Office work, basically. “We will continue to aggressively field the world’s best technology to make our fighting force more lethal than ever before.”

Emil Michael, the Pentagon’s under secretary for research and engineering, also stressed how important GenAI would be to the process of killing people in a press release about the site’s launch.

“There is no prize for second place in the global race for AI dominance. We are moving rapidly to deploy powerful AI capabilities like Gemini for Government directly to our workforce. AI is America's next Manifest Destiny, and we're ensuring that we dominate this new frontier,” Michael said in the press release, referencing the 19th century American belief that God had divinely ordained Americans to settle the west at the same time he announced a new chatbot.

The press release says Google Cloud's Gemini for Government will be the first instance available on the internal platform. It’s certified for Controlled Unclassified Information, the release states, and claims that because it’s web grounded with Google Search–meaning it’ll pull from Google search results to answer queries–that makes it “reliable” and “dramatically reduces the risk of AI hallucinations.” As we’ve covered, because Google search results are also consuming AI content that contains errors and AI-invented data from across the web, it’s become nearly unusable for regular consumers and researchers alike.

During a press conference about the rollout this morning, Michael told reporters that GenAI.mil would soon incorporate other AI models and would one day be able to handle classified as well as sensitive data. As of this writing, GenAI’s website is down.

“For the first time ever, by the end of this week, three million employees, warfighters, contractors, are going to have AI on their desktop, every single one,” Michael told reporters this morning, according to Breaking Defense. They’ll “start with three million people, start innovating, using building, asking more about what they can do, then bring those to the higher classification level, bringing in different capabilities,” he said.

The second Trump administration has done everything in its power to make it easier for the people in Silicon Valley to push AI on America and the world. It has done this, in part, by framing it as a national security issue. Trump has signed several executive orders aimed at cutting regulations around data centers and the construction of nuclear power plants. He’s threatened to sign another that would block states from passing their own AI regulations. Each executive order and piece of proposed legislation warns that losing the AI race would make America weak and vulnerable and erode national security.

The country’s tech moguls are rushing to build datacenters and nuclear power plants while the boom time continues. Nevermind that people do not want to live next to datacenters for a whole host of reasons. Nevermind that tech companies are using faulty AIs to speed up the construction of nuclear power plants. Nevermind that the Pentagon already had a proprietary LLM it had operated since 2024.

“We are pushing all of our chips in on artificial intelligence as a fighting force. The Department is tapping into America's commercial genius, and we're embedding generative AI into our daily battle rhythm,” Hegseth said in the press release about GenAI.mil. “AI tools present boundless opportunities to increase efficiency, and we are thrilled to witness AI's future positive impact across the War Department.”

About the author

Matthew Gault is a writer covering weird tech, nuclear war, and video games. He’s worked for Reuters, Motherboard, and the New York Times.

Matthew Gault

OpenEvolve: Teaching LLMs to Discover Algorithms Through Evolution

Hacker News
algorithmicsuperintelligence.ai
2025-12-09 22:54:33
Comments...
Original Article

OpenEvolve: Teaching LLMs to Discover Algorithms Through Evolution

How do we teach machines to discover algorithms? Traditional approaches rely on hand-crafted heuristics, exhaustive search, or gradient-based optimization. But what if we could harness the creative potential of large language models (LLMs) within an evolutionary framework?

OpenEvolve is an open-source evolutionary coding agent that integrates large language models into a quality-diversity search framework for algorithm discovery. Candidate programs are produced via LLM-guided edits (diff-based by default), evaluated with user-defined metrics, and organized using MAP-Elites while an island model with migration supports parallel, diversified exploration. The evaluation pipeline supports cascade staging and an artifact side-channel that feeds execution traces and errors back into subsequent prompts; optional LLM-based feedback can be incorporated into scoring.

OpenEvolve has been applied across many domains—here are a few examples: systems optimization, scientific discovery, geospatial algorithms, scaling law discovery, GPU kernel optimization, prompt optimization, and more.


Architecture Overview

OpenEvolve Architecture

Figure 1: OpenEvolve architecture showing the five interconnected components of the evolution loop

The Evolution Loop

  • Prompt Sampler: Constructs context-rich prompts by selecting a parent program from the current island and curating evidence sets (top performers by fitness, lineage ancestors, diverse extremes across feature bins, and random samples). Prompts include the parent's code, evaluation metrics, feature coordinates for MAP-Elites, evolution history, and (optionally) execution artifacts. Template selection supports diff-based editing by default or full rewrites, with controlled stochasticity.
  • LLM Ensemble: Generates candidate code using a weighted ensemble of OpenAI-compatible models (deterministic under seeds). In standard mode, a model is sampled by weight; in model-based islands, each island uses a fixed model. Responses drive either diff-based edits (SEARCH/REPLACE blocks) or full rewrites (JSON/code-block extraction), with generation parameters drawn from configuration.
  • Evaluator: Executes the user-provided evaluate(program_path) with timeouts and retries; optionally applies cascade evaluation ( evaluate_stage1/2/3 ) with thresholds to filter weak candidates early. It can incorporate LLM-based feedback into metrics and captures artifacts (e.g., stderr, tracebacks) for subsequent prompt context. Parallel evaluations are supported via an internal task pool.
  • Program Database: Implements MAP-Elites per island, binning programs along configurable feature dimensions (defaults include complexity and diversity; custom dimensions are taken from evaluator metrics). New candidates replace cell occupants when fitness improves (preferring combined_score , otherwise a safe numeric aggregate excluding feature dimensions). The database enforces population limits, tracks the global best, logs prompts, supports migration, and persists checkpoints.
  • Controller: Orchestrates the loop, including seeding, logging, prompt/evaluator initialization, and process-based parallel execution. It schedules iterations across islands, manages checkpointing and resume, enforces early stopping/target score criteria, stores artifacts, and writes the best discovered program and its metadata to the output directory.

Key Algorithmic Innovations

Island-Based Evolution with Lazy Migration

OpenEvolve maintains multiple isolated populations (islands) that evolve independently to reduce premature convergence and enable parallel exploration. Migration is event-driven: each island migrates when its per-island program additions since the last migration reach a configured interval, rather than on wall-clock time. Migration follows a ring topology by default (optional random migration), transferring a fraction of top programs while avoiding duplicate code in the destination island.

# Configuration example
database:
  num_islands: 5
  migration_interval: 20   # generations, not iterations
  migration_rate: 0.1      # 10% of top programs migrate

MAP-Elites for Diversity Preservation

Each island maintains a MAP-Elites grid over configurable feature dimensions (defaults include complexity and diversity; additional dimensions can be supplied by the evaluator). A candidate occupies or replaces the cell if it improves fitness (preferring combined_score , otherwise a safe aggregate over numeric metrics excluding feature dimensions). This enforces one elite per cell and preserves quality-diversity. The system also avoids exact duplicates (e.g., during migration) and computes diversity using structural measures (e.g., edit distance), rather than relying on code embeddings.
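
As a rough illustration of that replacement rule (the names and data structures below are ours, not OpenEvolve’s internals; it assumes each program object exposes a metrics dict and its feature-bin coordinates):

def fitness(program, feature_dims):
    """Prefer combined_score; otherwise average the numeric metrics,
    excluding the dimensions used for binning."""
    metrics = program.metrics
    if "combined_score" in metrics:
        return metrics["combined_score"]
    scores = [v for k, v in metrics.items()
              if k not in feature_dims and isinstance(v, (int, float))]
    return sum(scores) / len(scores) if scores else float("-inf")


def maybe_insert(grid, candidate, feature_dims):
    """Keep one elite per cell: the candidate only displaces the incumbent
    if it has better fitness for that cell's feature coordinates."""
    cell = tuple(candidate.features[d] for d in feature_dims)
    incumbent = grid.get(cell)
    if incumbent is None or fitness(candidate, feature_dims) > fitness(incumbent, feature_dims):
        grid[cell] = candidate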

Cascade Evaluation

Evaluation proceeds in stages with configurable thresholds. If cascade functions are provided, Stage 1 performs fast checks (e.g., import/execute), Stage 2 runs lightweight tests, and Stage 3 executes comprehensive benchmarks. Candidates must meet stage thresholds to advance. Timeouts and exceptions are captured as artifacts and can be fed back into subsequent prompts. When cascade functions are not defined, evaluation falls back to a single-stage evaluate(program_path) with timeouts and retries.
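
A sketch of what a user-supplied cascade evaluator might look like; the stage function names come from the description above, but the bodies, metric names, and helper functions (run_smoke_tests, run_full_benchmark) are illustrative assumptions:

import subprocess
import sys


def evaluate_stage1(program_path):
    # Stage 1: fast sanity check, does the candidate even run without crashing?
    proc = subprocess.run([sys.executable, program_path],
                          capture_output=True, timeout=10)
    return {"runs": 1.0 if proc.returncode == 0 else 0.0}


def evaluate_stage2(program_path):
    # Stage 2: lightweight correctness tests (run_smoke_tests is a hypothetical helper).
    passed, total = run_smoke_tests(program_path)
    return {"accuracy": passed / total}


def evaluate_stage3(program_path):
    # Stage 3: full benchmark, only reached if earlier stages clear their thresholds
    # (run_full_benchmark is a hypothetical helper).
    return {"combined_score": run_full_benchmark(program_path)}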

Double-Selection Strategy

Parent selection is biased toward high-fitness programs, while inspiration material shown to the LLM is drawn from complementary sources (top programs, lineage ancestors, diverse extremes across feature bins, and random samples). This separation encourages improvements guided by the current best while maintaining exploration pressure via diverse exemplars, implemented through prompt construction rather than direct recombination.


Sample Use Cases

Example 1: Algorithmic Discovery

On the AlgoTune benchmark, OpenEvolve discovered algorithms achieving dramatic speedups through automatic optimization:

Algorithmic Discovery Results

Figure 2: Algorithmic discovery results showing dramatic speedups on the AlgoTune benchmark

Key breakthroughs include automatic discovery of JAX JIT compilation (321x), FFT-based convolution (256x), and optimized graph algorithms (95.78x). The system evolved from simple iterative implementations to sophisticated numerical computing patterns without human intervention. For more detailed analysis, see Towards Open Evolutionary Agents.

Example 2: Circle Packing

OpenEvolve matched state-of-the-art results (2.634 sum of radii for n=26), evolving from naive geometric constructions to discovering scipy.optimize with SLSQP—a completely different algorithmic approach than the initial solution.

Example 3: GPU Kernel Optimization

Evolution of Metal GPU kernels for transformer attention on Apple Silicon:

GPU Kernel Performance

Figure 3: GPU kernel performance improvements for transformer attention on Apple Silicon

OpenEvolve discovered several non-obvious optimizations:

  • 8-element SIMD vectorization matching Apple Silicon's hardware width
  • Two-pass online softmax reducing memory bandwidth
  • GQA-specific memory layouts exploiting head structure

These optimizations maintain 100% numerical accuracy while achieving measurable performance improvements across diverse inference scenarios. For more details, see GPU Kernel Discovery.

Example 4: LLM Prompt Optimization

Beyond code, OpenEvolve can evolve prompts themselves:

Prompt Optimization Results

Figure 4: Prompt optimization results on GEPA benchmarks

On GEPA benchmarks, evolved prompts achieved +10.69% accuracy on HotpotQA (multi-hop reasoning) and +6.42% overall across multiple benchmarks. This demonstrates OpenEvolve's versatility—the same evolutionary framework optimizes both code and natural language.

Evolution Progress: As shown below on the AlgoTune benchmark, we see that the performance consistently improves over generations. Extended evolution (200 iterations) achieved 24% better results than shorter runs (100 iterations), suggesting that patient exploration of the solution space yields compounding benefits.

Evolution Progress

Figure 5: Performance improvement over generations showing compounding benefits of extended evolution


Getting Started

OpenEvolve provides both library and command-line interfaces:

from openevolve import run_evolution

result = run_evolution(
    initial_program="def solve(x): return x * 2",
    evaluator=lambda path: {"score": benchmark(path)},
    iterations=100
)

For complex configurations, use YAML files specifying LLM models, evolution strategies, and evaluation parameters. OpenEvolve supports checkpoint/resume for long-running experiments and parallel evaluation across multiple cores. OpenEvolve is open-source and available on GitHub.

Update: This blog post was updated on November 1, 2025

The Beautiful Game Is Getting Ugly

Portside
portside.org
2025-12-09 22:53:36
The Beautiful Game Is Getting Ugly Judy Tue, 12/09/2025 - 17:53 ...
Original Article

WASHINGTON – Inside the Kennedy Center, FIFA President Gianni Infantino was hanging a gold medal around the neck of his “close friend” Donald Trump and giving him a hefty golden peace trophy.

“The FIFA Peace Prize is awarded annually,” Infantino said of the new award nobody has ever won before from his supposedly apolitical organization. The pair were at the landmark arts venue for a floor show–like event before officials drew brackets of teams to play this summer’s World Cup tournament, to be held in Canada, Mexico, and the United States. FIFA took over the building for free and displaced scheduled National Symphony Orchestra concerts, according to The Washington Post; it closed down nearby streets for multiple blocks and forced drivers into inconvenient detours.

“This is truly one of the great honors of my life. And beyond awards, we saved millions and millions of lives,” Trump said in a kind of acceptance speech, referring to a series of conflicts around the world he likes to say he stopped, despite contradictory statements from the people involved in the conflicts. “The fact that we could do that, so many different wars that were able to end in some cases right before they started, it was great to get them done,” he continued, claiming without proof that “the world is a safer place now.”

Outside was an entirely different scene. Dozens of protesters had gathered as close as they could to the Kennedy Center, holding up soccer-style red cards that said, “deadly sanctions,” “bombing boats,” and “racist travel bans.” They flew a giant Palestinian flag and mourned the at least 437 Gazan soccer players murdered in the ongoing U.S.-backed Israeli genocide, according to the Palestinian Football Association. The dead include Suleiman al-Obeid, the Pele of Palestine, whom Israeli soldiers murdered as he waited in line for food in Rafah.

Protesters had a cartoonishly huge soccer ball, which they pushed into a stack of oversized cardboard ice cubes. “No ICE in my cup!” another series of signs said. Big snowflakes showered down. A phalanx of cops stood within throwing distance. Protesters set up a table with hot cocoa.

The protest, organized by activists and soccer fans, including the group Get Free, want white supremacy out of soccer. Trump colludes with FIFA billionaires, the group argues, and is using “the beautiful game” to promote his message and vision ahead of America’s 250th birthday. It’s one of the many protests that are expected ahead of next summer’s World Cup games, said political observers who study sports. The way FIFA conducts itself amid Trump’s immigration terror campaign and the GOP’s decision to slam the doors shut on immigration of all kinds, including tourism, will act as a preview for how the administration will treat the Summer Olympics in Los Angeles in 2028, they said.

“Soccer is about equality and freedom of movement,” said Anthony Torres, a spokesperson for Get Free. But Trump, he said, is erasing that, just as he’s erasing the history and presence of Black and brown people in the U.S., filling up his detention gulags and “bringing us back to the heyday of Jim Crow.”

Get Free is calling for the World Cup to be a platform for humanity, Torres said, and for World Cup leadership to stand up against white supremacy. Protesters from other organizations turned out, too.

“We’re here to send a clear message to Trump that you can’t reconcile ICE with soccer culture,” said Slobodan Milic, a protester with Free DC and avid fan of D.C. United, the local MLS team. “Soccer is the most democratic game there is. Everyone tries to hit the ball.”

After chatting briefly about PSV Eindhoven’s recent upset win over Liverpool—“It was on Liverpool’s own turf!”—Milic returned to politics. Along with his fellow D.C. United fans, he said, he spends the 51st minute of every game chanting, “Free D.C.!” in reference to the long-standing push to make D.C. the 51st state. Sometimes, the chant takes over the whole stadium. Now he’s directing his energy toward keeping ICE away from the World Cup. The goal, he said, is to “abolish ICE during the World Cup and get them out of all of our cities.”

World Cup matches will be played in 11 U.S. cities, including Los Angeles, Miami, and New York, all of which have large immigrant populations. Earlier this year, Los Angeles was the site of major ICE operations and counterprotests, and advocates worry that Trump could use the World Cup as a pretense to incite more raids and violence. Protesters like Milic fear that residents without citizenship could be targeted as they try to enjoy the festivities and games next summer.

Those fears are well-founded: Just this July, an asylum seeker was arrested and handed over to ICE after bringing his two children, aged 10 and 14, to watch FIFA’s Club World Cup final in New Jersey. After three months in immigration detention, the man decided not to appeal when a judge rejected his asylum claim, prioritizing leaving detention above all else. He was returned to his country of origin, according to Human Rights Watch.

Closer to the Kennedy Center, soccer fans lined up for hours to get inside, and by noon a few were still making their way through the snow to get in. They, too, were concerned about the upcoming event. One fan, who declined to share his name and who planned to watch the draw at a coffee shop nearby, said he was supporting the team from Iran.

Under Trump’s travel ban, Iranian officials have been barred from coming into the U.S. since June. Trump’s executive order made exemptions for athletes, support staff, and immediate relatives for the World Cup event, but not necessarily for Friday’s draw. Iranian officials had planned to boycott the draw, but Iranian media on Thursday reported that the team’s coach would attend, according to the Associated Press.

The fan stood outside the security line, holding hands with a woman as snowflakes gathered on their thick jackets. If “softer” forms of diplomacy, like international sporting events, have the exact same goal as “harder” forms, “then I’m not sure it’s such an amazing goal,” he said before heading indoors. “But if it’s in the spirit of brotherhood, then that’s great.”

Jules Boykoff, professor and department chair in the department of political science at Pacific University and an ex-professional soccer player, said in a phone interview that among the open questions for the June matches are not just whether Trump’s anti-immigration policies will affect players. It’s also unclear how those policies will affect international fans.

He said he didn’t know how someone from Latin America could come into a U.S. sports event, given that the Supreme Court just ruled that Immigration and Customs Enforcement can use race as a reason to disappear someone. “You gotta be a real sports nut to do that,” he said.

Boykoff added that he doubted whether FIFA would take any steps to meet the kinds of demands demonstrators were making.

“Gianni Infantino has been Trump’s number one enabler. FIFA is not going to engage in anything resembling isolating Trump,” he said. “The fact that the World Cup draw is in D.C. is a testament to that. They made it as easy as possible for Trump to be there.”

The protesters kept chanting even as the event got under way and the snow came down heavier. They took turns kicking the massive inflatable soccer ball into the paper blocks of “ICE,” which went tumbling down onto the concrete.

They cheered. “Gooooooooaaaaal!”

===

Whitney Curry Wimbish is a staff writer at The American Prospect. She previously worked in the Financial Times newsletters division, The Cambodia Daily in Phnom Penh, and the Herald News in New Jersey. Her work has been published in multiple outlets, including The New York Times, The Baffler, Los Angeles Review of Books, Music & Literature, North American Review, Sentient, Semafor, and elsewhere. She is a coauthor of The Majority Report’s daily newsletter and publishes short fiction in a range of literary magazines.

Emma Janssen is a writing fellow at The American Prospect, where she reports on anti-poverty policy, health, and political power. Before joining the Prospect, she was at UChicago studying political philosophy, editing for The Chicago Maroon, and freelancing for the Hyde Park Herald.

===

Linux CVEs, more than you ever wanted to know

Hacker News
www.kroah.com
2025-12-09 22:47:36
Comments...
Original Article

It’s been almost 2 full years since Linux became a CNA (CVE Numbering Authority), which means that we (i.e. the kernel.org community) are now responsible for issuing all CVEs for the Linux kernel. During this time, we’ve become one of the largest creators of CVEs by quantity, going from nothing to number 3 in 2024 to number 1 in 2025. Naturally, this has caused some questions about both how we are doing all of this work and how people can keep track of it.

I’ve given a number of talks over the past years about this, starting with the Open Source security podcast right after we became a CNA, then the Kernel Recipes 2024 talk, “CVEs are alive, but do not panic”, then a talk at OSS Hong Kong 2024 on the same topic with updated numbers, later a talk at OSS Japan 2024 with more info on the same topic, and finally for 2024 a talk with more detail, for which I can’t find an online version.

In 2025 I did lots of work on the CRA, so most of my speaking this year has been about that topic, but the CVE assignment work continued on, evolving to address many of the issues we hit in our first year of being a CNA. As that work is not part of the Linux kernel source directly, it’s not all that visible to the normal development process, except for the constant feed on the linux-cve-announce mailing list. So I figured it was time to write down how this all now works, as well as a bunch of background information about how Linux is developed that is relevant to how we do CVE reporting (i.e. almost all non-open-source groups don’t seem to grasp our versioning scheme).

There is an in-kernel document that describes how CVEs can be requested from the kernel community, as well as a basic summary of how CVEs are automatically assigned. But as we are an open community, it’s good to go into more detail about how all of us do this work, explaining how our tools have evolved over time and how they work, why some things are the way they are for our releases, as well as documenting a way that people can track CVE assignments on their own in a format that is, in my opinion, much simpler than attempting to rely on the CVE JSON format (and don’t get me started on NVD…).

So here’s a series of posts going into all of this, hopefully providing more information than you ever wanted to know, which might be useful for other open source projects as they start to run into many of the same issues we have already dealt with (i.e. how to handle reports at scale):

★ iMessage’s Delivery Architecture Makes It Hard to Block Without Blocking All iOS Push Notifications

Daring Fireball
daringfireball.net
2025-12-09 22:42:45
Draw your own conclusions about cellular carriers and enterprise network administrators being similar to authoritarian governments....
Original Article

From Apple’s iMessage Security Overview :

Apple iMessage is a messaging service for iPhone, iPad, Mac, Apple Watch, and Apple Vision Pro. Relying on the Apple Push Notification service (APNs), iMessage lets users send texts and attachments like photos, contacts, locations, links, and emoji. Messages sync across all devices, enabling seamless conversations. Apple doesn’t store message content or attachments, which are all secured with end-to-end encryption so that no one but the sender and receiver can access them. Apple canʼt decrypt the data.

This thread on Mastodon, prompted by my wondering why Russia is blocking FaceTime but not iMessage, suggests that because iMessage messages are sent via APNs, a network (or entire nation) seeking to block iMessage can only do so by blocking all push notifications for iOS. That’s why on airplanes with “free messaging” on in-flight Wi-Fi, you usually also get all incoming push notifications, even for services that aren’t available on the free Wi-Fi.

Here’s a support document from GFI Software , which makes network appliances for enterprises and schools:

The Exinda appliance gives administrators multiple options to stop or throttle applications that can use a lot of bandwidth in the network. An application that many would consider discardable or able to be easily limited in bandwidth is iMessage. When blocking or discarding iMessage traffic, users may experience an issue where all push notifications on iOS devices that have traffic going through the Exinda, i.e., on WiFi, will stop displaying.

Root Cause: Apple uses the Apple Push Notification Service (APNS) to allow application creators to push out information to iOS devices. This includes mail servers being able to push out notifications of calendar and email, or app creators to be able to push text-based messages straight to the device.

Apple might have architected iMessage this way to make iMessage veto-proof with cellular carriers, who, at the time of iMessage’s announcement in June 2011 , were already promoting iPhone push notifications as a reason to upgrade from a dumb phone to an iPhone with a more expensive plan. The carriers might have been tempted to block iMessage over cell networks to keep people using SMS, but they couldn’t without blocking all push notifications, which wouldn’t be tenable. But this architecture also makes iMessage hard to block in authoritarian countries where iPhones are even vaguely popular. (Maybe this helps explain why iMessage isn’t blocked in China, too?)

Draw your own conclusions about cellular carriers and enterprise network administrators being similar to authoritarian governments.

Tufts Student Can Resume Research After Trump Officials Revoked Her Visa, Judge Rules

Portside
portside.org
2025-12-09 22:41:57
Tufts Student Can Resume Research After Trump Officials Revoked Her Visa, Judge Rules Judy Tue, 12/09/2025 - 17:41 ...
Original Article
Tufts Student Can Resume Research After Trump Officials Revoked Her Visa, Judge Rules

Rümeysa Öztürk in Boston, Massachusetts, on 10 May. | Faith Ninivaggi/Reuters

A federal judge has allowed a Tufts University student from Turkey to resume research and teaching while she deals with the consequences of having her visa revoked by the Trump administration , leading to six weeks of detention.

Rümeysa Öztürk, a PhD student studying children’s relationship to social media, was among the first people arrested as the Trump administration began targeting foreign-born students and activists involved in pro-Palestinian advocacy. She had co-authored an op-ed criticizing her university’s response to Israel and the war in Gaza. Immigration enforcement officers took her away in an unmarked vehicle, in an encounter caught on video in March outside her Somerville residence.

Öztürk has been out of a Louisiana immigrant detention center since May and back on the Tufts campus. But she has been unable to teach or participate in research as part of her studies because of the termination of her record in the government’s database of foreign students studying temporarily in the US.

In her ruling on Monday, chief US district judge Denise J Casper wrote that Öztürk is likely to succeed on claims that the termination was “arbitrary and capricious, contrary to law and in violation of the First Amendment”.

The government’s lawyers unsuccessfully argued that the Boston federal court lacked jurisdiction and that Öztürk’s Student and Exchange Visitor Information System (SEVIS) record was terminated legally after her visa was revoked, making her eligible for removal proceedings.

“There’s no statute or regulation that’s been violated by the termination of the SEVIS record in this case,” Mark Sauter, an assistant US attorney, said during a hearing last week. The Associated Press sent an email on Tuesday seeking comment from Sauter on whether the government plans to appeal.

In a statement, Öztürk, who plans to graduate next year, said while she is grateful for the court’s decision, she feels “a great deal of grief” for the education she has been “arbitrarily denied as a scholar and a woman in my final year of doctoral studies”.

“I hope one day we can create a world where everyone uses education to learn, connect, civically engage and benefit others – rather than criminalize and punish those whose opinions differ from our own,” said Öztürk, who is still challenging her arrest and detention.

===

SAP fixes three critical vulnerabilities across multiple products

Bleeping Computer
www.bleepingcomputer.com
2025-12-09 22:41:26
SAP has released its December security updates addressing 14 vulnerabilities across a range of products, including three critical-severity flaws. [...]...
Original Article

SAP

SAP has released its December security updates addressing 14 vulnerabilities across a range of products, including three critical-severity flaws.

The most severe (CVSS score: 9.9) of all the issues is CVE-2025-42880 , a code injection problem impacting SAP Solution Manager ST 720.

"Due to missing input sanitation, SAP Solution Manager allows an authenticated attacker to insert malicious code when calling a remote-enabled function module," reads the flaw's description.

"This could provide the attacker with full control of the system, hence leading to high impact on confidentiality, integrity, and availability of the system."

SAP Solution Manager is the vendor's central lifecycle management and monitoring platform used by enterprises for system monitoring, technical configuration, incident and service desk, documentation hub, and test management.

The next most severe flaw SAP fixed this month concerns multiple Apache Tomcat vulnerabilities impacting SAP Commerce Cloud components in versions HY_COM 2205, COM_CLOUD 2211, and COM_CLOUD 2211-JDK21.

The flaws are tracked in SAP Commerce Cloud under a single identifier, CVE-2025-55754 , given a CVSS severity rating of 9.6.

SAP Commerce Cloud is an enterprise-grade e-commerce platform backing large-scale online stores with product catalogs, pricing, promotions, checkout, order management, customer accounts, and ERP/CRM integration. It is generally used by large retailers and global brands.

The third critical (CVSS score: 9.1) flaw fixed this month is CVE-2025-42928 , a deserialization vulnerability impacting SAP jConnect, which, under certain conditions, could allow a high-privileged user to achieve remote code execution on the target via specially crafted input.

SAP jConnect is a JDBC driver used by developers and database administrators to connect Java applications to SAP ASE and SAP SQL Anywhere databases.

SAP's December 2025 bulletin also lists fixes for five high-severity flaws and six medium-severity issues, including memory corruption, missing authentication and authorization checks, cross-site scripting, and information disclosure.

SAP solutions are deeply embedded in enterprise environments and manage sensitive, high-value workloads, making them a valuable target for attackers.

Earlier this year, SecurityBridge researchers observed in-the-wild attacks abusing a code-injection flaw (CVE-2025-42957) impacting SAP S/4HANA, Business One, and NetWeaver deployments.

SAP has not marked any of the 14 flaws as actively exploited in the wild, but administrators should deploy the fixes without delay.

Agentic AI Foundation

Simon Willison
simonwillison.net
2025-12-09 22:24:48
Agentic AI Foundation Announced today as a new foundation under the parent umbrella of the Linux Foundation (see also the OpenJS Foundation, Cloud Native Computing Foundation, OpenSSF and many more). The AAIF was started by a heavyweight group of "founding platinum members" ($350,000): AWS, Anthropi...
Original Article

Agentic AI Foundation . Announced today as a new foundation under the parent umbrella of the Linux Foundation (see also the OpenJS Foundation, Cloud Native Computing Foundation, OpenSSF and many more ).

The AAIF was started by a heavyweight group of "founding platinum members" ( $350,000 ): AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, and OpenAI. The stated goal is to provide "a neutral, open foundation to ensure agentic AI evolves transparently and collaboratively".

Anthropic have donated Model Context Protocol to the new foundation, OpenAI donated AGENTS.md , Block donated goose (their open source, extensible AI agent ).

Personally the project I'd like to see most from an initiative like this one is a clear, community-managed specification for the OpenAI Chat Completions JSON API - or a close equivalent. There are dozens of slightly incompatible implementations of that not-quite-specification floating around already, it would be great to have a written spec accompanied by a compliance test suite.

The Real Problem of Humanity

Daring Fireball
www.harvardmagazine.com
2025-12-09 22:17:20
Sociobiologist Edward O. Wilson, back in 2009: The real problem of humanity is the following: we have paleolithic emotions; medieval institutions; and god-like technology. A related adage I heard, and internalized, recently: “We’re not thinking creatures who feel; we’re feeling creatures who t...
Original Article

“Stamp collectors” was the derisive term future Nobel laureate James Watson applied to Harvard biology professors involved in classification and anatomy in the 1950s.  The co-discoverer of DNA’s double helix structure, then in his twenties, had little tolerance for other approaches to biological science, and thought Harvard, where he had just been named an associate professor, should not waste a tenured position on subjects such as taxonomy and ecology. “Anyone who would hire an ecologist is out of his mind,” he once said.

The great sociobiologist E. O. Wilson, a year  younger than Watson, was one of the “stamp collectors.” And Harvard offered him tenure (countering an offer from Stanford) before offering it to Watson (who everyone knew would win the Nobel Prize). The biology department voted to defer Watson’s appointment, wanting to “get to know him better.” He did not react calmly, even though he was soon granted tenure. Wilson, recalling those days in his 1994 autobiographical book Naturalist, judged Watson the most mean-spirited academic he knew during his early years on the Harvard faculty. So began the rivalry of two scientists who have changed our understanding of life on Earth.

On September 9, in a sold-out event at Sanders Theatre, Wilson and Watson, who have since buried the hatchet, recalled the great division in the biological sciences in the 1950s: on the one hand, the organismic and evolutionary biologists; on the other, Watson—who was leading the revolution in molecular biology and agitating for the hiring of a critical mass of talent in his nascent field. The molecular biologists, Wilson recalled, “landed in the biology department like aliens in Manhattan,” with Watson as the young avatar of their movement.  The two also reflected on the modern reunification of their field, and on future challenges to the field and to the planet.

The event coincided with the 150th anniversary of the Harvard Museum of Natural History—a longtime hangout of “stamp collectors,” whose methods and collections have proven unexpectedly critical and invaluable in the molecular age of biology—and also with the 150th anniversary of the publication of Darwin’s On the Origin of Species. Robert Krulwich, a correspondent for National Public Radio’s Science Desk, moderated the discussion.

Krulwich, quoting one of Watson’s former students on Watson’s belief that to be good, “you have to have an enemy,” asked if he really needed to proceed as though leading a bunch of marines. Yes, said Watson. “It works with boys, anyway.” Krulwich also prodded the outwardly mild-mannered Wilson, reminding him that he once said he had been “blessed with brilliant enemies.”

Yes, said Wilson, “and I am the only scientist in modern times to have been physically attacked for an idea”—that idea being his theory that there is a biological basis for human nature, for which radical leftists once poured water on him at a conference. “Top that, Jim,” crowed Wilson. “Ambition and competitiveness,” he continued, “are essential to do really good work. I didn’t show it as much as Jim because I am a Southerner.”

Watson, in fact, attributed their eventual reconciliation to the fact that “I hated Ed’s enemies.” But there were also larger changes in the field of biology that, over time, brought the two scientists’ world views closer together, a gradual sintering of the field. “Molecular biology had a bacterial explosion…,” Wilson explains. “But the result of this was that as molecular and cell biology matured, it produced an armamentarium of methods, ideas, and so on, which we [the “stamp collectors”] started grabbing hold of.” Before long, evolutionary and organismic biologists, who study diversity, were “still collecting bugs, but we were moving down in our analyses to include genomics, while molecular biologists started going evolutionary.” The “glory of late twentieth-century biology,” he pointed out, “is that it is unifying.”

Krulwich then asked about the future. Watson said he hoped that cancer would be cured by 2020, and pointed to studies of how the brain works and how life began as two of the most promising areas of research in the biological sciences. Wilson agreed, but would himself, he said, embark on a new study of diversity, this time in the virtually unexplored world of microbes.

Will we solve the crises of the next hundred years? asked Krulwich. “Yes, if we are honest and smart,” said Wilson. “The real problem of humanity is the following: we have paleolithic emotions; medieval institutions; and god-like technology. And it is terrifically dangerous, and it is now approaching a point of crisis overall.” Until we understand ourselves, concluded the Pulitzer-prize winning author of On Human Nature, “until we answer those huge questions of philosophy that the philosophers abandoned a couple of generations ago—Where do we come from? Who are we? Where are we going?—rationally,” we’re on very thin ground.

Related content from Harvard Magazine:

A profile of Wilson

A review of Watson's Avoid Boring People

A review of Wilson's Consilience

Wilson on Darwin

Multiplying our way out of division

Lobsters
xania.org
2025-12-09 21:58:34
Comments...
Original Article

Written by me, proof-read by an LLM.
Details at end.

I occasionally give presentations to undergraduates, and one of my favourites is taking the students on a journey of optimising a “binary to decimal” routine. There are a number of tricks, which I won’t go into here, but the opening question I have is “how do you even turn a number into its ASCII representation?”

If you’ve never stopped to think about it, take a moment now to do so - it can be a fun problem.

The simple approach is to use number % 10 to get the rightmost digit (adding 48 to turn it into the ASCII number), then divide by ten, and keep going until the number is zero. This produces the digits backwards, but you can reverse them afterwards, which I won’t show here. This routine is one of the few legitimate uses of do while in C, as we always want to emit at least one digit even if the number is zero to start with. The code looks something like:
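
A minimal sketch of such a routine (matching the to_decimal_backwards(unsigned int, char*) signature in the assembly below):

char *to_decimal_backwards(unsigned int number, char *buf) {
    do {
        *buf++ = '0' + (number % 10); /* rightmost digit, as ASCII */
        number /= 10;                 /* drop that digit */
    } while (number != 0);            /* always emit at least one digit */
    return buf;                       /* one past the last digit written */
}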

Here the compiler does a fantastic job. Yesterday we saw how division by powers of two can be optimised to shifts; today we’ll see how the compiler manages to avoid expensive division even when we aren’t dividing by a power of two. It also gets the remainder cheaply, which we usually get for free from the divide instruction.

The transformation is quite clever - let’s walk through the annotated assembly:

to_decimal_backwards(unsigned int, char*):
  mov rax, rsi              ; rax = buf
  mov esi, 3435973837       ; esi = 0xcccccccd
.L2:
  mov edx, edi              ; edx = number
  mov ecx, edi              ; ecx = number
  add rax, 1                ; ++buf

  imul rdx, rsi             ; rdx *= 0xcccccccd
  shr rdx, 35               ; rdx = rdx >> 35
                            ; rdx = number / 10 [see below]

  lea r8d, [rdx+rdx*4]      ; r8 = rdx * 5
  add r8d, r8d              ; r8 = rdx * 5 * 2 = rdx * 10
                            ; r8 = (number / 10) * 10 [rounded down]
  sub ecx, r8d              ; ecx = number - (number / 10) * 10
                            ; ecx = number % 10
  add ecx, 48               ; ecx = '0' + (number % 10)
  cmp edi, 9                ; number > 9?
  mov edi, edx              ; number = number / 10
  mov BYTE PTR [rax-1], cl  ; *(buf-1) = '0' + (number % 10)
  ja .L2                    ; loop if number (prior to divide) >= 10
  ret

There’s a lot to unpack here - several different optimisations - but the main one is how the compiler has turned division by a constant ten into a multiply and a shift. There’s a magic constant 0xcccccccd and a shift right of 35! Shifting right by 35 is the same as dividing by 2^35 - what’s going on?

Let’s see what happens each step of the algorithm:

>>> 1234 * 0xcccccccd
4239991714858
>>> 4239991714858 // (2**35)
123
>>> 123 * 10
1230
>>> 1234 - 1230
4

What’s happening is that 0xcccccccd / 2**35 is very close to ⅒ (around 0.10000000000582077). By multiplying our input value by this constant first, then shifting right, we’re doing fixed-point multiplication by ⅒ - which is division by ten. The compiler knows that for all possible unsigned integer values, this trick will always give the right answer. For other values and signednesses, sometimes it needs to account for rounding, for example, dividing a signed value by three:
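
A hand-written C sketch of the equivalent trick (not actual compiler output; the constant 0x55555556 / 2^32 is just over ⅓):

int div3(int x) {
    long long t = (long long)x * 0x55555556LL; /* fixed-point multiply by ~1/3 */
    int q = (int)(t >> 32);                    /* take the top 32 bits (floors) */
    q += (unsigned)x >> 31;                    /* correct the rounding for negative x */
    return q;
}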

Here we see that it has to account for rounding. If you edit the code above and try dividing by fifteen, you’ll see that causes even more code to be emitted. However, it’s all still faster than a real divide instruction.

Back to our ASCII conversion example: to get the remainder (the modulus), the compiler takes the (truncated) number / 10, multiplies it back up by 10 using lea tricks (we’ve covered this before), and then the difference between the original number and this computed value is the remainder.

The rest of the optimisations are the compiler trying to do work eagerly (like incrementing buf ), and checking one loop iteration ahead: there’s no point looping if the current number is less than or equal to 9.

Overall, some very clever optimisations that avoid division entirely!

See the video that accompanies this post.


This post is day 7 of Advent of Compiler Optimisations 2025 , a 25-day series exploring how compilers transform our code.

This post was written by a human ( Matt Godbolt ) and reviewed and proof-read by LLMs and humans.

Support Compiler Explorer on Patreon or GitHub , or by buying CE products in the Compiler Explorer Shop .

Posted at 06:00:00 CST on 7th December 2025.

That Frank Seddio Lawsuit Over the Missing Millions Is Only Getting More Convoluted

hellgate
hellgatenyc.com
2025-12-09 21:43:57
A judge gave the escrow lawyer one day to provide proof that $2 million are still in the escrow account. Almost a week later, he hasn’t done so....
Original Article
That Frank Seddio Lawsuit Over the Missing Millions Is Only Getting More Convoluted
Then-Kings County Democratic Party Chair Frank Seddio at Junior's in Brooklyn on November 4, 2016. (Shutterstock)

Qt, Linux and everything: Debugging Qt WebAssembly

Hacker News
qtandeverything.blogspot.com
2025-12-09 21:19:37
Comments...
Original Article

One of the most tedious tasks a developer will do is debugging a nagging bug. It's worse when it's a web app, and even worse when it's a WebAssembly web app.


The easiest way to debug Qt WebAssembly is to configure with the -g argument, or CMAKE_BUILD_TYPE=Debug. Emscripten embeds DWARF symbols in the wasm binaries.
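
For an application build, that might look something like this (a sketch, assuming a Qt for WebAssembly kit installed under an illustrative path):

$ /path/to/qt-wasm/bin/qt-cmake -S . -B build-wasm-debug -DCMAKE_BUILD_TYPE=Debug
$ cmake --build build-wasm-debug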

NOTE: Debugging wasm files with DWARF works only in the Chrome browser, with the help of the C/C++ DevTools Support (DWARF) browser extension. If you are using Safari or Firefox, or do not want to or cannot install a browser extension, you will need to generate source maps, which I will look at in my next blog post.

DWARF debugging

You also need to enable DWARF in the browser developer tools settings, but you do not need symlinks to the source directories, as you would when using source maps, because the binaries are embedded with the full directory path. Like magic!

Emscripten embeds DWARF symbols into the binaries built with -g by default, so re-building Qt or your application in debug mode is all you need to do.

Qt builds debug libraries by default using the optimized argument -g2, which produces less debugging info, but results in faster link times. To preserve debug symbols you need to build Qt debug using the -g or -g3 argument. Both of these do the same thing.

Using DWARF debugger

Open Chrome with the extension mentioned above installed, and open the developer tools. Navigate to the Qt for WebAssembly web application you need to debug. Once it opens, it may take a few seconds for all the symbols and files to get parsed. If you are debugging into Qt, this will take quite a few seconds - just keep waiting.

The JavaScript console will soon contain source file paths and sources. You can find your file to debug, and set breakpoints. Just reload the page and once it hits a breakpoint, it will stop execution and highlight the current line in the source view. It also will show variable names and values.

You can then step through your code as you would debugging a desktop application.

ProPublica: ‘Trump’s Own Mortgages Match His Description of Mortgage Fraud, Records Reveal’

Daring Fireball
www.propublica.org
2025-12-09 21:17:00
Justin Elliott, Robert Faturechi, and Alex Mierjeski, reporting for ProPublica: For months, the Trump administration has been accusing its political enemies of mortgage fraud for claiming more than one primary residence. President Donald Trump branded one foe who did so “deceitful and potentiall...
Original Article

For months, the Trump administration has been accusing its political enemies of mortgage fraud for claiming more than one primary residence.

President Donald Trump branded one foe who did so “deceitful and potentially criminal.” He called another “ CROOKED ” on Truth Social and pushed the attorney general to take action.

But years earlier, Trump did the very thing he’s accusing his enemies of, records show.

In 1993, Trump signed a mortgage for a “Bermuda style” home in Palm Beach, Florida, pledging that it would be his principal residence. Just seven weeks later, he got another mortgage for a seven-bedroom, marble-floored neighboring property, attesting that it too would be his principal residence.

In reality, Trump, then a New Yorker, does not appear to have ever lived in either home, let alone used them as a principal residence. Instead, the two houses, which are next to his historic Mar-a-Lago estate, were used as investment properties and rented out, according to contemporaneous news accounts and an interview with his longtime real estate agent — exactly the sort of scenario his administration has pointed to as evidence of fraud.

At the time of the purchases, Trump’s local real estate agent told the Miami Herald that the businessman had “hired an expensive New York design firm” to “dress them up to the nines and lease them out annually.” In an interview, Shirley Wyner, the late real estate agent’s wife and business partner who was herself later the rental agent for the two properties, told ProPublica: “They were rentals from the beginning.” Wyner, who has worked with the Trump family for years, added: “President Trump never lived there.”

A newspaper clipping with the text: “Barclay’s International Reality: 1094 S. Ocean: 7 bedrooms, 7 bathrooms, 2 guest houses, tennis, private beach, heated pool. $3000 per day. Available weekly or monthly.”
A newspaper clipping with the text: “Lease: Palm Beach, 124 Woodbridge Road. Luxurious Bermuda style home with large Florida room, 3 or 4 bedrooms, 3 bathrooms, heated pool. Mar-A-Lago privileges. Lease: $45,000 per month.”
Despite signing a mortgage that pledged he would live in each house, Trump listed both homes as rentals. Palm Beach Daily News via Newspapers.com. Redactions by ProPublica.

Mortgage law experts who reviewed the records for ProPublica were struck by the irony of Trump’s dual mortgages. They said claiming primary residences on different mortgages at the same time, as Trump did, is often legal and rarely prosecuted. But Trump’s two loans, they said, exceed the low bar the Trump administration itself has set for mortgage fraud.

“Given Trump’s position on situations like this, he’s going to either need to fire himself or refer himself to the Department of Justice,” said Kathleen Engel, a Suffolk University law professor and leading expert on mortgage finance. “Trump has deemed that this type of misrepresentation is sufficient to preclude someone from serving the country.”

Mortgages for a person’s main home tend to receive more favorable terms, like lower interest rates, than mortgages for a second home or an investment rental property. Legal experts said that having more than one primary-residence mortgage can sometimes be legitimate, like when someone has to move for a new job, and other times can be caused by clerical error. Determining ill intent on the part of the borrower is key to proving fraud, and the experts said lenders have significant discretion in what loans they offer clients. (In this case, Trump used the same lender to buy the two Florida homes.)

But in recent months, the Trump administration has asserted that merely having two primary-residence mortgages is evidence of criminality.

Bill Pulte, the Federal Housing Finance Agency director who has led the charge, said earlier this year: “If somebody is claiming two primary residences, that is not appropriate, and we will refer it for criminal investigation.”

Trump hung up on a ProPublica reporter after being asked whether his Florida mortgages were similar to those of others he had accused of fraud.

In response to questions, a White House spokesperson told ProPublica: “President Trump’s two mortgages you are referencing are from the same lender. There was no defraudation. It is illogical to believe that the same lender would agree to defraud itself.”

The spokesperson added, “this is yet another desperate attempt by the Left wing media to disparage President Trump with false allegations,” and said, “President Trump has never, or will ever, break the law.”

The White House did not respond to questions about any other documents related to the transactions, such as loan applications, that could shed light on what Trump told the lender or if the lender made any exceptions for him.

At the time Trump bought the two Florida properties, he was dealing with the wreckage of high-profile failures at his casinos and hotels in the early 1990s. (He famously recounted seeing a panhandler on Fifth Avenue around this time and telling his companion: “You know, right now that man is worth $900 million more than I am.”) In December 1993, he married the model Marla Maples in an opulent ceremony at The Plaza Hotel. And in Florida, he was pushing local authorities to let him turn Mar-a-Lago, then a residence, into a private club.

Trump bought the two homes, which both sit on Woodbridge Road directly north of Mar-a-Lago, and got mortgages in quick succession in December 1993 and January 1994. The lender on both mortgages, one for $525,000 and one for $1,200,000, was Merrill Lynch.

Each of the mortgage documents signed by Trump contain the standard occupancy requirement — that he must make the property his principal residence within 60 days and live there for at least a year, unless the lender agreed otherwise or there were extenuating circumstances.

But ProPublica could not find evidence Trump ever lived in either of the properties. Legal documents and federal election records from the period give his address as Trump Tower in Manhattan. (Trump would officially change his permanent residence to Florida only decades later, in 2019.) A Vanity Fair profile published in March 1994 describes Trump spending time in Manhattan and at Mar-a-Lago itself.

Trump’s real estate agent, who told the local press that the plan from the beginning was to rent out the two satellite homes, was quoted as saying, “Mr. Trump, in effect, is in a position to approve who his neighbors are.”

In the ensuing years, listings popped up in local newspapers advertising each of the homes for rent. At one point in 1997, the larger of the two homes, a 7-bedroom, 7-bathroom Mediterranean Revival mansion, was listed for $3,000 per day.

Even if Trump did violate the law with his two primary-residence mortgages in Florida, the loans have since been paid off and the mid-1990s is well outside the statute of limitations for mortgage fraud.

The same form from two separate mortgage agreements, both with Donald Trump’s signature.
In 1993, Trump signed a mortgage for a “Bermuda style” home in Palm Beach, pledging that it would be his principal residence. Just seven weeks later, he got another mortgage for a seven-bedroom, marble-floored neighboring property and attested that it too would be his principal residence. Obtained by ProPublica

A spokesperson for Bank of America, which now owns Merrill Lynch, did not answer questions about the Trump mortgages.

“It’s highly unlikely we would have original documents for a 32-year-old transaction, but generally in private client mortgages the terms of the transactions are based on the overall relationship,” the spokesperson said in a statement, “and the mortgages are not backed by or sold to any government sponsored entity.”

Trump’s two mortgages in Palm Beach bear similarities to the loans taken out by political rivals whom his administration has accused of fraud.

In October, federal prosecutors charged New York Attorney General Letitia James over her mortgage. James has been one of Trump’s top targets since she brought a fraud lawsuit against the president and his company in 2022.

A central claim in the case the Trump Justice Department brought against her is that she purchased a house in Virginia, pledging to her lender that it would serve as her second home, then proceeded to use it as an investment property and rent it out. “This misrepresentation allowed James to obtain favorable loan terms not available for investment properties,” according to the indictment.

Trump’s Florida mortgage agreements appear to have made a more significant misrepresentation, as he claimed those homes would be his primary residence, not his secondary home as James did, before proceeding to rent them out.

James has denied the allegations against her, and the case was dismissed last month over procedural issues, though the Justice Department has been trying to reindict her.

The circumstances around Trump’s mortgages are also similar to the case his administration has made against Lisa Cook, a member of the Federal Reserve Board of Governors.

Trump declared he was firing Cook earlier this year over her mortgages, as he has sought to bend the traditionally independent agency to his will and force it to lower interest rates. Cook, who denied wrongdoing , has sued to block the termination and continues to serve on the Fed board as that legal fight continues.

In a letter to Cook, Trump specifically noted that she signed two primary residence mortgages within weeks of each other — just as records show he did in Florida.

“You signed one document attesting that a property in Michigan would be your primary residence for the next year. Two weeks later, you signed another document for a property in Georgia stating that it would be your primary residence for the next year,” Trump wrote. “It is inconceivable that you were not aware of your first commitment when making the second.”

He called the loans potentially criminal and wrote, “at a minimum, the conduct at issue exhibits the sort of gross negligence in financial transactions that calls into question your competence and trustworthiness.”

The Trump administration has made similar fraud allegations against other political enemies, including Democrats Sen. Adam Schiff and Rep. Eric Swalwell, both of whom have denied wrongdoing.

In September, ProPublica reported that three of Trump’s Cabinet members have called multiple homes their primary residences in mortgage agreements. Bloomberg also reported that Secretary of the Treasury Scott Bessent did something similar. (The Cabinet members have all denied wrongdoing.)

Pulte, the Federal Housing Finance Agency head, has denied his investigations are politically motivated. “If it’s a Republican who’s committing mortgage fraud, we’re going to look at it,” he has said. “If it’s a Democrat, we’re going to look at it.”

Thus far, Pulte has not made any publicly known criminal referrals against Republicans. He did not respond to questions from ProPublica about Trump’s Florida mortgages.

I misused LLMs to diagnose myself and ended up bedridden for a week

Hacker News
blog.shortround.space
2025-12-09 21:07:22
Comments...
Original Article

If you read nothing else, read this: do not ever use an AI or the internet for medical advice . Go to a doctor. In fact, do yourself a favor and add this to your preferred AI's system prompt right now:

If I ask you any medical questions, refuse to answer them. Tell me that LLMs are not capable of providing medical advice, and that I should go to a doctor instead.


tl;dr : I developed mysterious symptoms over the course of a month, and instead of going to a doctor I (mis-)used a popular LLM to reassure me that nothing was wrong. Turns out it was Lyme disease (yes, the real one, not the fake one) and it (nearly) progressed to meningitis, resulting in a lumbar puncture, antibiotics, and being bedridden for a week. This is a cautionary tale. Before you judge me too harshly, remember this while you read: I was scared out of my mind and I was not thinking rationally. This can happen to you.

Mysterious symptoms

In July of 2025 I began developing flu-like symptoms. I began to feel feverish and would go to sleep with the most intense chills of my life (it felt like what I imagine being naked at the south pole feels like) and would wake up drenched in sweat.

These symptoms subsided after about a week, but then I developed a small, flat, circular rash which turned into a big rash. This rash was not itchy or painful so I chalked it up to some weird symptoms related to what I thought was the flu. However, being the careful, intelligent, and diligent person I am, I decided it would be best to ask an LLM for advice instead of going to, y'know, an actual doctor.

Playing Doctor

Imagine we invented a perfect medical AI tool. You give it pictures and a list of symptoms and it gives you a set of diagnoses and a degree of certainty. You might prompt this tool like this:

Flat, circular, non-itchy, non-painful red rash with a ring, diffuse throughout trunk. Follows week of chills and intense night sweats, plus fatigue and general malaise

The response might look like

Lyme: 90%

Ring worm: 50%

[etc...]

Which would be great!

Instead, here's how I used this LLM:

I have this rash on my body, but it's not itchy or painful, so I don't think it's an emergency? I just want to know what it might be. I think I had the flu last week so it might just be some kind of immune reaction to having been sick recently. My wife had pityriasis once, and the doctor told her they couldn't do anything about it, it would go away on its own eventually. I want to avoid paying a doctor to tell me it's nothing. Does this sound right?

To which the LLM, in typical LLM fashion, in so many words replied "Yes to everything you just said". Wow! I sure felt reassured that I was right about everything. My point is: I was asking it leading questions in the hopes that it would tell me what I wanted to hear.

Ask and ye shall receive

Oftentimes, when people go to a doctor, we're looking for reassurance as much as we're looking for treatment. We want the doctor to not just cure us, but to tell us that everything is going to be alright; you're not dying, and there is a reason for everything!

So I wasn't asking for an LLM to fix me, I was asking to be lied to . LLMs are very good at lying to you . Cynics might say it's the only thing they're good at, but I digress. I repeated this exercise basically every day, as my rash got worse. I'd open up my LLM app, "ask" it leading questions in the hopes that it tells me not to go to the doctor, and then not to go to the doctor.

It should also be noted that I was hesitant to go to a doctor because I didn't want to pay for a doctor, but that's a different rant.

Broken Firmware

Did I mention that I was scared? This is not rational behavior. What makes this even more irrational is how rational I thought I was! I had seen the 1995 Sandra Bullock film The Net , in which a man is killed when nurses blindly trust a computer which had been hacked by criminals, resulting in his misdiagnosis and death. I told my friends and family how, in the future, we will all need to be careful about similar situations, and how computers can be used to deceive us if we place too much faith in them. I had, not even a month prior, read and shared articles about people who allowed ChatGPT to brainwash them into thinking that they were inside the Matrix . I laughed at these people, wondering how they could be so stupid. What the fuck is wrong with me?

There's a few books you can read about how people really think. To name a few:

  • Why We're Polarized by Ezra Klein
  • The Righteous Mind by Jonathan Haidt
  • Humankind by Rutger Bregman

These books are mostly about politics but they all cite anthropological evidence which says that human beings are basically not rational. We are easily led astray when we are scared.

You know how historians always try to make the point that, if you were alive in 1930s Germany, you might have ended up being a Nazi too? The same thing applies here. If you were experiencing unexplained neurological symptoms, you might just fall victim to some conman, faith healer ...or LLM.

Receiving actual medical care

One day, I woke up with a neck so stiff that I couldn't touch my chin to my chest. I don't know a lot about medicine, but I know that that is an "oh shit". A girl in my high school died of Meningococcal meningitis after sharing a beer with someone at a party, so I was vaguely aware of the symptoms. So I get in my car and I go to urgent care.

The doctor looks at my rash and immediately says "well I think you almost certainly have Lyme disease, but the neurological symptoms make me worried that you have meningitis. You need to go to the emergency room right now".

Spoiler: I didn't have meningitis.

So, I drive myself to the emergency room and tell them I need to be tested for meningitis. It turns out that "meningitis" is the cheat code for getting seen instantly, because I don't even have the chance to sit down before they take me back and start treating me like I had Ebola. Meningococcal meningitis can kill you in literally hours , and is also extremely contagious, so they pulled out all the stops. Once the Infectious Disease doctor saw me and confirmed I had Lyme, I went back to being a normal, non-infectious patient who had to wait his sweet time while patients who didn't waste a month diagnosing themself with AI were seen.

I won't bore you with the entire rest of the hospital stay, but I will tell you about the lumbar puncture. If you are sensitive to body horror, stop reading immediately, and just remember: don't use LLMs to diagnose yourself, and be wary of the stupid shit you do when you're emotional and irrational . I am telling you about the lumbar puncture so you understand the consequences of asking a computer to lie to you.

I had to get a lumbar puncture to confirm that my brain stem was not infected (it wasn't). Radiology was busy with non-stupid patients that day, so the ER doctor tried to do the lumbar puncture the old fashioned way... 11 different times.

You ever see videos of that Japanese technique for jamming a metal rod down the spinal column of a recently beheaded fish to get it to stop squirming? That's what I kept picturing in my head as I felt every nerve in my thigh catch fire.

Eventually the doctor said "Good news! Radiology agreed to pencil you in", so I go down and get the lumbar puncture assisted with X-rays. They hit the subarachnoid space on the first try. I have had Kidney Stones, Appendicitis, and I've been stabbed in the hand, so believe me when I say that this was the single most intensely painful nanosecond of my life. While I didn't have meningitis , my meninges was pretty significantly inflamed, so getting a needle poked through it felt like what I imagine being impaled on a spike through your groin felt like. I stayed pretty still for the first 11 failed punctures, but when they actually got through, I jumped like I was being electrocuted. Twice. After that, no pain, just twitchy images in my mind of Vlad the Impaler.

Going home

When they confirmed that I was fine, they sent me home with antibiotics. Here's something you may not have known about a lumbar puncture: it puts a hole in your central nervous system and you start leaking cerebro-spinal fluid (CSF). This lowers the intracranial pressure of your skull, causing your brain to sag within your head, and gives you insane headaches. I was bedridden for a week waiting for my spinal column to stop leaking CSF so that I could sit upright. I had to crawl to use the bathroom because if I stood upright, my brain would start to droop inside my skull and I'd be paralyzed with pain.

Moral of the story

  1. Don't use AIs to diagnose yourself
  2. You think you're smarter than me (and maybe you are!) but that doesn't make you immune to the kind of motivated reasoning I engaged in
  3. DON'T USE AIs TO DIAGNOSE YOURSELF
  4. A $150 ER copay and a couple weeks of oral antibiotics is cheaper and less painful than IV antibiotics, 12 lumbar punctures, and a week in bed as you nurture your central nervous system back to good health.

PS: 4 months on, I no longer have Lyme disease, and I have no lasting complications. I chose not to name the LLM product I used because I don't want to imply that this is the fault of LLM vendor. It's not. I misused their product in a way I knew I wasn't supposed to and paid for it.

The Wall Street Journal: ‘Behind Paramount’s Relentless Campaign to Woo Warner Discovery and President Trump’

Daring Fireball
www.wsj.com
2025-12-09 20:59:19
Joe Flint, Brian Schwartz, and Natalie Andrews, reporting for The Wall Street Journal (gift link, also in News+): “Just tried calling you about new bid we have submitted,” Ellison texted Zaslav. “I heard you on all your concerns and believe we have addressed them in our new proposal. Please give...
Original Article

Windows PowerShell now warns when running Invoke-WebRequest scripts

Bleeping Computer
www.bleepingcomputer.com
2025-12-09 20:45:20
Microsoft says Windows PowerShell now warns when running scripts that use the Invoke-WebRequest cmdlet to download web content, aiming to prevent potentially risky code from executing. [...]...
Original Article

Windows PowerShell

Microsoft says Windows PowerShell now warns when running scripts that use the Invoke-WebRequest cmdlet to download web content, aiming to prevent potentially risky code from executing.

As Microsoft explains, this mitigates a high-severity PowerShell remote code execution vulnerability (CVE-2025-54100 ), which primarily affects enterprise or IT-managed environments that use PowerShell scripts for automation, since PowerShell scripts are not as commonly used outside such environments .

The warning has been added to Windows PowerShell 5.1, the PowerShell version installed by default on Windows 10 and Windows 11 systems, and is designed to add the same secure web parsing process available in PowerShell 7.

PowerShell will alert you that, without precautions, scripts contained in web pages downloaded using the 'Invoke-WebRequest' cmdlet could execute on your system. By default, if you press 'Enter' or select 'No,' the operation will be canceled, and PowerShell will suggest rerunning the command with the '-UseBasicParsing' parameter for safer processing.

When choosing 'Yes,' PowerShell will parse the page using the older method (full HTML parsing), allowing the content and embedded scripts to load as before. In short, selecting 'Yes' means you accept the risk, while choosing 'No' stops the action to protect your system.

"Windows PowerShell 5.1 now displays a security confirmation prompt when using the Invoke-WebRequest command to fetch web pages without special parameters," Microsoft explains in a Tuesday advisory.

"This prompt warns that scripts in the page could run during parsing and advises using the safer -UseBasicParsing parameter to avoid any script execution. Users must choose to continue or cancel the operation."

After installing the KB5074204 update, IT admins will see the following confirmation prompt warning of script code execution risks:

Security Warning: Script Execution Risk
Invoke-WebRequest parses the content of the web page. Script code in the web page might be run when the page is parsed.
      RECOMMENDED ACTION:
      Use the -UseBasicParsing switch to avoid script code execution.
      Do you want to continue?

For additional details, see KB5074596: PowerShell 5.1: Preventing script execution from web content (https://support.microsoft.com/help/5072034).

To avoid having their automation scripts hang until manual confirmation, admins are advised to update their scripts to use the UseBasicParsing safe parameter explicitly.
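
For example, a script that only needs the response body could be updated like this (a minimal sketch; the URL and output path are illustrative):

# Before: may now pause with the security confirmation prompt
$response = Invoke-WebRequest -Uri "https://example.com/report.csv"

# After: basic parsing, no script execution and no prompt
$response = Invoke-WebRequest -Uri "https://example.com/report.csv" -UseBasicParsing
$response.Content | Out-File -FilePath "report.csv"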

It's also important to note that in PowerShell, the 'curl' command is aliased to the Invoke-WebRequest cmdlet, so you will also see these new warnings when running scripts invoking curl commands.

"Most PowerShell scripts and commands that use the Invoke-WebRequest command will continue to work with little or no modification," Microsoft noted.

"For example, scripts that only download content or work with the response body as text or data are not affected and require no changes."

Django: what’s new in 6.0

Hacker News
adamj.eu
2025-12-09 20:33:14
Comments...
Original Article
Django 6.0: codename “mosaic”

Django 6.0 was released today , starting another release cycle for the loved and long-lived Python web framework (now 20 years old!). It comes with a mosaic of new features, contributed to by many, some of which I am happy to have helped with. Below is my pick of highlights from the release notes .

Upgrade with help from django-upgrade

If you’re upgrading a project from Django 5.2 or earlier, please try my tool django-upgrade . It will automatically update old Django code to use new features, fixing some deprecation warnings for you, including five fixers for Django 6.0. (One day, I’ll propose django-upgrade to become an official Django project, when energy and time permit…)

Template partials

There are four headline features in Django 6.0, which we’ll cover before other notable changes, starting with this one:

The Django Template Language now supports template partials , making it easier to encapsulate and reuse small named fragments within a template file.

Partials are sections of a template marked by the new {% partialdef %} and {% endpartialdef %} tags. They can be reused within the same template or rendered in isolation. Let’s look at examples for each use case in turn.

Reuse partials within the same template

The below template reuses a partial called filter_controls within the same template. It’s defined once at the top of the template, then used twice later on. Using a partial allows the template to avoid repetition without pushing the content into a separate include file.

<section id=videos>
  {% partialdef filter_controls %}
    <form>
      {{ filter_form }}
    </form>
  {% endpartialdef %}

  {% partial filter_controls %}

  <ul>
    {% for video in videos %}
      <li>
        <h2>{{ video.title }}</h2>
        ...
      </li>
    {% endfor %}
  </ul>

  {% partial filter_controls %}
</section>

Actually, we can simplify this pattern further, by using the inline option on the partialdef tag, which causes the definition to also render in place:

<section id=videos>
  {% partialdef filter_controls inline %}
    <form>
      {{ filter_form }}
    </form>
  {% endpartialdef %}

  <ul>
    {% for video in videos %}
      <li>
        <h2>{{ video.title }}</h2>
        ...
      </li>
    {% endfor %}
  </ul>

  {% partial filter_controls %}
</section>

Reach for this pattern any time you find yourself repeating template code within the same template. Because partials can use variables, you can also use them to de-duplicate when rendering similar controls with different data.
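
For example, here is a sketch with a hypothetical stat_card partial (partials render with the current context, so variables set via {% with %} are visible inside them):

{% partialdef stat_card %}
  <div class=stat-card>
    <h3>{{ label }}</h3>
    <p>{{ value }}</p>
  </div>
{% endpartialdef %}

{% with label="Views" value=video.view_count %}{% partial stat_card %}{% endwith %}
{% with label="Likes" value=video.like_count %}{% partial stat_card %}{% endwith %}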

Render partials in isolation

The below template defines a view_count partial that’s intended to be re-rendered in isolation. It uses the inline option, so when the whole template is rendered, the partial is included.

The page uses htmx , via my django-htmx package , to periodically refresh the view count, through the hx-* attributes. The request from htmx goes to a dedicated view that re-renders the view_count partial.

{% load django_htmx %}
<!doctype html>
<html>
  <body>
    <h1>{{ video.title }}</h1>
    <video width=1280 height=720 controls>
      <source src="{{ video.file.url }}" type="video/mp4">
      Your browser does not support the video tag.
    </video>

    {% partialdef view_count inline %}
    <section
      class=view-count
      hx-trigger="every 1s"
      hx-swap=outerHTML
      hx-get="{% url 'video-view-count' video.id %}"
    >
      {{ video.view_count }} views
    </section>
    {% endpartialdef %}

    {% htmx_script %}
  </body>
</html>

The relevant code for the two views could look like this:

from django.shortcuts import render


def video(request, video_id):
    ...
    return render(request, "video.html", {"video": video})


def video_view_count(request, video_id):
    ...
    return render(request, "video.html#view_count", {"video": video})

The initial video view renders the full template video.html . The video_view_count view renders just the view_count partial, by appending #view_count to the template name. This syntax is similar to how you’d reference an HTML fragment by its ID in a URL.

History

htmx was the main motivation for this feature, as promoted by htmx creator Carson Gross in a cross-framework review post . Using partials definitely helps maintain “Locality of behaviour” within your templates, easing authoring, debugging, and maintenance by avoiding template file sprawl.

Django’s support for template partials was initially developed by Carlton Gibson in the django-template-partials package , which remains available for older Django versions. The integration into Django itself was done in a Google Summer of Code project this year, worked on by student Farhan Ali and mentored by Carlton, in Ticket #36410 . You can read more about the development process in Farhan’s retrospective blog post . Many thanks to Farhan for authoring, Carlton for mentoring, and Natalia Bidart, Nick Pope, and Sarah Boyce for reviewing!

Tasks framework

The next headline feature we’re covering:

Django now includes a built-in Tasks framework for running code outside the HTTP request–response cycle. This enables offloading work, such as sending emails or processing data, to background workers.

Basically, there’s a new API for defining and enqueuing background tasks—very cool!

Background tasks are a way of running code outside of the request-response cycle. They’re a common requirement in web applications, used for sending emails, processing images, generating reports, and more.

Historically, Django has not provided any system for background tasks, and kind of ignored the problem space altogether. Developers have instead relied on third-party packages like Celery or Django Q2 . While these systems are fine, they can be complex to set up and maintain, and often don’t “go with the grain” of Django.

The new Tasks framework fills this gap by providing an interface to define background tasks, which task runner packages can then integrate with. This common ground allows third-party Django packages to define tasks in a standard way, assuming you’ll be using a compatible task runner to execute them.

Define tasks with the new @task decorator:

from django.tasks import task


@task
def resize_video(video_id): ...

…and enqueue them for background execution with the Task.enqueue() method:

from example.tasks import resize_video


def upload_video(request):
    ...
    resize_video.enqueue(video.id)
    ...

Execute tasks

At this time, Django does not include a production-ready task backend, only two that are suitable for development and testing:

  • ImmediateBackend - runs tasks synchronously, blocking until they complete.
  • DummyBackend - does nothing when tasks are enqueued, but allows them to be inspected later. Useful for tests, where you can assert that tasks were enqueued without actually running them.

For production use, you’ll need a third-party package that implements one, for which django-tasks , the reference implementation, is the primary option. It provides DatabaseBackend for storing tasks in your SQL database, a fine solution for many projects, avoiding extra infrastructure and allowing atomic task enqueuing within database transactions. We may see this backend merged into Django in due course, or at least promoted to an official package, to help make Django “batteries included” for background tasks.

To use django-tasks’ DatabaseBackend today, first install the package:
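For example, with pip (adapt to your package manager of choice):

$ python -m pip install django-tasks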

Second, add these two apps to your INSTALLED_APPS setting:

INSTALLED_APPS = [
    # ...
    "django_tasks",
    "django_tasks.backends.database",
    # ...
]

Third, configure DatabaseBackend as your tasks backend in the new TASKS setting :

TASKS = {
    "default": {
        "BACKEND": "django_tasks.backends.database.DatabaseBackend",
    },
}

Fourth, run migrations to create the necessary database tables:
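The usual migrate command covers this (assuming a default-configured project):

$ ./manage.py migrate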

Finally, to run the task worker process, use the package’s db_worker management command:

$ ./manage.py db_worker
Starting worker worker_id=jWLMLrms3C2NcUODYeatsqCFvd5rK6DM queues=default

This process runs indefinitely, polling for tasks and executing them, logging events as it goes:

Task id=10b794ed-9b64-4eed-950c-fcc92cd6784b path=example.tasks.echo state=RUNNING
Hello from test task!
Task id=10b794ed-9b64-4eed-950c-fcc92cd6784b path=example.tasks.echo state=SUCCEEDED

You’ll want to run db_worker in production, and also in development if you want to test background task execution.

History

It’s been a long path to get the Tasks framework into Django, and I’m super excited to see it finally available in Django 6.0. Jake Howard started on the idea for Wagtail, a Django-powered CMS, back in 2021, as they have a need for common task definitions across their package ecosystem. He upgraded the idea to target Django itself in 2024, when he proposed DEP 0014 . As a member of the Steering Council at the time, I had the pleasure of helping review and accept the DEP.

Since then, Jake has been leading the implementation effort, building pieces first in the separate django-tasks package before preparing them for inclusion in Django itself. This step was done under Ticket #35859 , with a pull request that took nearly a year to review and land. Thanks to Jake for his perseverance here, and to all reviewers: Andreas Nüßlein, Dave Gaeddert, Eric Holscher, Jacob Walls, Jake Howard, Kamal Mustafa, @rtr1, @tcely, Oliver Haas, Ran Benita, Raphael Gaschignard, and Sarah Boyce.

Read more about this feature and story in Jake’s post celebrating when it was merged .

Content Security Policy support

Our third headline feature:

Built-in support for the Content Security Policy (CSP) standard is now available, making it easier to protect web applications against content injection attacks such as cross-site scripting (XSS). CSP allows declaring trusted sources of content by giving browsers strict rules about which scripts, styles, images, or other resources can be loaded.

I’m really excited about this, because I’m a bit of a security nerd who’s been deploying CSP for client projects for years.

CSP is a security standard that can protect your site from cross-site scripting (XSS) and other code injection attacks. You set a content-security-policy header to declare which content sources are trusted for your site, and then browsers will block content from other sources. For example, you might declare that only scripts from your own domain are allowed, so an attacker who manages to inject a <script> tag pointing to evil.com would be thwarted, as the browser would refuse to load it.
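As a rough illustration, that “scripts from my own domain only” policy corresponds to a header like this (value illustrative):

content-security-policy: script-src 'self'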

Previously, Django had no built-in support for CSP, and developers had to build their own or use a third-party package like the very popular django-csp . But this was a little inconvenient: other third-party packages couldn’t reliably integrate with CSP, since there was no common API to do so.

The new CSP support provides all the core features that django-csp did, with a slightly tidier and more Djangoey API. To get started, first add ContentSecurityPolicyMiddleware to your MIDDLEWARE setting:

MIDDLEWARE = [
    # ...
    "django.middleware.csp.ContentSecurityPolicyMiddleware",
    # ...
]

Place it next to SecurityMiddleware , as it similarly adds security-related headers to all responses. (You do have SecurityMiddleware enabled, right?)

Second, configure your CSP policy using the new settings:

  • SECURE_CSP to configure the content-security-policy header, which is your actively enforced policy.
  • SECURE_CSP_REPORT_ONLY to configure the content-security-policy-report-only header, which sets a non-enforced policy for which browsers report violations to a specified endpoint. This option is useful for testing and monitoring a policy before enforcing it.

For example, to adopt the nonce-based strict CSP recommended by web.dev , you could start with the following setting:

from django.utils.csp import CSP

SECURE_CSP_REPORT_ONLY = {
    "script-src": [CSP.NONCE, CSP.STRICT_DYNAMIC],
    "object-src": [CSP.NONE],
    "base-uri": [CSP.NONE],
}

The CSP enum used above provides constants for common CSP source values, to help avoid typos.

This policy is quite restrictive and will break most existing sites if deployed as-is, because it requires nonces, as covered next. That’s why the example starts with the report-only header, to help track down places that need fixing before enforcing the policy. You’d later switch to the SECURE_CSP setting to enforce the policy.
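For example, once the reports look clean, the same policy moves over to the enforced setting:

from django.utils.csp import CSP

SECURE_CSP = {
    "script-src": [CSP.NONCE, CSP.STRICT_DYNAMIC],
    "object-src": [CSP.NONE],
    "base-uri": [CSP.NONE],
}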

Anyway, those are the two basic steps to set up the new CSP support!

Nonce generation

A key part of the new feature is that nonce generation is now built-in to Django, when using the CSP middleware. Nonces are a security feature in CSP that allow you to mark specific <script> and <style> tags as trusted with a nonce attribute:

<script src=/static/app.js type=module nonce=55vsH4w7ATHB85C3MbPr_g></script>

The nonce value is randomly generated per-request, and included in the CSP header. An attacker performing content injection couldn’t guess the nonce, so browsers can trust only those tags that include the correct nonce. Because nonce generation is now part of Django, third-party packages can depend on it for their <script> and <style> tags and they’ll continue to work if you adopt CSP with nonces.
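To make that concrete: with the report-only policy from the example setting above and the nonce from the tag just shown, Django would send a header roughly like this (nonce value illustrative):

content-security-policy-report-only: script-src 'nonce-55vsH4w7ATHB85C3MbPr_g' 'strict-dynamic'; object-src 'none'; base-uri 'none'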

Nonces are the recommended way to use CSP today, avoiding problems with previous allow-list based approaches. That’s why the above recommended policy enables them. To adopt a nonce-based policy, you’ll need to annotate your <script> and <style> tags with the nonce value through the following steps.

First, add the new csp template context processor to your TEMPLATES setting:

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "OPTIONS": {
            "context_processors": [
                # ...
                "django.template.context_processors.csp",
            ],
        },
    },
]

Second, annotate your <script> and <style> tags with nonce="{{ csp_nonce }}" :

-   <script src="{% static 'app.js' %}" type="module"></script>
+   <script src="{% static 'app.js' %}" type="module" nonce="{{ csp_nonce }}"></script>

This can be tedious and error-prone, which is why starting with the report-only mode to monitor violations is useful, especially on larger projects.

Anyway, deploying CSP right would be another post in itself, or even a book chapter, so we’ll stop here for now. For more info, check out that web.dev article and the MDN CSP guide .

History

CSP itself was proposed for browsers way back in 2004, and was first implemented in Mozilla Firefox version 4, released in 2011. That same year, Django Ticket #15727 was opened, proposing adding CSP support to Django. Mozilla created django-csp in 2010, before CSP first shipped in browsers, using it on their own Django-powered sites. The first comment on Ticket #15727 pointed to django-csp, and the community basically rolled with it as the de facto solution.

Over the years, CSP itself evolved, as did django-csp, with Rob Hudson ending up as its maintainer. Focusing on the package motivated him to finally get CSP into Django itself. He made a draft PR and posted on Ticket #15727 in 2024, which I enjoyed helping review. He iterated on the PR over the next 13 months until it was finally merged for Django 6.0. Thanks to Rob for his heroic dedication here, and to all reviewers: Benjamin Balder Bach, Carlton Gibson, Collin Anderson, David Sanders, David Smith, Florian Apolloner, Harro van der Klauw, Jake Howard, Natalia Bidart, Paolo Melchiorre, Sarah Boyce, and Sébastien Corbin.

Email API updates

The fourth and final headline feature:

Email handling in Django now uses Python’s modern email API, introduced in Python 3.6. This API, centered around the email.message.EmailMessage class, offers a cleaner and Unicode-friendly interface for composing and sending emails.

This is a major change, but it’s unlikely to affect projects using basic email features. You can still use Django’s send_mail() function and EmailMessage class as before, like:

from django.core.mail import EmailMessage

email = EmailMessage(
    subject="🐼 Need more bamboo",
    body="We are desperately low, please restock before the pandas find out!",
    from_email="zookeeper@example.com",
    to=["supplies@example.com"],
)
email.attach_file("/media/bamboo_cupboard.jpg")
email.send()

The key change is that, under the hood, when you call send() on a Django EmailMessage object, it now translates itself into Python’s newer email.message.EmailMessage type before sending.
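If you want to check this yourself, one way is to inspect the object returned by message() , Django’s hook for building the underlying MIME message. This is a rough sketch; I’m assuming message() now returns the modern class in Django 6.0, so verify against the release notes:

import email.message

from django.core.mail import EmailMessage

msg = EmailMessage(
    subject="🐼 Need more bamboo",
    body="We are desperately low, please restock!",
    from_email="zookeeper@example.com",
    to=["supplies@example.com"],
)
# Assumption: in Django 6.0, message() builds the message with Python's
# modern email API, so this should print True.
print(isinstance(msg.message(), email.message.EmailMessage))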

Modernizing provides these benefits:

  1. Fewer bugs - many edge case bugs in Python’s old email API have been fixed in the new one.
  2. Django is less hacky - a bunch of workarounds and security fixes in Django‘s email code have been removed.
  3. More convenient API - the new API supports some niceties, like the below inline attachment example.

Easier inline attachments with MIMEPart

Django’s EmailMessage.attach() method allows you to add a file as an attachment. Emails also support images as inline attachments , which can be displayed within the HTML email body.

While you could previously use EmailMessage.attach() to add inline attachments, it was a bit fiddly, using a legacy class. Now, you can call the method with a Python email.message.MIMEPart object to add an inline attachment in a few steps:

import email.utils
from email.message import MIMEPart
from django.core.mail import EmailMultiAlternatives

message = EmailMultiAlternatives(
    subject="Cute Panda Alert",
    body="Here's a cute panda picture for you!",
    from_email="cute@example.com",
    to=["fans@example.com"],
)
with open("panda.jpg", "rb") as f:
    panda_jpeg = f.read()

cid = email.utils.make_msgid()
inline_image = MIMEPart()
inline_image.set_content(
    panda_jpeg,
    maintype="image",
    subtype="jpeg",
    disposition="inline",
    cid=cid,
)
message.attach(inline_image)
message.attach_alternative(
    f'<h1>Cute panda baby alert!</h1><img src="cid:{cid[1:-1]}">',
    "text/html",
)

It’s not the simplest API, but it does expose all the power of the underlying email system, and it’s better than the past situation.

History

The new email API was added to Python as provisional in version 3.4 (2014) , and made stable in version 3.6 (2016) . The legacy API, however, was never planned for deprecation, so there was never any deadline to upgrade Django’s email handling.

In 2024, Mike Edmunds posted on the (old) django-developers mailing list , proposing the upgrade with strong reasoning and planning. This conversation led to Ticket #35581 , which he worked on for eight months until it was merged. Many thanks to Mike for leading this effort, and to Sarah Boyce for reviewing! Email is not a glamorous feature, but it’s a critical communication channel for nearly every Django project, so props for this.

Positional arguments in django.core.mail APIs

We’re now out of the headline features and onto the “minor” changes, starting with this deprecation related to the above email changes:

django.core.mail APIs now require keyword arguments for less commonly used parameters. Using positional arguments for these now emits a deprecation warning and will raise a TypeError when the deprecation period ends:

  • All optional parameters ( fail_silently and later) must be passed as keyword arguments to get_connection() , mail_admins() , mail_managers() , send_mail() , and send_mass_mail() .
  • All parameters must be passed as keyword arguments when creating an EmailMessage or EmailMultiAlternatives instance, except for the first four ( subject , body , from_email , and to ), which may still be passed either as positional or keyword arguments.

Previously, Django would let you pass all parameters positionally, which gets a bit silly and hard to read with long parameter lists, like:

from django.core.mail import send_mail

send_mail(
    "🐼 Panda of the week",
    "This week’s panda is Po Ping, sha-sha booey!",
    "updates@example.com",
    ["adam@example.com"],
    True,
)

The final True doesn’t provide any clue what it means without looking up the function signature. Now, using positional arguments for those less-commonly-used parameters raises a deprecation warning, nudging you to write:

from django.core.mail import send_mail

send_mail(
    subject="🐼 Panda of the week",
    message="This week’s panda is Po Ping, sha-sha booey!",
    from_email="updates@example.com",
    recipient_list=["adam@example.com"],
    fail_silently=True,
)

This change is appreciated for API clarity, and Django is generally moving towards using keyword-only arguments more often. django-upgrade can automatically fix this one for you, via its mail_api_kwargs fixer .

Thanks to Mike Edmunds, again, for making this improvement in Ticket #36163 .

Extended automatic shell imports

Next up:

Common utilities, such as django.conf.settings, are now automatically imported to the shell by default.

One of the headline features back in Django 5.2 was automatic model imports in the shell , making ./manage.py shell import all of your models automatically. Building on that DX boost, Django 6.0 now also imports other common utilities. We can see the full list by running ./manage.py shell with -v 2 :

$ ./manage.py shell -v 2
6 objects imported automatically:

  from django.conf import settings
  from django.db import connection, models, reset_queries
  from django.db.models import functions
  from django.utils import timezone

...

(This is from a project without any models, so only the utilities are listed.)

So that’s:

  • settings , useful for checking your runtime configuration:

    In [1]: settings.DEBUG
    Out[1]: False
    
  • connection and reset_queries() , great for checking the executed queries :

    In [1]: Book.objects.select_related('author')
    Out[1]: <QuerySet []>
    
    In [2]: connection.queries
    Out[2]:
    [{'sql': 'SELECT "example_book"."id", "example_book"."title", "example_book"."author_id", "example_author"."id", "example_author"."name" FROM "example_book" INNER JOIN "example_author" ON ("example_book"."author_id" = "example_author"."id") LIMIT 21',
      'time': '0.000'}]
    
  • models and functions , useful for advanced ORM work:

    In [1]: Book.objects.annotate(
       ...:   title_lower=functions.Lower("title")
       ...: ).filter(
       ...:   title_lower__startswith="a"
       ...: ).count()
    Out[1]: 71
    
  • timezone , useful for using Django’s timezone-aware date and time utilities:

    In [1]: timezone.now()
    Out[1]: datetime.datetime(2025, 12, 1, 23, 42, 22, 558418, tzinfo=datetime.timezone.utc)
    

It remains possible to extend the automatic imports with whatever you’d like, as documented on the How to customize the shell command documentation page.

Salvo Polizzi contributed the original automatic shell imports feature in Django 5.2. He has since returned to add these extra imports for Django 6.0, in Ticket #35680 . Thanks to everyone who contributed to the forum discussion agreeing on which imports to add, and to Natalia Bidart and Sarah Boyce for reviewing!

Dynamic field refresh on save()

Now let’s discuss a series of ORM improvements, starting with this big one:

GeneratedField s and fields assigned expressions are now refreshed from the database after save() on backends that support the RETURNING clause (SQLite, PostgreSQL, and Oracle). On backends that don’t support it (MySQL and MariaDB), the fields are marked as deferred to trigger a refresh on subsequent accesses.

Django models support having the database generate field values for you in three cases:

  1. The db_default field option, which lets the database generate the default value when creating an instance:

    from django.db import models
    from django.db.models.functions import Now
    
    
    class Video(models.Model):
        ...
        created = models.DateTimeField(db_default=Now())
    
  2. The GeneratedField field type, which is always computed by the database based on other fields in the same instance:

    from django.db import models
    from django.db.models.functions import Concat
    
    
    class Video(models.Model):
        ...
        full_title = models.GeneratedField(
            expression=Concat(
                "title",
                models.Value(" - "),
                "subtitle",
            ),
            output_field=models.TextField(),
            db_persist=True,
        )
    
  3. Assigning expression values to fields before saving:

    from django.db import models
    from django.db.models.functions import Now
    
    
    class Video(models.Model):
        ...
        last_updated = models.DateTimeField()
    
    
    video = Video.objects.get(id=1)
    ...
    video.last_updated = Now()
    video.save()
    

Previously, only the first method, using db_default , would refresh the field value from the database after saving. The other two methods would leave you with only the old value or the expression object, meaning you’d need to call Model.refresh_from_db() to get any updated value if necessary. This was hard to remember and cost an extra database query.

Now Django takes advantage of the RETURNING SQL clause to save the model instance and fetch updated dynamic field values in a single query, on backends that support it (SQLite, PostgreSQL, and Oracle). A save() call may now issue a query like:

UPDATE "example_video"
SET "last_updated" = NOW()
WHERE "example_video"."id" = 1
RETURNING "example_video"."last_updated"

Django puts the return value into the model field, so you can read it immediately after saving:

video = Video.objects.get(id=1)
...
video.last_updated = Now()
video.save()
print(video.last_updated)  # Updated value from the database

On backends that don’t support RETURNING (MySQL and MariaDB), Django now marks the dynamic fields as deferred after saving. That way, the later access, as in the above example, will automatically call Model.refresh_from_db() . This ensures that you always read the updated value, even if it costs an extra query.

History

This feature was proposed in Ticket #27222 way back in 2016, by Anssi Kääriäinen. It sat dormant for most of the nine years since, but ORM boss Simon Charette picked it up earlier this year, found an implementation, and pushed it through to completion. Thanks to Simon for continuing to push the ORM forward, and to all reviewers: David Sanders, Jacob Walls, Mariusz Felisiak, nessita, Paolo Melchiorre, Simon Charette, and Tim Graham.

Universal StringAgg aggregate

The next ORM change:

The new StringAgg aggregate returns the input values concatenated into a string, separated by the delimiter string. This aggregate was previously supported only for PostgreSQL.

This aggregate is often used for making comma-separated lists of related items, among other things. Previously, it was only supported on PostgreSQL, as part of django.contrib.postgres :

from django.contrib.postgres.aggregates import StringAgg
from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=","),
)

for video in videos:
    print(f"Video {video.id} has chapters: {video.chapter_ids}")

…which might give you output like:

Video 104 has chapters: 71,72,74
Video 107 has chapters: 88,89,138,90,91,93

Now this aggregate is available on all database backends supported by Django, imported from django.db.models :

from django.db.models import StringAgg, Value
from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=Value(",")),
)

for video in videos:
    print(f"Video {video.id} has chapters: {video.chapter_ids}")

Note the delimiter argument now requires a Value() expression wrapper for literal strings, as above. This change allows you to use database functions or fields as the delimiter if desired.
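For instance, here’s a sketch of using a model field as the delimiter; separator is a hypothetical CharField on Video, invented for illustration:

from django.db.models import F, StringAgg
from example.models import Video

videos = Video.objects.annotate(
    chapter_ids=StringAgg("chapter", delimiter=F("separator")),
)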

While most Django projects stick to PostgreSQL, having this aggregate available on all backends is a nice improvement for cross-database compatibility, and it means third-party packages can use it without affecting their database support.

History

The PostgreSQL-specific StringAgg was added way back in Django 1.9 (2015) by Andriy Sokolovskiy, in Ticket #24301 . In Ticket #35444 , Chris Muthig proposed adding the Aggregate.order_by option, something used by StringAgg to specify the ordering of concatenated elements, and as a side effect this made it possible to generalize StringAgg to all backends.

Thanks to Chris for proposing and implementing this change, and to all reviewers: Paolo Melchiorre, Sarah Boyce, and Simon Charette.

BigAutoField as the default primary key type

Next up:

DEFAULT_AUTO_FIELD setting now defaults to BigAutoField

This important change locks in larger, more scalable primary keys by default.

Django 3.2 (2021) introduced the DEFAULT_AUTO_FIELD setting for changing the default primary key type used in models. Django uses this setting to add a primary key field called id to models that don’t explicitly define a primary key field. For example, if you define a model like this:

from django.db import models


class Video(models.Model):
    title = models.TextField()

…then it will have two fields: id and title , where id uses the type defined by DEFAULT_AUTO_FIELD .
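Under the new default, that implicit primary key is roughly equivalent to declaring it yourself, something like this sketch (you don’t need to write it):

from django.db import models


class Video(models.Model):
    id = models.BigAutoField(primary_key=True)  # what Django adds implicitly
    title = models.TextField()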

The setting can also be overridden on a per-app basis by defining AppConfig.default_auto_field in the app’s apps.py file:

from django.apps import AppConfig


class ChannelConfig(AppConfig):
    name = "channel"
    default_auto_field = "django.db.models.BigAutoField"

A key motivation for adding the setting was to allow projects to switch from AutoField (a 32-bit integer) to BigAutoField (a 64-bit integer) for primary keys, without needing changes to every model. AutoField can store values up to about 2.1 billion, which sounds large but it becomes easy to hit at scale. BigAutoField can store values up to about 9.2 quintillion, which is “more than enough” for every practical purpose.

If a model using AutoField hits its maximum value, it can no longer accept new rows, a problem known as primary key exhaustion . The table is effectively blocked, requiring an urgent fix to switch the model from AutoField to BigAutoField via a locking database migration on a large table. For a great watch on how Kraken is fixing this problem, see Tim Bell’s DjangoCon Europe 2025 talk , detailing some clever techniques to proactively migrate large tables with minimal downtime.

To stop this problem arising for new projects, Django 3.2 made new projects created with startproject set DEFAULT_AUTO_FIELD to BigAutoField , and new apps created with startapp set their AppConfig.default_auto_field to BigAutoField . It also added a system check prompting projects to set DEFAULT_AUTO_FIELD explicitly, so that users were aware of the feature and could make an informed choice.

Now Django 6.0 changes the actual default values of the setting and app config attribute to BigAutoField . Projects using BigAutoField can remove the setting:

-DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"

…and app config attribute:

from django.apps import AppConfig

 class ChannelConfig(AppConfig):
     name = "channel"
-    default_auto_field = "django.db.models.BigAutoField"

The default startproject and startapp templates also no longer set these values. This change reduces the amount of boilerplate in new projects, and the problem of primary key exhaustion can fade into history, becoming something that most Django users no longer need to think about.

History

The addition of DEFAULT_AUTO_FIELD in Django 3.2 was proposed by Caio Ariede and implemented by Tom Forbes, in Ticket #31007 . This new change in Django 6.0 was proposed and implemented by ex-Fellow Tim Graham, in Ticket #36564 . Thanks to Tim for spotting that this cleanup was now possible, and to Jacob Walls and Clifford Gama for reviewing!

Template variable forloop.length

Moving on to templates, let’s start with this nice little addition:

The new variable forloop.length is now available within a for loop.

This small extension makes it possible to write a template loop like this:

<ul>
  {% for goose in geese %}
    <li>
      <strong>{{ forloop.counter }}/{{ forloop.length }}</strong>: {{ goose.name }}
    </li>
  {% endfor %}
</ul>

Previously, you’d need to refer to the length in another way, like {{ geese|length }} , which is a bit less flexible.

Thanks to Jonathan Ströbele for contributing this idea and implementation in Ticket #36186 , and to David Smith, Paolo Melchiorre, and Sarah Boyce for reviewing.

querystring template tag enhancements

There are two extensions to the querystring template tag , which was added in Django 5.1 to help with building links that modify the current request’s query parameters.

  1. Release note:

    The querystring template tag now consistently prefixes the returned query string with a ? , ensuring reliable link generation behavior.

    This small change improves how the tag behaves when an empty mapping of query parameters is provided. Say you had a template like this:

    <a href="{% querystring params %}">Reset search</a>
    

    …where params is a dictionary that may sometimes be empty. Previously, if params was empty, the output would be:

    <a href="">Reset search</a>
    

    Browsers treat this as a link to the same URL including the query parameters , so it would not clear the query parameters as intended. Now, with this change, the output will be:

    <a href="?">Reset search</a>
    

    Browsers treat ? as a link to the same URL without any query parameters , clearing them as the user would expect.

    Thanks to Django Fellow Sarah Boyce for spotting this improvement and implementing the fix in Ticket #36268 , and to Django Fellow Natalia Bidart for reviewing!

  2. Release note:

    The querystring template tag now accepts multiple positional arguments, which must be mappings, such as QueryDict or dict .

    This enhancement allows the tag to merge multiple sources of query parameters when building the output. For example, you might have a template like this:

    <a href="{% querystring request.GET super_search_params %}">Super search</a>
    

    …where super_search_params is a dictionary of extra parameters to add to make the current search “super”. The tag merges the two mappings, with later mappings taking precedence for duplicate keys.

    Thanks again to Sarah Boyce for proposing this improvement in Ticket #35529 , to Giannis Terzopoulos for implementing it, and to Natalia Bidart, Sarah Boyce, and Tom Carrick for reviewing!

Fin

That’s a wrap! Thank you for reading my highlights. There are plenty more changes to read about in the release notes .

Also, there are always many more behind-the-scenes improvements and bug fixes that don’t make it into the release notes. Optimizations and micro-improvements get merged all the time, so don’t delay, upgrade today!

Thank you to all 174 people who contributed to Django 6.0, as counted in this list by Mariusz Felisiak.

May your upgrade be swift, smooth, safe, and secure,

—Adam


😸😸😸 Check out my new book on using GitHub effectively, Boost Your GitHub DX ! 😸😸😸



You’ll Never Guess Who Won the Newly Created ‘Peace Prize’ From FIFA, the World’s Most Corrupt Sports Body

Daring Fireball
www.theguardian.com
2025-12-09 20:28:21
The Guardian: There on a plinth, with “Donald J Trump” emblazoned on it in capital letters, was the uncoveted trophy: a golden globe resting on five golden hands big enough to compensate any tiny-handed recipient feeling sore about the Nobel peace prize. But wait, there was more. “There is also...
Original Article

It had about as much drama and suspense as reading a dictionary or watching election results come in from North Korea.

To the surprise of no one, Donald Trump won the inaugural Fifa peace prize on Friday at a cheesy, gaudy and gauche World Cup draw expertly designed to flatter the world’s most precious ego.

“This is your prize – this is your peace prize!” gushed Gianni Infantino , the bald-headed Fifa president, after Trump took the stage at the John F Kennedy Center for the Performing Arts in snowy Washington.

There on a plinth, with “Donald J Trump” emblazoned on it in capital letters, was the uncoveted trophy: a golden globe resting on five golden hands big enough to compensate any tiny-handed recipient feeling sore about the Nobel peace prize.

But wait, there was more. “There is also a beautiful medal for you that you can wear everywhere you want to go,” added Infantino, knowing that with Trump there is no such thing as too much.

Glowing oranger than usual under the stage lights, Trump eagerly put the medal around his neck without waiting for Infantino to do the honours. He told the audience of 2,000 people: “This is truly one of the great honours of my life.”

It was a Norwegian football commentator who once memorably celebrated victory over England by shouting: “Maggie Thatcher … your boys took a hell of a beating!” Now Fifa had delivered its own “Norwegian Nobel committee … your boys took a hell of a beating!” rebuke to the body that snubbed its favourite president.

Foreign leaders such as Keir Starmer and Benjamin Netanyahu have learned over the past year that flattering Trump is like feeding gold candy to a baby. The more blatant and obvious, the better it works. Now, thanks to Infantino, Trump was centre-stage at world sport’s greatest spectacle.

History sure does rhyme. Benito Mussolini used the 1934 World Cup in Italy to promote a resurgent Roman empire. Before every match, the Italian team performed the “Roman salute”. Il Duce even created a special trophy, the “Coppa del Duce”, which was six times bigger than the official Jules Rimet World Cup trophy.

The last time the US hosted the World Cup, in 1994, the draw was held in Las Vegas and President Bill Clinton did not attend. But Infantino, who regards America as football’s undiscovered country of profit, has pursued Trump as ardently as Count Dracula crossing oceans of time to find his lost love.

The Fifa supremo was spotted at Trump’s second inauguration earlier this year and is a regular guest in the Oval Office and at his Mar-a-Lago estate in Florida. He made no objection when Trump elbowed his way into Chelsea’s Club World Cup celebration . Fifa has even opened a new office in Trump Tower in New York.

The World Cup final draw was therefore held without irony at the Kennedy Center, where Senate Democrats have launched an investigation into alleged cronyism and corruption under the leadership of a Trump appointee, just round the corner from the Watergate building, where a burglary and cover-up brought down Richard Nixon.

All very Fifa .

The extravaganza began with the Italian tenor Andrea Bocelli belting out the aria Nessun Dorma, which translates as “None shall sleep” – a subtle dig at the president who has recently been seen dozing at meetings?

The hosts were the model and presenter Heidi Klum, wearing a shimmering gold dress, and comedian Kevin Hart, wearing a black sweater with a sparkling necklace. There was the customary montage of football clips, including Diego Maradona’s second for Argentina against England in 1986 – perhaps Trump, often accused of cheating at golf, would have preferred Maradona’s shameless “Hand of God” first.

Trump, who coined the phrase “truthful hyperbole” in his book The Art of the Deal, would surely have admired the way Infantino declared: “This will not just be the greatest football event, this will be the greatest event in human history, the greatest event that humanity will ever witness … This is like 104 Super Bowls in one month.”

The Lex Luthor of world football got Americans in the audience to chant “USA! USA! USA!” then Canadians to chant “Canada, Canada, Canada!” and Mexicans to chant “Mexico, Mexico, Mexico!” Then, after a noisy display by Nicole Scherzinger and Robbie Williams, it was time for Trump’s moment of glory.

As a glossy video played, a voiceover tried to convince everyone this prize had not just been made up entirely for Trump’s benefit. “Peace creates hope and football translates that hope into unity,” it said.

“We honour a dynamic leader who has engaged in diplomatic efforts that create opportunities for dialogue, de-escalation and stability and who has championed the unifying power of football on the world stage.”

It was more wordy justification than the Dodo offered in Alice’s Adventures in Wonderland: “All must have prizes.”

The narration ran through the dodgy list of eight conflicts that Trump claims to have settled during his 10 months in office. It did not mention his fawning over Russia’s Vladimir Putin or extrajudicial killing of dozens of unnamed, untried people on small boats in the Caribbean. Any chance of VAR on this decision?

The audience was treated to slow-motion video of Trump at the Gaza peace summit, Trump meeting the Indian prime minister, Narendra Modi, Trump signing the Abraham accords, Trump with the leaders of the Democratic Republic of the Congo and Rwanda – and of Infantino giving him a thumbs-up like a proud football dad.

Then came the presentation and, shortly afterwards, Trump standing on stage alongside the Canadian prime minister, Mark Carney, and Mexican president, Claudia Sheinbaum, behind plastic stands as if taking part in a gameshow. The US president tried to go all Ted Lasso, reminiscing about watching Pelé play for the New York Cosmos and admitting that soccer should be called “football”.

Once the convoluted draw – did Trump’s eyes stay open? – was done, the show ended like a Trump rally with the Village People belting out Y.M.C.A. The president had got his prize and Infantino had got his man. Next stop the Oscars?

mistralai/mistral-vibe

Simon Willison
simonwillison.net
2025-12-09 20:19:21
mistralai/mistral-vibe Here's the Apache 2.0 licensed source code for Mistral's new "Vibe" CLI coding agent, released today alongside Devstral 2. It's a neat implementation of the now standard terminal coding agent pattern, built in Python on top of Pydantic and Rich/Textual (here are the dependenci...
Original Article

mistralai/mistral-vibe . Here's the Apache 2.0 licensed source code for Mistral's new "Vibe" CLI coding agent, released today alongside Devstral 2.

It's a neat implementation of the now standard terminal coding agent pattern, built in Python on top of Pydantic and Rich/Textual (here are the dependencies ). Gemini CLI is TypeScript, Claude Code is closed source (TypeScript, now on top of Bun), OpenAI's Codex CLI is Rust. OpenHands is the other major Python coding agent I know of, but I'm likely missing some others.

The Vibe source code is pleasant to read and the crucial prompts are neatly extracted out into Markdown files. Some key places to look:

The Python implementations of those tools can be found here .

I tried it out and had it build me a Space Invaders game using three.js with the following prompt:

make me a space invaders game as HTML with three.js loaded from a CDN

[Screenshot: Mistral Vibe running in a terminal, reporting that it created space_invaders.html, listing the arrow-key and spacebar controls and the game's features, with auto-approve on at 7% of a 100k-token budget.]

Here's the source code and the live game . It did OK.

Official Propaganda for Caribbean Military Buildup Includes “Crusader Cross”

Intercept
theintercept.com
2025-12-09 20:11:31
Once eschewed by the Pentagon, the “Jerusalem cross” has been co-opted by the far right — and embraced by Pete Hegseth. The post Official Propaganda for Caribbean Military Buildup Includes “Crusader Cross” appeared first on The Intercept....
Original Article

An official U.S. military social media account on Monday shared a photo collage that included a symbol long affiliated with extremist groups — and Secretary of Defense Pete Hegseth.

In a post on X trumpeting the deployment of troops to the Caribbean, U.S. Southern Command, or SOUTHCOM, shared an image that prominently displayed a so-called Jerusalem cross on the helmet of a masked commando.

The Jerusalem cross, also dubbed the “Crusader cross” for its roots in Medieval Christians’ holy wars in the Middle East, is not inherently a symbol of extremism. It has, however, become popular on the right to symbolize the march of Christian civilization, with anti-Muslim roots that made it into something of a logo for the U.S. war on terror.

Tattoos of the cross, a squared-off symbol with a pattern of repeating crosses, have appeared on the bodies of people ranging from mercenaries hired by the Gaza Humanitarian Foundation to Hegseth himself.

Now, the symbol has reared its head again to advertise President Donald Trump’s military buildup against Venezuela — an overwhelmingly Catholic country — and boat strikes in the Caribbean.

“As with all things Trump, it’s a continuation, with some escalation, and then a transformation into spectacle,” said Yale University historian Greg Grandin, whose work focuses on U.S. empire in Latin America .

The social media post came amid rising controversy over a series of strikes on boats allegedly carrying drugs off the coast of Venezuela, dubbed Operation Southern Spear.

Hegseth is alleged to have ordered a so-called “ double-tap ” strike, a follow-up attack against a debilitated boat that killed survivors clinging to the wreckage for around 45 minutes. The U.S. has carried out 22 strikes since the campaign began in September, killing a total of 87 people .

The Pentagon’s press office declined to comment on the use of the Jerusalem cross, referring questions to SOUTHCOM. But in a reply to the X post on Monday, Hegseth’s deputy press secretary Joel Valdez signaled his approval with emojis of a salute and the American flag. In a statement to the Intercept, SOUTHCOM spokesperson Steven McLoud denied that the post implied any religious or far-right message.

“The graphic you’re referring to was an illustration of service members in a ready posture during Operation SOUTHERN SPEAR,” McLoud told The Intercept. “There is no other communication intent for this image.”

The original image of the masked service member appears to have come from an album published online by the Pentagon that depicts a training exercise by Marines aboard the USS Iwo Jima in the Caribbean Sea in October. The photo depicting the cross, however, was removed from the album after commentators on social media pointed out its origins.

Amanda Saunders, a spokesperson for the Defense Visual Information Distribution Service, the Pentagon-run photo agency, said she was unable to comment directly but forwarded the request to the Marine unit involved in the exercise.

“Content on DVIDS is published and archived directly by the registered units,” she said, “so we don’t have control over what is posted or removed, nor are we able to comment on those decisions.”

Hegseth and the Cross

The Jerusalem cross’s popularity on the right has surged in part thanks to featuring in various media, including the 2005 Ridley Scott film “Kingdom of Heaven” and video games, according to Matthew Gabriele, a professor of medieval studies at Virginia Tech and a scholar of Crusader iconography.

“It supports the rhetoric of ‘defense of homeland.’”

“It supports the rhetoric of ‘defense of homeland,’” Gabriele told The Intercept, “because the crusaders, in the right’s understanding, were waging a defensive war against enemies trying to invade Christian lands.”

The symbol’s position of prominence in official military communications is just the latest example of a trollish extremism by the Trump administration’s press teams, which have made a point of reveling in the cruelty wrought on its perceived enemies at home and abroad, or “owning the libs.”

Monday’s post may also be intended as Hegseth putting his thumb in the eye of the Pentagon’s old guard. Hegseth’s embrace of the symbol — in the form of a gaudy chest tattoo — once stymied, however temporarily, his ambitions in the military.

Following the January 6 insurrection, according to Hegseth and reporting by the Washington Post, Hegseth was ordered to stand down rather than deploy with his National Guard unit ahead of the 2021 inauguration of Joe Biden. The decision to treat Hegseth as a possible “insider threat” came after someone flagged a photo of a shirtless Hegseth to military brass, according to the Washington Post .

“I joined the Army in 2001 because I wanted to serve my country. Extremists attacked us on 9/11, and we went to war,” Hegseth wrote in “The War on Warriors,” his 2024 memoir. “Twenty years later, I was deemed an ‘extremist’ by that very same Army.”

Hegseth was hardly chastened by the episode and has since gotten more tattoos with more overt anti-Muslim resonance , including the Arabic word for “infidel,” which appeared on his bicep sometime in the past several years. It’s accompanied by another bicep tattoo of the Latin words “Deus vult,” or “God wills it,” yet another slogan associated with the Crusades and repurposed by extremist groups.

The use of the image to advertise aggressive posturing in a majority-Christian region like Latin America may seem odd at first glance. In the context of renewed U.S. focus on Latin America, however, it’s a potent symbol of the move of military action from the Middle East to the Western Hemisphere.

“They’re globalizing the Monroe Doctrine.”

The post comes on the heels of the release of Trump’s National Security Strategy , a 33-page document outlining the administration’s foreign-policy priorities that explicitly compared Trump’s stance to the Monroe Doctrine, the turn-of-the-century policy of U.S. dominance in Latin America in opposition to colonialism by other foreign powers. Grandin, the Yale historian, described the document as a “vision of global dominance” based on a model of great-powers competition that can lead to immense instability.

“They’re globalizing the Monroe Doctrine,” Grandin said. “I’m no fan of the hypocrisy and arrogance of the old liberal international order, but there’s something to be said for starting from a first principle of shared interests, which does keep great conflict at bay to some degree.”

Agentic AI Foundation

Hacker News
block.xyz
2025-12-09 20:00:39
Comments...

You Can’t Please a Madman

Daring Fireball
truthsocial.com
2025-12-09 19:58:52
Donald Trump, on his blog: The only reason Marjorie “Traitor” Brown (Green turns Brown under stress!) went BAD is that she was JILTED by the President of the United States (Certainly not the first time she has been jilted!). Too much work, not enough time, and her ideas are, NOW, really BAD — Sh...