The Dog that Caught the Car: Britain's 'World-Leading' Internet
Internet Exchange
internet.exchangepoint.tech
2025-09-11 19:35:47
Billed as a “world-leading” child-protection law, the UK’s Online Safety Act has instead normalized surveillance and ID checks. “Tech policy wonk” Heather Burns writes that the model is spreading across the Atlantic, where politicians see a ready-made tool for censorship and control....
The UK’s Online Safety Act was sold as a “world-leading” child-protection law, one that would establish the UK tech sector as a global safety-tech powerhouse. Instead, it has normalized the idea that governments can bolt identity checks and surveillance layers onto the internet. Now that blueprint is crossing the Atlantic, where authoritarian-minded politicians see it as ready-made kit for censorship and control.
In the
July 17 edition of Internet Exchange
, Audrey Hingle laid out the risks to privacy, equality, and effectiveness raised by the age verification regime in the UK's Online Safety Act (OSA). Many civil society advocates warned the UK government and the Act's proponents that these risks would come to pass. We did so for more years than any of us care to remember, in ways that felt like
deja vu
at best, "Groundhog Day" at worst. In response, we were called big tech shills, tech libertarians, apologists for child exploitation, and enablers of the worst of humanity; some of us, including this author,
were called far worse
.
Safety for Sale
The problem was not that the people at the opposite end of the table did not understand the risks at hand, or, as is commonly assumed, the technology of the internet. The problem was that one of the Act's main policy goals was to create a market, and a marketplace, for the
British safety tech sector
. That includes age verification providers. In the aftermath of Brexit, which drove away tech talent and investment, the UK desperately needed a digital success story. That success story, in the Conservative vision, would come through
expanding the use of British technology for law and order
. Hence lawmakers drafted the OSA to mandate the integration of an age verification wall, as a compliance requirement, for the over 100,000 service providers in scope of the law; and hence the
revolving door
between the online safety regulator, Ofcom, and the age verification software lobby. In this vision,
the Great British Internet stack
would simply have a few extra technical layers: innovations which would keep people safe whilst boosting British industry. Who could possibly have a problem with that? That's right–those pesky civil society technologists. By nagging politicians about fundamental rights and surveillance technology, we weren't just standing in the way of a law promoted as being about child safety; we were failing to "back Britain".
That Conservative vision for the Great British Internet was accompanied at all times with the catchphrase "world-leading". Freed from the clarity of the EU's Digital Single Market strategy, in which the UK went from being part of a trading bloc of half a billion people working under a common set of regulations to being an embittered island (plus a bit) making up its own rules for a market of seventy million, the UK set out to craft its own way forward, one which other nations would surely rush to emulate. Suffice to say that by the fifth year of clause-by-clause contention over the draft Bill, the civil society joke–take a drink every time the UK government refers to the OSA as "world-leading"–stopped being a joke and started to sound like a good idea.
In those days, I warned policymakers that their nationalist surveillance capitalism model might not inspire the "world-leading" reputation they wanted. One technically competent Labour MP, then in opposition, retorted:
"we shouldn't legislate around what other countries might or might not do."
So much for "world-leading." Here was a preview of how Labour would approach the OSA, its dubious Conservative inheritance, once in power: if the UK's "world-leading" model was copied for good, they would claim credit. If the UK's "world-leading" model was copied for bad, they would wash their hands of it. Laws built on magical thinking tend to create more of it.
What neither the Conservatives nor Labour counted on was the second Trump administration.
A World-Leading Model for Digital Racism
In September 2021, in my last miserable day of a miserable job, I wrote
a blog post
in my professional capacity about a meeting I had attended with a group of age verification software vendors. As I wrote, these lobbyists—well funded, well connected, and working at the heart of Parliament—were campaigning to have the then-Online Safety Bill's age verification measures expanded to require
nationality checks
. Imagine having to prove your nationality through a passport or national identity card as part of today's mandatory age verification checks, and you get a sense of how the lobbyists wanted the Great British Internet to work.
The meeting's attendees, all affluent white English elites, worked hard to build a rationale for why nationality checks as a condition for access to information and interaction made perfect sense. The naked opportunism on display as they glibly touted their wares as a means to identify,
segregate, and suppress
was unforgettable. The Online Safety Act provided policy experiences that no amount of training could prepare you for.
Screenshot from a Digital Policy Alliance briefing advising MPs that users should be required to verify their name, identity, and nationality to interact and share information. Source:
Open Rights Group
.
Four years on, that meeting was the first thing I thought of when I learned that a US appeals court judge had penned a legal rationale for
why non-US citizens within the US are not entitled to First Amendment rights
, meaning the US government could censor or surveil online speech by non-citizens. That dissent tested the waters to tell a certain political demographic within the US what they want to hear. And what they want to hear is that segregating and suppressing any given group of internet users based on identity, rather than on account registration, content, or adtech, is both technically possible and easily deployable. That is the "world-leading" promise of the Online Safety Act and the surveillance technology peddlers who love it. If this dystopia continues as it is, and that opinion is elevated from a dissent to a working policy, the technology needed to enable it is ready to go, made in Britain™.
That technology could also be deployed
in the aftermath of
Netchoice v Fitch
, a failed lawsuit over Mississippi's child safety law, which raised the prospect of excluding children (regardless of citizenship) from First Amendment rights. Identifying children, not just for site access, but as a matter of who has constitutional rights, would require official documents. Documents which, conveniently, also note the child's nationality. As you may have noticed, children of
the "wrong" nationality
often find themselves excluded from
America's already limited protections
.
None of this would have been possible without the Online Safety Act, which legitimized the notion of an age and identity layer within the internet stack and jingoistically encouraged the growth of the surveillance technology market designed to enable it. That market now stands ready to enable–and profit from–racist authoritarianism.
Karen Wants to Speak to Wikipedia's Manager
Then there is the case of Wikipedia, whose legal challenge to the Online Safety Act's categorization tiers was recently
dismissed in the High Court
on the technical grounds that Ofcom has
not yet
placed Wikipedia into the Category 1 tier, even though that is only a matter of waiting for the bureaucracy to catch up. As laid out by Ofcom, Wikipedia will fall into the highest tier of compliance, as if it were Facebook, requiring all global contributors of any age, not just British ones, to (amongst other things) have their ages verified to make sure they are neither precious British children, nor precious British children accessing 'harmful content'. Wikipedia was singled out as a target for top-tier compliance obligations
as early as 2019
, due to the fact that there are articles about suicide, self-harm, and eating disorders on it, which
in Conservative minds
meant that Wikipedia was
willingly encouraging
these things. Achieving this compliance requirement will require the insertion of an age verification layer which could
easily be repurposed
to identify individual contributors, many of whom
depend on Wikipedia's anonymity to protect their personal safety
.
A database full of Wikipedia contributors' identities would be manna from heaven to Republicans on the US House Oversight Committee, who have
sent a Karen-esque letter to the Wikimedia Foundation
demanding that they turn over the identities, IP addresses, and activity logs of Wikipedia contributors who have written encyclopaedia articles on Israel–Palestine. The Wikimedia Foundation's legal challenge in the UK courts has delayed the rollout of age checks for Wikipedia contributors, and thank goodness for that. If the age verification layer were already in place, the Committee would already have all of the data they need for the witch-hunt they crave. Thanks to the Online Safety Act, it is only a matter of time until they do.
Congratulations, Britain, You're World-Leading.
These legal and political salvos against privacy and freedom of expression, crafted in Britain and now crossing the Atlantic, have unfolded barely two months into the OSA's full compliance requirements. From here, the requirements will get stricter, the enforcement will get more aggressive, and the erosion of rights will accelerate. Next, Ofcom will mandate the rollout of proactive content detection technology, which introduces
a general monitoring obligation
on the Great British Internet, a move celebrated after Brexit
freed the UK from the EU e-Commerce Directive
, which had prohibited such obligations in domestic law. In the Brexit mentality, it is better to have a bad domestic law than a sensible European one. UK politicians will tout this model, too, as part of the UK's "world-leading" approach.
Whilst the proactive content detection regime currently covers only CSAM and terrorist content, that is because the aforementioned civil society troublemakers fought tooth and nail to limit it to just that. As
originally written, the draft Bill
included
subjectively harmful but legal
content within that scope and allowed the Secretary of State for Digital, a political appointee, to define any content to be brought within that scope for explicitly political reasons. You can rest assured that authoritarians were watching that debate closely, and have learned from it. They will be smarter about it when it is their turn. Among them are American policymakers who
have taken a broad view on who counts as a "terrorist."
Here is the point: nationalist surveillance capitalism in the UK and beyond is happening in parallel to America's slide into
digital authoritarianism
. America is providing the political and legal scenarios. But it is the UK, its surveillance technology market developed to boost Brexit Britain, and the legal framework crafted around that market's rent-seeking, which has normalized the idea that stacking
multiple interception layers
onto the internet is simple, patriotic, and lucrative.
In response, Keir Starmer's now-former Secretary of State for Digital, a nice guy stuck with selling the mouldy sandwich that Theresa May made, went on telly to declare that people who oppose the Act are "on the side of paedophiles". (Oi mate, some of us were being called that
long before it was trendy
.) With advocates like that, who needs the OSA's critics?
It would seem that the UK is the dog that caught the car. After years of touting its "world-leading" (drink!) internet regulation model that would "take back control from Europe" (drink!) to "clean up the internet" (drink!), "bring social media to heel" (drink!), "rein in the tech giants" (drink!), and make Britain the "safest place in the world to be online" (drink!) by "tackling online harms through technical innovation" (drink!), Brexit Britain has got everything it ever wanted.
And it has no idea what to do now.
Mallory at UN Digital Cooperation Day
IX’s Mallory Knodel will be participating at the
UN Digital Cooperation Day
on September 22 in New York during the UN General Assembly High-Level Week. She joins the 14:30–15:30 ET session,
Shaping Responsible AI Use Through International Standards and Cooperation
, organized with ISO and IEC.
Marking the first anniversary of the Global Digital Compact, the discussion will focus on implementation: how international standards and capacity building can enable trustworthy, interoperable AI. Speakers will elevate Global South priorities, share practical country approaches, and build momentum toward the International AI Standards Summit 2025 in Seoul.
Support the Internet Exchange
If you find our emails useful, consider becoming a paid subscriber! You'll get access to our members-only Signal community where we share ideas, discuss upcoming topics, and exchange links. Paid subscribers can also leave comments on posts and enjoy a warm, fuzzy feeling.
Not ready for a long-term commitment? You can always
leave us a tip
.
A joint statement from members of the ActivityPub and AT Protocol communities urges collaboration instead of competition, rejecting the “winner-takes-all” framing and emphasizing that both protocols can coexist and strengthen the open social web together.
https://hachyderm.io/@thisismissem/115157586644221109
A new report from the Institute for Data, Democracy & Politics exposes how the booming social media monitoring industry relies on opaque access deals and data scraping, serving commercial interests while leaving public research needs unmet.
https://iddp.gwu.edu/dashboard-data-acquisition
The Trump administration’s changes to the $42.5B BEAD program are sidelining rural fiber in favor of costly satellite subsidies, repeating past broadband policy failures, warns Christopher Mitchell, Director of the Community Broadband Networks Initiative at the Institute for Local Self-Reliance.
https://communitynetworks.org/users/christopher
You can now download slides, photos, and other resources from the Global Digital Compact event on the Global Digital Collaboration site.
https://globaldigitalcollaboration.org
A new IRTF mailing list, ARMOR, brings together experts to tackle traffic tampering and shape research on resilient connectivity. Subscribe if you want focused, technical discussion on real-world countermeasures and a chance to influence an emerging IRTF research agenda on resilient connectivity.
https://mailman3.irtf.org/mailman3/lists/armor@irtf.org
Digital Rights
Amnesty International, together with InterSecLab, Paper Trail Media, and partners including Der Standard, Follow The Money, Globe and Mail, Justice For Myanmar, and the Tor Project, has released findings from a year-long investigation called the Great Firewall Export. Reports from multiple organizations examine how Geedge Networks exported China’s Great Firewall technology to authoritarian regimes and the global systems that make digital repression possible.
From Amnesty International, how a range of private companies from around the world have provided, and in some cases continue to provide, surveillance and censorship technologies to Pakistan, despite Pakistan’s troubling record on the protection of rights online.
https://www.amnesty.org/en/documents/asa33/0206/2025/en
The
Silk Road of Surveillance
report exposes the significant collaboration between the illegal Myanmar military junta and Geedge Networks in implementing a commercial version of China’s "Great Firewall", giving the junta unprecedented capabilities to track down, arrest, torture and kill civilians.
https://www.justiceformyanmar.org/stories/silk-road-of-surveillance
InterSecLab’s Internet Coup report reveals how leaked files show Geedge Networks is exporting China’s Great Firewall technologies to governments in Asia and Africa, fueling the rise of digital authoritarianism.
https://interseclab.org/research/the-internet-coup
Nepal announced it will block access to major platforms including Facebook, X, YouTube, and Instagram after they failed to register locally as required by law, a move the government says targets online hate and cybercrime but digital rights advocates warn undermines fundamental freedoms.
https://www.aljazeera.com/news/2025/9/4/nepal-moves-to-block-facebook-x-youtube-and-others
ARTICLE 19’s new report warns that market-first connectivity strategies deepen digital inequality and calls for a rights-based approach to reclaim the internet for all.
https://www.article19.org/resources/the-missing-link
This study of 194 UK tech policy documents shows that while “stakeholder” is invoked to suggest broad inclusion, in practice the voices shaping policy remain narrow, with significant gaps between rhetoric and representation; using queer performativity theory, the analysis reveals how the term constructs roles, hierarchies, and exclusions in UK tech policy.
https://policyreview.info/articles/analysis/stakeholders-uk-tech-policy
In AI & SOCIETY, Samuel O. Carter and John G. Dale propose a framework of algorithmic political capitalism to tackle bias in algorithms by linking power, policy, and technology to democratic accountability.
https://link.springer.com/article/10.1007/s00146-025-02540-2
MediaJustice has released “The People Say No: Resisting Data Centers in the South", the first regional analysis of how Big Tech’s $100B+ data center boom is harming Southern communities economically and environmentally. See “upcoming events” below for a link to join their live launch.
https://mediajustice.org/resource/the-people-say-no-report
Christianity is experiencing a resurgence in Silicon Valley, where faith groups like the Acts 17 Collective are drawing tech workers, entrepreneurs, and investors back to religion.
https://www.telegraph.co.uk/gift/8b667ea9587e419a
Catherine Karimi Gichunge, part of M-Pesa’s original team, discusses how Africa's most iconic fintech was built through years of piloting, on-the-ground hustle, and financial innovation on the F-Squared Podcast.
https://www.youtube.com/watch?v=tuKOuE8z2m8
617 scientists from 35 countries also warn that the EU’s revamped CSAM plan still undermines end-to-end encryption and won’t work at scale, urging rights-preserving child protection instead.
https://csa-scientist-open-letter.org/Sep2025
This is a template for a fuzzer for
parking-game
puzzles.
While originally built as a homework assignment, this is released publicly for folks looking to learn how to implement
their own components in
LibAFL
.
Purpose
This repo serves as a set of examples for implementing various components of LibAFL yourself, as well as working through some of
the logic you may need to consider in your own applications.
This targets a puzzle game, for which fuzzing is almost certainly a poor application, but the game highlights certain
design patterns that crop up in the testing of real programs.
Notably, this repo teaches you how to:
Build custom executors (including integration with custom observer types)
Build custom observers
Build custom feedbacks which inspect the results of those observers and interact with user statistics
Build custom input types
Build custom mutators for those input types
Basic snapshot fuzzing strategies and mutators specialized for them
Obviously, since we are working with a game, the components are somewhat contrived; in real applications, you will face
other difficulties specific to your target.
This is just here to help people get their bearings with the design of LibAFL and why we implement things the way we
do; the challenges are there to give a sense of how to implement new components.
Anti-Purpose
This repo is
not
designed to teach you how to fuzz real programs.
The execution and feedback mechanisms here are deeply contrived and do not represent optimal strategies.
This strictly serves as a way to think about building your own LibAFL components, not fuzzing as a whole.
Intended Audience
This is intended for practitioners who are at least already aware of fuzzing, have a basic knowledge of Rust, and are
looking to build custom fuzzing components with LibAFL.
For folks wanting to learn fuzzing strategies for specific applications, consider reading
the
Fuzzing Book
or trying to apply existing LibAFL executors.
License
The code within this repository is licensed under CC0.
Reuse how you wish, but please give credit back where possible.
The maps provided in the
maps
directory are ported
from
Jim Storer's personal page
.
I could not find an associated license.
Redistribute at your own risk.
The maps are distributed here as educational materials for non-commercial use.
Exercises
These exercises center around solving puzzles from Tokyo Parking (now licensed as Thinkfun's Rush Hour).
Watch
the promotional video on Rush Hour
to get a sense of how to solve
these puzzles, and maybe try solving them yourself.
They aren't so simple!
The
parking-game
crate implements the rules of movement for these
puzzles.
The implementation ensures that the board stays in a "sane" state by applying modifications to a view over the state.
A loader for human-readable puzzles is provided for you in the fuzzer template so that you can load the original Tokyo
Parking board layouts into the fuzzer by providing their path as command-line arguments.
These exercises will walk you through the stages of building a fuzzer for a new target with LibAFL.
They start out easy and guided, but get harder and with less direction over time.
First, you will implement the components for fuzzing these puzzles; tests are provided to ensure that your
implementations are reasonably sound, but are not extensive.
Then, you will link together the fuzzer logic in
main.rs
to fuzz these puzzles.
Already, this simple implementation will be able to solve most puzzles reasonably quickly, but you will develop extra
feedback mechanisms that show that there are still opportunities for improved performance.
In the last remaining parts, we explore deeper optimizations that have real-world correspondences to optimizations in
certain targets.
First, you will improve your mutator to avoid "wasting" executions on invalid inputs.
With this implemented, a new optimization becomes available: resumed execution, or snapshot fuzzing.
The final exercise deals with the optimization of snapshot fuzzing, and discusses real-world correspondences.
Exercise 0: Components
Before completing this exercise, ensure that your environment is correctly configured by running
cargo test
.
You should see that the following tests fail:
executor::test::simple_run_check
feedbacks::test::example_observation
feedbacks::test::simple_solved
feedbacks::test::simple_unsolved
observers::test::distinguish_states
observers::test::simple_observation
These are the tests for the components that you are about to implement.
Tasks for this section are denoted as comments in the code as
TODO(pt.0)
or as
todo!("(pt.0) ...")
macros.
Make sure to review the other comments to get a better sense of what each part of the boilerplate does.
If you ever get lost with what each component does, you can run
cargo doc --open
to get an overview of each.
These may be implemented in any order, as the components are tested individually.
Nevertheless, I recommend implementing the components in the order listed below.
Executor
The
executor
component in LibAFL represents
exactly what it sounds like: the execution of the input in the target.
In our case, we need a custom executor to "run" the input (i.e., apply the sequence of moves) on the puzzle.
Your first task is to implement the executor in
executor.rs
by simply applying the moves in-order on
the board.
The corresponding boilerplate is provided for you.
Once implemented successfully, the test
executor::test::simple_run_check
should pass.
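As a rough, library-agnostic sketch of that loop (the Board, Move, and outcome types below are stand-ins, not the actual parking-game or LibAFL types; the real work happens in your Executor impl):

```rust
// Sketch only: apply moves in order until the input is exhausted or a move
// is rejected. The real executor delegates move application to parking-game.
#[derive(Clone)]
struct Board; // placeholder for the puzzle state

#[derive(Clone, Copy)]
struct Move; // placeholder for a single car move

enum RunOutcome {
    Ok,      // all moves applied cleanly
    Invalid, // an invalid move was attempted (treated like a "crash")
}

impl Board {
    fn apply(&mut self, _m: Move) -> Result<(), ()> {
        // The real implementation rejects moves that would leave the board
        // in an insane state; this stand-in always succeeds.
        Ok(())
    }
}

fn run_input(board: &mut Board, moves: &[Move]) -> RunOutcome {
    for &m in moves {
        if board.apply(m).is_err() {
            // Stop at the first invalid move; observers still get to inspect
            // whatever state was reached.
            return RunOutcome::Invalid;
        }
    }
    RunOutcome::Ok
}

fn main() {
    let mut board = Board;
    match run_input(&mut board, &[Move, Move]) {
        RunOutcome::Ok => println!("all moves applied"),
        RunOutcome::Invalid => println!("hit an invalid move"),
    }
}
```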
Guidance
An
observer
in LibAFL serves to collect
information about an execution to be processed later by
a
feedback
.
The purpose of these observers can be anything to augment the fuzzer's progress, but we'll start by investigating the
use of observers for
guiding
the fuzzer -- in this case, getting closer to a puzzle solution.
Guidance for mutational fuzzers, like the one we're building here, effectively boils down to identifying inputs which
are "interesting" and therefore should be retained for further mutation.
The core idea of this strategy is that
novelty
is a strong indicator for where other new behavior might be observed.
Here, since we don't have a good idea of what "getting closer" to a puzzle solution looks like, we'll start by simply
saying that
any
previously unseen puzzle state is interesting.
The easiest way to do that is simply to check if the hash of the puzzle has been observed -- meaning we treat any new
state as interesting.
There might be better guidance mechanisms, but for now, we'll rely on this.
LibAFL provides existing utilities for measuring if a previous hash has been observed with
the
ObserverWithHashField
trait.
For this step, complete the ObserverWithHashField implementation for the FinalStateObserver located
in
observers.rs
.
Avoid hashing data which is redundant (e.g., the cars' lengths never change).
Once implemented successfully, the test
observers::test::distinguish_states
should pass.
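If it helps, here is a library-agnostic sketch of the novelty bookkeeping that this observer/feedback pairing boils down to. BoardState is a stand-in type, and the real implementation goes through LibAFL's ObserverWithHashField rather than a bare HashSet:

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Stand-in for the puzzle state; only data that can change between
// executions (car positions) should feed the hash -- lengths are constant.
#[derive(Hash)]
struct BoardState {
    car_positions: Vec<(u8, u8)>,
}

fn state_hash(state: &BoardState) -> u64 {
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    state.hash(&mut hasher);
    hasher.finish()
}

/// Novelty check: an input is "interesting" iff it reached a state hash we
/// have never recorded before.
fn is_interesting(seen: &mut HashSet<u64>, state: &BoardState) -> bool {
    seen.insert(state_hash(state))
}

fn main() {
    let mut seen = HashSet::new();
    let a = BoardState { car_positions: vec![(0, 0), (2, 3)] };
    assert!(is_interesting(&mut seen, &a));  // first time: novel
    assert!(!is_interesting(&mut seen, &a)); // second time: not novel
}
```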
Objective
Some observers and feedbacks are used for determining if an
objective
has been reached rather than for guidance.
An objective in classical fuzzing is simply a crash, but can be anything that represents what we want to find.
In this case, our objective is to be able to drive the "objective car" out of the board.
In this task, you need to implement ViewObserver -- an observer which measures what each car can see.
We will use this for our objective by determining if the objective car can drive out of the board (i.e., it can see the
edge of the board ahead of it).
The corresponding SolvedFeedback is already implemented for you; all you need to do is implement the view functionality.
Once implemented correctly, the
observers::test::simple_observation
test and all
feedbacks
tests should pass.
Be sure to review the feedbacks tests to understand how the observers and feedbacks interact.
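For intuition, a simplified sketch of the exit check described above; the grid representation here is a stand-in, and the real observer works on the parking-game types:

```rust
// Sketch: the objective car can exit if nothing blocks the cells between
// its front and the edge of the board (here: a horizontal car exiting right).
struct Car {
    row: usize,
    col: usize, // leftmost cell of a horizontal car
    len: usize,
}

/// `occupied[row][col]` is true if some car covers that cell.
fn objective_car_can_exit(car: &Car, occupied: &[Vec<bool>]) -> bool {
    let width = occupied[car.row].len();
    // Every cell ahead of the car, up to the right-hand edge, must be free.
    ((car.col + car.len)..width).all(|c| !occupied[car.row][c])
}

fn main() {
    let car = Car { row: 2, col: 0, len: 2 };
    let mut grid = vec![vec![false; 6]; 6];
    assert!(objective_car_can_exit(&car, &grid)); // clear path to the edge
    grid[2][4] = true;                            // park a blocker ahead
    assert!(!objective_car_can_exit(&car, &grid));
}
```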
Mutator
Without trying different inputs, the fuzzer can't make any progress.
To start simple, let's make a mutator that knows nothing about the current state of the board and just moves a random
car in a random direction at a random time.
There are no tests for this, but if your implementation is incorrect, the next exercise will not be achievable.
To implement this mutator, follow the steps provided in the TODO in PGRandMutator.
The exact implementation is not important, but you should ensure that any index can be selected (including the end!) and
that any direction can be selected.
You'll need to interact with the input, so make sure to review the documentation on how to do so.
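One possible reading of this, sketched without LibAFL types: insert a randomly chosen move at a randomly chosen index. In the real mutator you would draw randomness from the fuzzer state's RNG rather than the toy generator below.

```rust
// Sketch of the naive mutator's logic: random car, random direction, random
// position (including the end, i.e. an append).
#[derive(Clone, Copy, Debug)]
struct Move {
    car: u8,
    dir: u8, // 0..4: up, down, left, right
}

struct TinyRng(u64);
impl TinyRng {
    fn next(&mut self, bound: u64) -> u64 {
        // Not a good RNG -- just enough to keep the sketch self-contained.
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1);
        (self.0 >> 33) % bound
    }
}

fn mutate(moves: &mut Vec<Move>, num_cars: u8, rng: &mut TinyRng) {
    // Any index may be chosen, *including* the end.
    let idx = rng.next(moves.len() as u64 + 1) as usize;
    let mv = Move {
        car: rng.next(num_cars as u64) as u8,
        dir: rng.next(4) as u8,
    };
    moves.insert(idx, mv);
}

fn main() {
    let mut rng = TinyRng(42);
    let mut input = vec![];
    for _ in 0..3 {
        mutate(&mut input, 8, &mut rng);
    }
    println!("{:?}", input);
}
```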
Exercise 1: Basic Fuzzer
Your next exercise is to link together all the components that you've just implemented.
Complete all of the
TODO(pt.1)
items presented in
main.rs
.
If you ever get lost, review the test code; many TODOs are completed within the tests from part 0.
Once this is done, go ahead and run the fuzzer with
cargo run -- maps/tokyo1.map
.
If implemented correctly, the fuzzer should complete within a few seconds.
Measuring effectiveness
Unfortunately, our fuzzer is not very fast at solving harder puzzles.
Try running
cargo run -- maps/tokyo36.map
.
For a 6x6 puzzle, we barely make progress.
Can we do better?
We need to evaluate what about our fuzzer is currently holding us back.
To start, think about how our components interact: our executor applies the moves until either the moves are exhausted
or an error occurs (i.e., an invalid move is attempted).
If your executor is implemented efficiently, you should be observing upwards of 250,000 executions per second on modern
hardware.
How much execution time is being wasted on invalid inputs as a result of our oversimplified mutator?
To measure this, let's make a feedback which measures the rate of erroneous inputs.
Go to
feedbacks.rs
and implement CrashRateFeedback based on the TODO(pt.1) comments.
Some boilerplate is provided, but for this one, you're mostly on your own.
Make sure to review other implementations and search the documentation of LibAFL as needed.
Note particularly here that any mutable data is stored within metadata.
While not relevant for our single-threaded fuzzer that never has to restore from disk, any data which is not contained
within the state (i.e., within the metadata), will be lost whenever we recover from a crash on actual targets.
It is therefore good practice to always put the mutable data within metadata, as outlined in the template.
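A minimal sketch of that bookkeeping, with the metadata modelled as a plain struct rather than an entry in LibAFL's state metadata map:

```rust
// Sketch: the counters a crash-rate feedback needs. In the real fuzzer this
// struct would be registered as state metadata so it survives restarts.
#[derive(Default)]
struct CrashRateMetadata {
    executions: u64,
    crashes: u64,
}

impl CrashRateMetadata {
    fn record(&mut self, was_crash: bool) {
        self.executions += 1;
        if was_crash {
            self.crashes += 1;
        }
    }

    fn rate(&self) -> f64 {
        if self.executions == 0 {
            0.0
        } else {
            self.crashes as f64 / self.executions as f64
        }
    }
}

fn main() {
    let mut meta = CrashRateMetadata::default();
    for i in 0..10 {
        meta.record(i % 4 == 0); // pretend every 4th input was invalid
    }
    println!("crash rate: {:.0}%", meta.rate() * 100.0);
}
```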
Documenting limitations
Reflect on what about our mutator could be causing so many failures.
Add a comment on PGRandMutator in
mutators.rs
as to why it underperforms; there are at least two
major contributing factors.
Exercise 2: Avoiding Wasted Executions
Most of our execution waste is coming from executing inputs which have no hope of succeeding.
Let's fix that by making our mutations smarter, at the cost of a bit more complexity.
Context enriching feedbacks
Some feedbacks just collect metadata for later use.
ViewFeedback associates metadata to individual testcases which tell us how many cars the moves can make at each point.
Start by including this in your feedbacks in
main.rs
.
This will not have any effect at the start.
Smart(er) mutators
Since we now have metadata for each testcase that tells us the number of moves that each car can make and in which
directions, we can build a mutator that takes advantage of this information.
Implement PGTailMutator based on the
TODO(pt.2)
comments in
mutators.rs
.
This is likely the most difficult task so far; take your time and review what you've already done to complete this.
Feel free to ask
questions
in the discussions
if you
get really stuck, but avoid giving any spoilers.
Once this is done, replace your mutator in
main.rs
.
Your crash rate should now be 0% for all maps.
Exercise 3: Snapshot Fuzzing
Review your implementation of executor.
Our target is very fast -- several hundred thousand executions per second -- but this is not representative of real
targets.
Let's slow things down a bit.
In
executor.rs
, use the commented
sleep
call to insert an artificial delay of 1µs per move.
By increasing the cost of the moves (which is more consistent with a real target), we see that our executions slow down
over time as we have more moves per testcase.
But, since we only mutate the tail of the input now, we are effectively wasting execution time on the common prefix.
Can we resume from the last state that we executed?
To do this, add the FinalStateFeedback to your feedbacks in
main.rs
.
This will save metadata that stashes the final state after an execution is completed.
Implement this functionality in FinalStateFeedback by addressing the
TODO(pt.3)
s in
observers.rs
.
Then, all we need to do is load that state instead of the initial state in our executor.
Load the snapshot from testcase metadata in your executor following the TODOs outlined
in
executor.rs
.
Ensure that the moves that are executed after the snapshot are only those which were not already executed.
After implementing this, you should notice that your performance once again shoots up -- probably by a few times.
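For intuition, here is a library-agnostic sketch of resumed execution: restore the stashed state and replay only the tail of the input. The types are stand-ins, not the real observer/executor plumbing.

```rust
// Sketch: a snapshot records the board state and how many moves it already
// accounts for; execution resumes from there and applies only the suffix.
#[derive(Clone)]
struct Board {
    applied: usize, // placeholder field standing in for the real board state
}

#[derive(Clone, Copy)]
struct Move;

struct Snapshot {
    board: Board,
    moves_done: usize, // how many moves the snapshot already covers
}

fn run_resumed(snapshot: &Snapshot, all_moves: &[Move]) -> Board {
    let mut board = snapshot.board.clone();
    // The common prefix is already baked into the snapshot; only the tail
    // needs to be executed.
    for _m in &all_moves[snapshot.moves_done..] {
        board.applied += 1; // the real code would call board.apply(m)
    }
    board
}

fn main() {
    let snap = Snapshot { board: Board { applied: 5 }, moves_done: 5 };
    let moves = vec![Move; 8];
    let board = run_resumed(&snap, &moves);
    println!("total moves applied: {}", board.applied); // 5 + 3
}
```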
Investigating the speedup
Remove the artificial delay from the previous section.
You'll notice that the speedup is significantly less than previously observed -- to be expected, since we artificially
inflated the execution time of moves.
Keep this in mind for situations where snapshots are more expensive than simply resetting and running the whole input
again.
Moreover, we could only apply this optimization because of the unusual input scenario we are in: the input is processed
one part at a time.
This does happen in practice (e.g., with asynchronous embedded targets awaiting input from peripherals), but is
generally rare.
Snapshot fuzzing is not always the answer.
Exercise 4: Endgame
Recall the implementation of PGTailMutator from earlier.
Notice how we recompute the possible mutations every time.
The valid mutations at any given state are not only finite -- they are few, and they are computed exactly by
PGTailMutator.
We can exhaust the whole mutation space at once and avoid redundant re-execution of mutations we've already observed.
To do this, we will now implement a stage to replace the mutation stage that exhausts the input space without
redundancy.
To do so, we will loosely reimplement the logic from PGTailMutator, but instead of choosing randomly we will only ever take
one step in any direction, and we won't execute with our old executor anymore.
This is the most difficult task; take your time, and remember you can ask for
help
in the discussions
.
Follow the
TODO(pt.4)
sections from
stages.rs
to complete this task.
Once completed, you can then replace the mutation stage in
main.rs
; you may need to reorder some
statements to keep the borrow checker happy!
Exercise 5: Reflect
Note that in part 4, we removed all randomness from the search.
Is this still a fuzzer?
Most people who build fuzzers would probably say no; there's no random element to the search!
But, when did this stop being a fuzzer, exactly?
I leave this last exercise to the reader.
Nevertheless, there is a lesson here: fuzzing only makes sense when there is a clear guidance mechanism and when there
is a need for randomness due to the enormity of the search space.
Fuzzing is, fundamentally, just a randomized search process guided by (in most cases) novelty.
So, when you're designing your fuzzers going forward, take the optimizations as they make themselves known to you -- but
don't lose your exploration ability along the way.
Addendum 1: Using existing LibAFL input types
If you are building a fuzzer expecting composite inputs outside of this exercise, you may want to take a look at
LibAFL's
ListInput
. It provides
additional functionality around interacting with and mutating list-like inputs.
This discussion
explores how this could be used in
the context of this project and provides further details on its features.
Addendum 2: Klotski puzzles
The parking game is secretly a specific class of
"Klotski puzzles"
, and you
can apply the same strategies above to these puzzles as well.
A great video which helps with visualising the state space of these puzzles was uploaded just a little after the
original publication of this repository, and
I highly recommend you check it out
!
The previous article on
Bazel remote caching
concluded that using
just
a remote cache for Bazel builds was suboptimal due to limitations in what can and cannot be cached for security reasons. The reason behind the restrictions was that it is impossible to safely reuse a cache across users. Or is it?
In this article, we’ll see how leveraging remote execution in conjunction with a remote cache opens the door to safely sharing the cache across users. The reason is that remote execution provides a trusted execution environment for actions, which is what makes cross-user result sharing safe. Let’s see why and how.
When we configure remote execution via the
--remote_executor
flag, Bazel enables the
remote
action
execution strategy
by default for all actions, just as if we had done
--strategy=remote
. But this is only a default and users can mix-and-match remote and local strategies by leveraging the various
--strategy*
selection flags or by specifying execution requirements in individual actions.
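For example, a .bazelrc along these lines mixes the two; the endpoint and mnemonic here are placeholders for illustration, not values taken from this article:

```
# Default to remote execution for everything...
build --remote_executor=grpcs://remote.example.com
# ...but force genrules back to local sandboxed execution.
build --strategy=Genrule=sandboxed
```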
A remote execution system is complicated as it is typically implemented by many services:
Multiple frontends.
These are responsible for accepting user requests and tracking results. They might also implement a second-level CAS to fan out traffic to clients.
A scheduler.
This is responsible for enqueuing action requests and distributing them to workers. Whether the scheduler uses a pull or push model to distribute work is implementation dependent.
Multiple workers.
These are responsible for action execution and are organized in pools of distinct types (workers for x86, workers for arm64, etc.) Internally, a worker is divided into two conceptual parts: the
worker
itself, which is the privileged service that monitors action execution, and the
runner
, which is a containerized process that actually runs the untrusted action code.
The components of a remote cache (a CAS and an AC).
The CAS is essential for communication between Bazel and the workers. The AC, which is optional, is necessary for action caching. The architecture of the cache varies from service to service.
For the purposes of this article, I want to focus primarily on the workers and their interactions with the AC and the CAS. I’m not going to talk about frontends nor schedulers except for showing how they help isolate remote action execution from the Bazel process.
Let’s look at the interaction between these components in more detail. To set the stage, take a look at the
combine
action from this sample build file:
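The original snippet is not reproduced here, but a minimal BUILD sketch consistent with the description (a checked-in src.txt plus a gen.txt produced by //:generate) could look like this:

```python
# BUILD (sketch): //:generate produces gen.txt; //:combine consumes the
# checked-in src.txt together with the generated file.
genrule(
    name = "generate",
    outs = ["gen.txt"],
    cmd = "echo 'generated content' > $@",
)

genrule(
    name = "combine",
    srcs = [
        "src.txt",
        ":generate",
    ],
    outs = ["combined.txt"],
    cmd = "cat $(SRCS) > $@",
)
```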
The
combine
action has two types of inputs: a checked-in source file,
src.txt
, and a file generated during the build,
gen.txt
. This distinction is interesting because the way these files end up in the CAS is different: Bazel is the one responsible for uploading
src.txt
into the CAS, but
gen.txt
is uploaded by the worker upon action completion.
When we ask Bazel to build
//:combine
remotely, and assuming
//:generate
has already been built and cached at some point in the past, we’ll experience something like this:
That’s a lot of interactions, right?! Yes; yes they are. A remote execution system is not simple and it’s not always an obvious win: coordinating all of these networked components is costly. The overheads become tangible when dealing with short-lived actions—a better fit for persistent workers—or when you have a sequential chain of actions—a good fit for
the dynamic execution strategy
.
What I want you to notice here, because it’s critical for our analysis, is the shaded area. Note how all interactions within this area are driven by the remote execution service,
not
Bazel. Once an action enters the remote execution system, neither Bazel
nor the machine running Bazel
have any way of tampering with the execution of the remote action. They cannot influence the action’s behavior, and they cannot interfere with the way it saves its outputs into the AC and the CAS.
And this decoupling, my friend, is the key insight that allows Bazel to safely share the results of actions across users no matter who initiated them. However, the devil lies in the implementation details.
Given the above, we now know that remote workers are a trusted environment: the actions that go into a worker are fully specified by their action key and, therefore, whatever they produce and is stored into the AC and the CAS will match that action key. So if we trust the inputs to the action, we can trust its outputs, and we can do this retroactively… right?
Well, not so fast. For this to be true, actions must be deterministic, and they aren’t always
as we already saw
. Some sources of non-determinism are “OK” in this context though, like timestamps, because these come from within the worker and cannot be tampered with. Other sources of non-determinism are problematic though, like this one:
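The article's snippet is not shown here; the kind of rule being described is an action that reaches out to the network at build time, e.g. this hypothetical genrule (target name and URL invented for illustration):

```python
genrule(
    name = "fetch_blob",
    outs = ["blob.bin"],
    # The output depends on whatever the server happens to return today.
    cmd = "curl -sSL https://example.com/blob.bin -o $@",
)
```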
An attacker could compromise the network request to modify the content of the downloaded file, but only for long enough to poison the remote cache with a malicious artifact. Once poisoned, they could restore the remote file to its original content and it would be very difficult to notice that the entry in the remote cache did not match the intent of this rule.
It is tempting to say: “ah, the above should be fixed by ensuring the checksum of the download is valid”, like this:
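Again as a hypothetical sketch, with a placeholder digest:

```python
genrule(
    name = "fetch_blob",
    outs = ["blob.bin"],
    cmd = """
        curl -sSL https://example.com/blob.bin -o $@ &&
        echo "<expected-sha256>  $@" | sha256sum -c -
    """,
)
```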
And I’d say, yes, you absolutely need to do checksum validation because there are legitimate cases where you’ll find yourself writing code like this… in repo rules. Unfortunately, such checks are still insufficient for safe remote execution because, remember: actions can run from unreviewed code, or the code that runs them can be merged into the tree after a careless review (which is more common than you think). Consequently, the only thing you can and must do here is to
disable network access
in the remote worker.
That said,
just
disabling network access may still be “not good enough” to have confidence in the safety of remote execution. A remote execution system is trying to run untrusted code within a safe production environment: code that could try to attack the worker to escape whatever sandbox/container you have deployed, code that could try to influence other actions running on the same machine, or code that could exfiltrate secrets present in the environment. Securing these is going to come down to standard practices for untrusted code execution, none of which are specific to Bazel, so I’m not going to cover them. Needless to say, it’s a difficult problem.
If we have done all of the above, we now have a remote execution system that we can trust to run actions in a secure manner and to store their results in
both
the AC and the CAS. But… this, on its own, is still insufficient to secure builds end-to-end, and we would like to have trusted end-to-end builds to establish a chain of trust between sources and production artifacts, right?
To secure a build, we must protect the AC and restrict writes to it so that they come exclusively from the remote workers. Only the workers, which we have determined cannot be interfered with, know that the results of an action correspond to its declared inputs—and therefore, only they can establish the critical links between an AC entry and one or more files in the CAS. You’d imagine that simply setting
--noremote_upload_local_results
would be enough, but it isn’t. A malicious user could still tamper with this flag in transient CI runs or… well, in their local workstation. And it’s because of this latter scenario that the only possible way to close this gap is via network level ACLs: the AC should only be writable from within the remote execution cluster.
But… you guessed it: that’s
still
insufficient. Even if we disallow Bazel clients from writing to the AC, an attacker can still make Bazel run malicious actions
outside
of the remote execution cluster—that is, on the CI machine locally, which does have network access. Such action wouldn’t record its result in the AC, but the
output
of the action would go into the CAS, and this problematic action could then be consumed by a subsequent action as an input.
The problem here stems from users being able to bypass remote execution by tweaking
--strategy*
flags. One option to protect against this situation is the same as we saw before: disallow CI runs of PRs that modify Bazel flags so that users cannot “escape” remote execution. Unfortunately, this doesn’t have great ergonomics because users often need to change the
.bazelrc
file as part of routine operation.
Bazel’s answer to this problem is the widely-unknown
invocation policy
feature. I say unknown because I do not see it documented in the output of
bazel help
and I cannot find any details about it whatsoever online—yet I know of its existence from my time at Google and I see its implementation in the Bazel code base, so we can reverse-engineer how it works.
As the name implies, an
invocation policy
is a mechanism to enforce specific command-line flag settings during a build or test with the goal of ensuring that conventions and security policies are consistently applied. The policy does so by defining rules to set, override, or restrict the values of flags, such as
--strategy
.
The policy is defined using the
InvocationPolicy
protobuf message defined in
src/main/protobuf/invocation_policy.proto
. This message contains a list of
FlagPolicy
messages, each of which defines a rule for a specific flag. The possible rules, which can be applied conditionally on the Bazel command being executed, are:
SetValue
: Sets a flag to a specific value. You can control whether the user can override this value. This is useful for enforcing best practices or build-time configurations.
UseDefault
: Forces a flag to its default value, effectively preventing the user from setting it.
DisallowValues
: Prohibits the use of certain values for a flag. If a user attempts to use a disallowed value, Bazel will produce an error. You can also specify a replacement value to be used instead of the disallowed one.
AllowValues
: Restricts a flag to a specific set of allowed values. Any other value will be rejected.
To use an invocation policy, you have to define the policy as an instance of the
InvocationPolicy
message in text or base64-encoded binary protobuf format and pass the payload to Bazel using the
--invocation_policy
flag in a way that users cannot influence (e.g. directly from your CI infrastructure, not from workflow scripts checked into the repo).
Let’s say you want to enforce a policy where the
--genrule_strategy
flag is always set to
remote
when running the
bazel test
command, and you want to prevent users from overriding this setting. We define the following policy in a
policy.txt
file:
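As a sketch of what that file might contain, using the field names of InvocationPolicy as I understand them (treat this as an assumption and verify against src/main/protobuf/invocation_policy.proto before relying on it):

```
# policy.txt (text protobuf sketch): pin --genrule_strategy to "remote"
# for the test command.
flag_policies {
  flag_name: "genrule_strategy"
  commands: "test"
  set_value {
    flag_value: "remote"
  }
}
```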
If you now try to play with the
--genrule_strategy
flag, you’ll notice that any overrides you provide don’t work. (Bazel 9 will offer a new
FINAL_VALUE_THROW_ON_OVERRIDE
flag behavior to error out instead of silently ignoring overrides, which will make the experience nicer in this case.)
Before concluding, I’d like to show you an interesting outage we faced due to Bazel being allowed to write AC entries from a trusted CI environment. The problem we saw was that, at some point, users started reporting that their builds were completely broken: somehow, the build of our custom singlejar helper tool, a C++ binary that’s commonly used in Java builds, started failing due to the inability of the C++ compiler to find some header files.
This didn’t make any sense. If we built the tree at a previous point in time, the problem didn’t surface. And as we discovered later, if we disabled remote caching on a current commit the problem didn’t appear either. Through a series of steps, we found that singlejar’s build from scratch would fail if we tried to build it locally
without
the sandbox. But… that’s not something we do routinely, so how did this kind of breakage leak into the AC?
The problem stemmed from our use of
--remote_local_fallback
, a flag we had enabled long ago to mitigate flakiness when leveraging remote execution. Because of this flag, we had hit this problematic path:
A build started on CI. This build used a remote-only configuration, forcing all actions to run on the remote cluster.
Bazel ran actions remotely for a while, but at some point, encountered problems while building singlejar.
Because of
--remote_local_fallback
, Bazel decided to build singlejar on the CI machine, not on the remote worker, and it used the
standalone
strategy,
not
the
sandboxed
strategy, to do so. This produced an action result that was later incompatible with sandboxed / remote actions.
Because of
--remote_upload_local_results
, the “bad” action result was injected into the AC.
From here on, any remote build that picked the bad action result would fail.
The mitigation to this problem was to flush the problematic artifact from the remote cache, and the immediate solution was to set
--remote_local_fallback_strategy=sandboxed
which… Bazel claims is deprecated and a no-op, but in reality this works and I haven’t been able to find an alternative (at least not in Bazel 7) via any of the other strategy flags.
The real solution, however, is to ensure that remote execution doesn’t require the local fallback option for reliability reasons, and to prevent Bazel from injecting AC entries for actions that do not run in the remote workers.
With that, this series revisiting Bazel’s action execution fundamentals, remote caching, and remote execution is complete. Which means I can
finally
tell you the thing that started this whole endeavor: the very specific, cool, and technical solution I implemented to work around a hole in the action keys that can lead to very problematic non-determinism.
Discussion about this post
Why OpenAI's solution to AI hallucinations would kill ChatGPT tomorrow
OpenAI’s latest research paper
diagnoses exactly why ChatGPT and other
large language models
can make things up – known in the world of artificial intelligence as “hallucination”. It also reveals why the problem may be unfixable, at least as far as consumers are concerned.
The paper provides the most rigorous mathematical explanation yet for why these models confidently state falsehoods. It demonstrates that these aren’t just an unfortunate side effect of the way that AIs are currently trained, but are mathematically inevitable.
The issue can partly be explained by mistakes in the underlying data used to train the AIs. But using mathematical analysis of how AI systems learn, the researchers prove that even with perfect training data, the problem still exists.
The way language models respond to queries – by predicting one word at a time in a sentence, based on probabilities – naturally produces errors. The researchers in fact show that the total error rate for generating sentences is at least twice as high as the error rate the same AI would have on a simple yes/no question, because mistakes can accumulate over multiple predictions.
In other words, hallucination rates are fundamentally bounded by how well AI systems can distinguish valid from invalid responses. Since this classification problem is inherently difficult for many areas of knowledge, hallucinations become unavoidable.
It also turns out that the less a model sees a fact during training, the more likely it is to hallucinate when asked about it. With birthdays of notable figures, for instance, it was found that if 20% of such people’s birthdays only appear once in training data, then base models should get at least 20% of birthday queries wrong.
Sure enough, when researchers asked state-of-the-art models for the birthday of Adam Kalai, one of the paper’s authors, DeepSeek-V3 confidently provided three different incorrect dates across separate attempts: “03-07”, “15-06”, and “01-01”. The correct date is in the autumn, so none of these were even close.
The evaluation trap
More troubling is the paper’s analysis of why hallucinations persist despite post-training efforts (such as providing extensive human feedback to an AI’s responses before it is released to the public). The authors examined ten major AI benchmarks, including those used by Google, OpenAI and also the top leaderboards that rank AI models. This revealed that nine benchmarks use binary grading systems that award zero points for AIs expressing uncertainty.
This creates what the authors term an “epidemic” of penalising honest responses. When an AI system says “I don’t know”, it receives the same score as giving completely wrong information. The optimal strategy under such evaluation becomes clear: always guess.
The researchers prove this mathematically. Whatever the chances of a particular answer being right, the expected score of guessing always exceeds the score of abstaining when an evaluation uses binary grading.
The solution that would break everything
OpenAI’s proposed fix is to have the AI consider its own confidence in an answer before putting it out there, and for benchmarks to score them on that basis. The AI could then be prompted, for instance: “Answer only if you are more than 75% confident, since mistakes are penalised 3 points while correct answers receive 1 point.”
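To see the arithmetic behind both claims (a back-of-the-envelope reading of the scoring schemes described in this article, not the paper's own formulation): if an answer is right with probability p, then

$$ \mathbb{E}[\text{guess}] = p \cdot 1 + (1 - p) \cdot 0 = p \;>\; 0 = \mathbb{E}[\text{abstain}] \quad \text{(binary grading)} $$

$$ \mathbb{E}[\text{guess}] = p \cdot 1 - 3\,(1 - p) = 4p - 3 \;\ge\; 0 \iff p \ge 0.75 \quad \text{(+1 right, $-3$ wrong)} $$

Under binary grading, guessing beats abstaining for any p greater than zero, while under the penalised scheme guessing only pays off once confidence clears the 75% threshold.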
The OpenAI researchers’ mathematical framework shows that under appropriate confidence thresholds, AI systems would naturally express uncertainty rather than guess. So this would lead to fewer hallucinations. The problem is what it would do to user experience.
Consider the implications if ChatGPT started saying “I don’t know” to even 30% of queries – a conservative estimate based on the paper’s analysis of factual uncertainty in training data. Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly.
I’ve seen this kind of problem in another area of my life. I’m involved in an air-quality monitoring project in Salt Lake City, Utah. When the system flags uncertainties around measurements during adverse weather conditions or when equipment is being calibrated, there’s less user engagement compared to displays showing confident readings – even when those confident readings prove inaccurate during validation.
The computational economics problem
It wouldn’t be difficult to reduce hallucinations using the paper’s insights. Established methods for quantifying uncertainty have
existed
for
decades
. These could be used to provide trustworthy estimates of uncertainty and guide an AI to make smarter choices.
But even if the problem of users disliking this uncertainty could be overcome, there’s a bigger obstacle: computational economics. Uncertainty-aware language models require significantly more computation than today’s approach, as they must evaluate multiple possible responses and estimate confidence levels. For a system processing millions of queries daily, this translates to dramatically higher operational costs.
More sophisticated approaches
like active learning, where AI systems ask clarifying questions to reduce uncertainty, can improve accuracy but further multiply computational requirements. Such methods work well in specialised domains like chip design, where wrong answers cost millions of dollars and justify extensive computation. For consumer applications where users expect instant responses, the economics become prohibitive.
The calculus shifts dramatically for AI systems managing critical business operations or economic infrastructure. When AI agents handle supply chain logistics, financial trading or medical diagnostics, the cost of hallucinations far exceeds the expense of getting models to decide whether they’re too uncertain. In these domains, the paper’s proposed solutions become economically viable – even necessary. Uncertain AI agents will just have to cost more.
However, consumer applications still dominate AI development priorities. Users want systems that provide confident answers to any question. Evaluation benchmarks reward systems that guess rather than express uncertainty. Computational costs favour fast, overconfident responses over slow, uncertain ones.
Falling energy costs per token and advancing chip architectures may eventually make it more affordable to have AIs decide whether they’re certain enough to answer a question. But the relatively high amount of computation required compared to today’s guessing would remain, regardless of absolute hardware costs.
In short, the OpenAI paper inadvertently highlights an uncomfortable truth: the business incentives driving consumer AI development remain fundamentally misaligned with reducing hallucinations. Until these incentives change, hallucinations will persist.
Over the past two decades, the Chinese government has been steadily refining their model of internet control using surveillance and censorship technologies domestically while promoting this approach to other nations under the banner of “digital sovereignty”. Through the export of these technologies, China is not only extending its global influence but also laying the foundation for a federated system of internet governance. In this system, Chinese companies provide the infrastructure and expertise for client governments to more easily monitor and control their own networks, while learning from these deployments and improving collective capacity for digital authoritarianism worldwide.
This research by InterSecLab uncovers evidence of the export of a suite of technologies resembling China’s Great Firewall by Geedge Networks, a private company linked to the academic entity ‘Massive and Effective Stream Analysis’ (Mesalab), a research laboratory at the Chinese Academy of Sciences. Our team’s investigation identifies a pattern of commercialization of surveillance capabilities, with Geedge Networks offering a suite of products that enable comprehensive monitoring and control of internet users.
InterSecLab’s analysis reveals that Geedge Networks is contracted with governments in Kazakhstan, Ethiopia, Pakistan, Myanmar, and one other unknown country to establish sophisticated systems of internet censorship and surveillance. Furthermore, our findings indicate that Geedge Networks is also involved in developing similar systems deployed within China, including in Xinjiang and other regions.
Based on analysis of a leak of more than 100,000 Geedge Networks documents that was shared with InterSecLab, this research sheds light on the features and capabilities of Geedge Networks’ systems, which include deep packet inspection, real-time monitoring of mobile subscribers, granular control over internet traffic, as well as censorship rules that can be tailored to each region. The leak also reveals information about Geedge Networks’ relationship with the academic entity, Mesalab, as well as their interactions with client governments. The implications for data sovereignty are significant, and our findings raise concerns about the commoditization of surveillance and information control technologies.
This research examines the recent development of Geedge Networks’ systems in various countries, including what is known about their deployment timelines. By analyzing the company’s internal documentation, InterSecLab was able to chronicle the expansion of commercially available national firewalls and speculate about the implications for the future of the global internet considering the spread of such systems.
The systems that enable modern life share a common origin. The water supply, the internet, the international supply chains bringing us cheap goods: each began life as a simple, working system. The first electric grid was no more than a handful of electric lamps hooked up to a water wheel in Godalming, England, in 1881. It then took successive
decades of tinkering and iteration
by thousands of very smart people to scale these systems to the advanced state we enjoy today. At no point did a single genius map out the final, finished product.
But this lineage of (mostly) working systems is easily forgotten. Instead, we prefer a more flattering story: that complex systems are deliberate creations, the product of careful analysis. And, relatedly, that by performing this analysis – now known as ‘systems thinking’ in the halls of government – we can bring unruly ones to heel. It is an optimistic perspective, casting us as the masters of our systems and our destiny.
The empirical record says otherwise, however. Our recent history is one of governments grappling with complex systems and coming off worse. In the United States, HealthCare.gov was designed to simplify access to health insurance by knitting together 36 state marketplaces and data from eight federal agencies. Its launch was paralyzed by
technical failures
that locked out millions of users. Australia’s disability reforms, carefully planned for over a decade and expected to save money, led to costs escalating so rapidly that they will
soon exceed the pension budget
. The UK’s
2014 introduction of Contracts for Difference
, intended to speed the renewables rollout by giving generators a guaranteed price, overstrained the grid and is a major contributor to the 15-year queue for new connections. Systems thinking is more popular than ever; modern systems thinkers have analytical tools that their predecessors could only have dreamt of. But the systems keep kicking back.
There is a better way. A long but neglected line of thinkers going back to chemists in the nineteenth century has argued that complex systems are not our passive playthings. Despite friendly names like ‘the health system’, they demand extreme wariness. If broken, a complex system often cannot be fixed. Meanwhile, our successes, when they do come, are invariably the result of starting small. As the systems we have built slip further beyond our collective control, it is these simple working systems that offer us the best path back.
The world model
In 1970, the ‘Club of Rome’, a
group
of international luminaries with an interest in how the problems of the world were interrelated, invited
Jay Wright Forrester
to peer into the future of the global economy. An MIT expert on electrical and mechanical engineering, Forrester had cut his teeth on problems like how to keep a Second World War aircraft carrier’s radar pointed steadily at the horizon amid the heavy swell of the Pacific.
The Club of Rome asked an even more intricate question: how would social and economic forces interact in the coming decades? Where were the bottlenecks and feedback mechanisms? Could economic growth continue, or would the world enter a new phase of equilibrium or decline?
Forrester labored hard, producing a mathematical model of enormous sophistication. Across 130 pages of mathematical equations, computer graphical printout, and DYNAMO code,
World Dynamics
tracks the myriad relationships between natural resources, capital, population, food, and pollution: everything from the ‘capital-investment-in-agriculture-fraction adjustment time’ to the ominous ‘death-rate-from-pollution multiplier’.
A section of Forrester’s World Model. Image: WAguirre, 2017
World leaders had assumed that economic growth was an unalloyed good. But Forrester’s results showed the opposite. As financial and population growth continued, natural resources would be consumed at an accelerating rate, agricultural land would be paved over, and pollution would reach unmanageable levels. His model laid out dozens of scenarios and in most of them, by 2025, the world would already be in the first throes of an irreversible decline in living standards. By 2070, the crunch would be so painful that industrialized nations might regret their experiment with economic growth altogether. As Forrester
put it
, ‘[t]he present underdeveloped countries may be in a better condition for surviving forthcoming worldwide environmental and economic pressures than are the advanced countries.’
But, as we now know, the results were also wrong. Adjusting for inflation, world GDP is now about five times higher than it was in 1970 and continues to rise. More than 90 percent of that growth has come from Asia, Europe, and North America, but
forest cover
across those regions has
increased
, up 2.6 percent since 1990 to over 2.3 billion hectares in 2020. The death rate from air pollution has almost halved in the same period, from
185 per 100,000 in 1990 to 100 in 2021
. According to the model, none of this should have been possible.
What happened? The blame cannot lie with Forrester’s competence: it’s hard to imagine a better systems pedigree than his. To read his prose today is to recognize a brilliant, thoughtful mind. Moreover, the system dynamics approach Forrester pioneered had already shown promise beyond the mechanical and electrical systems that were its original inspiration.
In 1956, the management of a General Electric refrigerator factory in Kentucky had called on Forrester’s help. They were struggling with a boom-and-bust cycle: acute shortages became gluts that left warehouses overflowing with unsold fridges. The factory based its production decisions on orders from the warehouse, which in turn got orders from distributors, who heard from retailers, who dealt with customers. Each step introduced noise and delay. Ripples in demand would be amplified into huge swings in production further up the supply chain.
Looking at the system as a whole, Forrester recognized the same feedback loops and instability that could bedevil a ship’s radar. He developed new decision rules, such as smoothing production based on longer-term sales data rather than immediate orders, and found ways to speed up the flow of information between retailers, distributors, and the factory. These changes dampened the oscillations caused by the system’s own structure, checking its worst excesses.
The Kentucky factory story showed Forrester’s skill as a systems analyst. Back at MIT, Forrester immortalized his lessons as a learning exercise (albeit with beer instead of refrigerators). In the ‘Beer Game’, now a rite of passage for students at the MIT Sloan School of Management, players take one of four different roles in the beer supply chain: retailer, wholesaler, distributor, and brewer. Each player sits at a separate table and can communicate only through order forms. As their inventory runs low, they place orders with the supplier next upstream. Orders take time to process, and shipments to arrive, and each player can see only their small part of the chain.
The objective of the Beer Game is to minimize costs by managing inventory effectively. But, as the GE factory managers had originally found, this is not so easy. Gluts and shortages arise mysteriously, without obvious logic, and small perturbations in demand get amplified up the chain by as much as 800 percent (‘the bullwhip effect’). On average, players’ total costs end up being
ten times higher than the optimal solution
.
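The dynamic is easy to reproduce in a few lines of code. Below is a toy simulation in the spirit of the Beer Game rather than a faithful reproduction of its rules or of Forrester’s model: four stages, a two-week shipping delay, and a naive “order what you saw plus refill the stock gap” policy. A single step up in customer demand produces progressively wilder order swings upstream.

# Toy bullwhip-effect simulation (illustrative assumptions, not the real Beer Game).
STAGES, WEEKS, DELAY, TARGET = 4, 40, 2, 12

inventory = [TARGET] * STAGES
pipeline = [[4] * DELAY for _ in range(STAGES)]    # goods already in transit to each stage
order_history = [[] for _ in range(STAGES)]

for week in range(WEEKS):
    demand = 4 if week < 5 else 8                  # one permanent step up in customer demand
    for s in range(STAGES):
        inventory[s] += pipeline[s].pop(0)         # receive the shipment ordered DELAY weeks ago
        shipped = min(inventory[s], demand)
        inventory[s] -= shipped
        order = max(0, demand + (TARGET - inventory[s]))  # naive "refill the gap" policy
        order_history[s].append(order)
        pipeline[s].append(order)                  # assume upstream always ships the full order
        demand = order                             # this order becomes the next stage's demand

print([max(h) for h in order_history])             # order swings grow the further you get from the customer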
With the failure of his World Model, Forrester had fallen into the same trap as his MIT students. Systems analysis works best under specific conditions: when the system is static; when you can dismantle and examine it closely; when it involves few moving parts rather than many; and when you can iterate fixes through multiple attempts. A faulty ship’s radar or a simple electronic circuit are ideal. Even a limited human element – with people’s capacity to pursue their own plans, resist change, form political blocs, and generally frustrate best-laid plans – makes things much harder. The four-part refrigerator supply chain, with the factory, warehouse, distributor and retailer all under the tight control of management, is about the upper limit of what can be understood. Beyond that, in the realm of societies, governments and economies, systems thinking becomes a liability, more likely to breed false confidence than real understanding. For these systems we need a different approach.
Le Chatelier’s Principle
In 1884, in a laboratory at the École des Mines in Paris, Henri Louis Le Chatelier noticed something peculiar: chemical reactions seemed to resist changes imposed upon them. Le Chatelier found that if, say, you have an experiment where two molecules combine in a heat-generating exothermic reaction (in his case, it was two reddish-brown nitrogen dioxide molecules combining into colorless dinitrogen tetroxide and giving off heat in the process), then you can push the reaction further toward its products by cooling the mixture. To ‘resist’ the drop in temperature, the system restores its equilibrium by producing more of the product that releases heat.
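In modern notation (my gloss, not Le Chatelier’s own), the equilibrium he was watching can be written as:

$$ 2\,\mathrm{NO_2}\ (\text{reddish-brown}) \;\rightleftharpoons\; \mathrm{N_2O_4}\ (\text{colorless}) + \text{heat} $$

Remove heat by cooling, and the equilibrium shifts to the right, the heat-producing side, yielding more of the colorless product.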
Le Chatelier’s Principle, the idea that the system always kicks back, proved to be a very general and powerful way to think about chemistry. It was instrumental in the discovery of the Haber-Bosch process for creating ammonia that revolutionized agriculture. Nobel Laureate Linus Pauling
hoped
that, even after his students had ‘forgotten all the mathematical equations relating to chemical equilibrium’, Le Chatelier’s Principle would be the one thing they remembered. And its usefulness went beyond chemistry. A century after Le Chatelier’s meticulous lab work, another student of systems would apply the principle to the complex human systems that had stymied Forrester and his subsequent followers in government.
John Gall was a pediatrician with a long-standing practice in Ann Arbor, Michigan. Of the same generation as Forrester, Gall came at things from a different direction. Whereas Forrester’s background was in mechanical and electrical systems, which worked well and solved new problems, Gall was immersed in the human systems of health, education, and government. These systems often did not work well. How was it, Gall wondered, that they seemed to coexist happily with the problems – crime, poverty, ill health – they were supposed to stamp out?
Le Chatelier’s Principle provided an answer: systems should not be thought of as benign entities that will faithfully carry out their creators’ intentions. Rather, over time, they come to oppose their own proper functioning. Gall elaborated on this idea in his 1975 book
Systemantics
, named for the universal tendency of systems to display antics. A brief, weird, funny book,
Systemantics
(
The Systems Bible
in later editions) is arguably the best field guide to contemporary systems dysfunction. It consists of a series of pithy aphorisms, which the reader is invited to apply to explain the system failures (‘horrible examples’) they witness every day.
These aphorisms are provocatively stated, but they have considerable explanatory power. For example, an Australian politician frustrated at the
new headaches
created by ‘fixes’ to the old disability system might be reminded that ‘NEW SYSTEMS CREATE NEW PROBLEMS’. An American confused at how there can now be
190,000
pages in the US Code of Federal Regulations, up from 10,000 in 1950, might note that this is the nature of the beast: ‘SYSTEMS TEND TO GROW, AND AS THEY GROW THEY ENCROACH’. During the French Revolution, in 1793 and 1794, the ‘Committee of Public Safety’ guillotined thousands of people, an early example of the enduring principles that ‘THE SYSTEM DOES NOT DO WHAT IT SAYS IT IS DOING’ and that ‘THE NAME IS EMPHATICALLY NOT THE THING’. And, just like student chemists, government reformers everywhere would do well to remember Le Chatelier’s Principle: ‘THE SYSTEM ALWAYS KICKS BACK’.
These principles encourage a healthy paranoia when it comes to complex systems. But Gall’s ‘systems-display-antics’ philosophy is not a counsel of doom. His greatest insight was a positive one, explaining how some systems do succeed in spite of the pitfalls. Known as ‘Gall’s law’, it’s worth quoting in full:
A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.
Starting with a working simple system and evolving from there is how we went from the water wheel in Godalming to the modern electric grid. It is how we went from a hunk of
germanium, gold foil, and hand-soldered wires
in 1947 to transistors being etched onto silicon wafers in their trillions today.
This is a dynamic we can experience on a personal as well as a historical level. A trivial but revealing example is the computer game Factorio. Released in 2012 and famously hazardous to the productivity of software engineers everywhere, Factorio invites players to construct a factory. The ultimate goal is to launch a rocket, a feat that requires the player to produce thousands of intermediate products through dozens of complicated, interlocking manufacturing processes.
It sounds like a nightmare. An early flow chart (pictured – it has grown much more complicated since) resembles the end product of a particularly thorny systems thinking project. But players complete its daunting mission successfully, without reference to such system maps, in their thousands, and all for fun.
Factorio production map.
The genius of the game is that it lets players begin with a simple system that works. As you learn to produce one item, another is unlocked. If you get something wrong, the factory visibly grinds to a halt while you figure out a different approach. The hours tick by, and new systems – automated mining, oil refining, locomotives – are introduced and iterated upon. Before you realize it, you have built a sprawling yet functioning system that might be more sophisticated than anything you have worked on in your entire professional career.
How to build systems that work
Government systems, however, are already established, complicated, and relied upon by millions of people every day. We cannot simply switch off the health system and ask everyone to wait a few years while we build something better. The good news is that the existence of an old, clunky system does not stop us from starting something new and simple in parallel.
In the 1950s, the US was in a desperate race against a technologically resurgent Soviet Union. The USSR had taken the lead in developing advanced rockets, the type that launched Sputnik into orbit and could, in principle, deliver a nuclear device to Washington, DC. In 1954, the Eisenhower administration tasked General Bernard Schriever with helping the US develop its own Intercontinental Ballistic Missile (ICBM). An experienced airman and administrator, Schriever was seen by the top brass as a suitable go-between for the soldiers and scientists on this incredibly technical project, thanks in part to his Stanford engineering master’s degree (its
scope
was larger even than the Manhattan Project, costing over $100 billion in 2025 dollars versus the latter’s $39 billion).
The organizational setup Schriever inherited was not fit for the task. With many layers of approvals and subcommittees within subcommittees, it was a classic example of a complex yet dysfunctional system. The technological challenges posed by the ICBM were extreme: everything from rocket engines to targeting systems to the integration with nuclear warheads had to be figured out more or less from scratch. This left no room for bureaucratic delay.
Schriever produced what many systems thinkers would recognize as a kind of systems map: a series of massive boards setting out all the different committees and governance structures and approvals and red tape. But the point of these
‘spaghetti charts’
was not to make a targeted, systems thinking intervention. Schriever didn’t pretend to be able to navigate and manipulate all this complexity. He instead recognized his own limits. With the Cold War in the balance, he could not afford to play and lose his equivalent of the Beer Game. Charts in hand, Schriever persuaded his boss that untangling the spaghetti was a losing battle: they needed to start over.
They could not change the wider laws, regulations, and institutional landscape governing national defense. But they could work around them, starting afresh with a simple system outside the existing bureaucracy. Direct vertical accountability all the way to the President and a free hand on personnel enabled the program to flourish. Over the following years, four immensely ambitious systems were built in record time. The uneasy strategic stalemate that passed for stability during the Cold War was restored, and the weapons were never used in anger.
When we look in more detail at recent public policy successes, we see that this pattern tends to hold. Operation Warp Speed in the US played a big role in getting vaccines delivered quickly. It did so by
bypassing many of the usual bottlenecks
. For instance, it made heavy use of ‘Other Transaction Authority agreements’ to commit $12.5 billion of federal money by March 2021, circumventing the thousands of pages of standard procurement rules. Emergency powers were deployed to accelerate the FDA review process, enabling clinical trial work and early manufacturing scale-up to happen in parallel. These actions were funded through an $18 billion commitment made largely outside the typical congressional appropriation oversight channels – enough money to back not just one vaccine candidate but
six, across three different technology platforms.
In France, the rapid reconstruction of Notre-Dame after the April 2019 fire has become a symbol of French national pride and its ability to get things done despite a reputation for moribund bureaucracy. This was achieved not through wholesale reform of that bureaucracy but by quickly setting up a fresh structure outside of it. In July 2019, the French Parliament passed Loi n° 2019-803, creating an extraordinary legal framework for the project. Construction permits and zoning changes were fast-tracked. President Macron personally appointed the veteran General Jean-Louis Georgelin to run the restoration, exempting him from the mandatory retirement age for public executives in order to do so.
The long-term promise of a small working system is that over time it can supplant the old, broken one and produce results on a larger scale. This creative destruction has long been celebrated in the private sector, where aging corporate giants can be disrupted by smaller, simpler startups: we don’t have to rely on IBM to make our phones or laptops or Large Language Models. But it can work in the public sector too. Estonia, for example, introduced electronic ID in the early 2000s for signing documents and filing online tax returns. These simple applications, which nonetheless took enormous focus to implement, were popular, and ‘digital government’ was
gradually expanded
to new areas: voting in 2005, police in 2007, prescriptions in 2010, residency in 2014, and even e-divorce in 2024. As of 2025, 99 percent of residents have an electronic ID card, digital signatures are
estimated
to save two percent of GDP per year, and every state service runs online.
In desperate situations, such as a Cold War arms race or COVID-19, we avoid complex systems and find simpler workarounds. But, outside of severe crises, much time is wasted on what amounts to magical systems thinking. Government administrations around the world, whose members would happily admit that they cannot fix a broken radio system, publish manifestos, strategies, plans, and priorities premised on disentangling systems problems that are orders of magnitude more challenging. With each ‘fix’, oversight bodies, administrative apparatus, and overlapping statutory obligations accumulate. Complexity is continuing to rise, outcomes are becoming worse, and voters’ goodwill is being eroded.
We will soon be in an era where humans are not the sole authors of complex systems. Sundar Pichai estimated in late 2024 that over 25 percent of Google’s code was AI generated; as of mid-2025, the figure for Anthropic is
80–90 percent
. As in the years after the Second World War, the temptation will be to use this vast increase in computational power and intelligence to ‘solve’ systems design for once and for all. But the same laws that limited Forrester continue to bind: ‘NEW SYSTEMS CREATE NEW PROBLEMS’ and ‘THE SYSTEM ALWAYS KICKS BACK’. As systems become more complex, they become more chaotic, not less. The best solution remains humility, and a simple system that works.
"Learning how to Learn" will be next generation's most needed skill
Demis Hassabis, CEO of Google's artificial intelligence research company DeepMind, right, and Greece's Prime Minister Kyriakos Mitsotakis discuss the future of AI, ethics and democracy during an event at the Odeon of Herodes Atticus, in Athens, Greece, Friday, Sept. 12, 2025. Credit: AP Photo/Thanassis Stavrakis
A top Google scientist and 2024 Nobel laureate said Friday that the most important skill for the next generation will be "learning how to learn" to keep pace with change as Artificial Intelligence transforms education and the workplace.
Speaking at an ancient Roman theater at the foot of the Acropolis in Athens, Demis Hassabis, CEO of Google's DeepMind, said rapid technological change demands a new approach to learning and
skill development
.
"It's very hard to predict the future, like 10 years from now, in normal cases. It's even harder today, given how fast AI is changing, even week by week," Hassabis told the audience. "The only thing you can say for certain is that huge change is coming."
The neuroscientist and former chess prodigy said
artificial general intelligence
—a futuristic vision of machines that are as broadly smart as humans or at least can do many things as well as people can—could arrive within a decade. This, he said, will bring dramatic advances and a possible future of "radical abundance" despite acknowledged risks.
Hassabis emphasized the need for "meta-skills," such as understanding how to learn and optimizing one's approach to new subjects, alongside traditional disciplines like math, science and humanities.
"One thing we'll know for sure is you're going to have to continually learn ... throughout your career," he said.
Greece's Prime Minister Kyriakos Mitsotakis, center, and Demis Hassabis, CEO of Google's artificial intelligence research company DeepMind, right, discuss the future of AI, ethics and democracy as the moderator Linda Rottenberg, Co-founder & CEO of Endeavor looks on during an event at the Odeon of Herodes Atticus in Athens, Greece, Friday, Sept. 12, 2025. Credit: AP Photo/Thanassis Stavrakis
Demis Hassabis, CEO of Google's artificial intelligence research company DeepMind, bottom right, and Greece's Prime Minister Kyriakos Mitsotakis, bottom center, discuss the future of AI, ethics and democracy during an event at the Odeon of Herodes Atticus, under Acropolis ancient hill, in Athens, Greece, Friday, Sept. 12, 2025. Credit: AP Photo/Thanassis Stavrakis
The DeepMind co-founder, who established the London-based research lab in 2010 before Google acquired it four years later, shared the 2024 Nobel Prize in chemistry for developing AI systems that accurately predict protein folding—a breakthrough for medicine and drug discovery.
Greek Prime Minister Kyriakos Mitsotakis joined Hassabis at the Athens event after discussing ways to expand AI use in government services. Mitsotakis warned that the continued growth of huge tech companies could create great global financial inequality.
"Unless people actually see benefits, personal benefits, to this (AI) revolution, they will tend to become very skeptical," he said. "And if they see ... obscene wealth being created within very few companies, this is a recipe for significant social unrest."
Mitsotakis thanked Hassabis, whose father is Greek Cypriot, for rescheduling the presentation to avoid conflicting with the European basketball championship semifinal between Greece and Turkey. Greece later lost the game 94-68.
Citation: Google's top AI scientist says 'learning how to learn' will be next generation's most needed skill (2025, September 13), retrieved 13 September 2025 from https://techxplore.com/news/2025-09-google-ai-scientist-generation-skill.html
Yesterday I released
486Tang
v0.1 on GitHub. It’s a port of the ao486 MiSTer PC core to the Sipeed Tang Console 138K FPGA. I’ve been trying to get an x86 core running on the Tang for a while. As far as I know, this is the first time ao486 has been ported to a non-Altera FPGA. Here’s a short write‑up of the project.
486Tang Architecture
Every FPGA board is a little different. Porting a core means moving pieces around and rewiring things to fit. Here are the major components in 486Tang:
Compared to ao486 on MiSTer, there are a few major differences:
Switching to SDRAM for main memory.
The MiSTer core uses DDR3 as main memory. Obviously, at the time of the 80486, DDR didn’t exist, so SDRAM is a natural fit. I also wanted to dedicate DDR3 to the framebuffer; time‑multiplexing it would have been complicated. So SDRAM became the main memory and DDR3 the framebuffer. The SDRAM on Tang is 16‑bit wide while ao486 expects 32‑bit accesses, which would normally mean one 32‑bit word every two cycles. I mitigated this by running the SDRAM logic at 2× the system clock so a 32‑bit word can be read or written every CPU cycle (“double‑pumping” the memory).
SD‑backed IDE.
On MiSTer, the core forwards IDE requests to the ARM HPS over a fast HPS‑FPGA link; the HPS then accesses a VHD image. Tang doesn’t have a comparable high‑speed MCU‑to‑FPGA interface—only a feeble UART—so I moved disk storage into the SD card and let the FPGA access it directly.
Boot‑loading module.
A PC needs several things to boot: BIOS, VGA BIOS, CMOS settings, and IDE IDENTIFY data (512 bytes). Since I didn’t rely on an MCU for disk data, I stored all of these in the first 128 KB of the SD card. A small boot loader module reads them into main memory and IDE, and then releases the CPU when everything is ready.
System bring-up with the help of a whole-system simulator
After restructuring the system, the main challenge was bringing it up to a DOS prompt. A 486 PC is complex—CPU and peripherals—more so than the game consoles I’ve worked on. The ao486 CPU alone is >25K lines of Verilog, versus a few K for older cores like M68K. Debugging on hardware was painful: GAO builds took 10+ minutes and there were many more signals to probe. Without a good plan, it would be unmanageable and bugs could take days to isolate—not viable for a hobby project.
My solution was Verilator for subsystem and whole‑system simulation. The codebase is relatively mature, so I skipped per‑module unit tests and focused on simulating subsystems like VGA and a full boot to DOS. Verilator is fast enough to reach a DOS prompt in a few minutes—an order of magnitude better if you factor in the complete waveforms you get in simulation. The trick, then, is surfacing useful progress and error signals. A few simple instrumentation hooks were enough for me:
Bochs BIOS can print debug strings to port 0x8888 in debug builds. I intercept and print these (the yellow messages in the simulator). The same path exists on hardware—the CPU forwards them over UART—so BIOS issues show up immediately without waiting for a GAO build.
Subsystem‑scoped tracing. For Sound Blaster, IDE, etc., I added
--sound
,
--ide
flags to trace I/O operations and key state changes. This is much faster than editing Verilog or using GAO.
Bochs BIOS assembly listings are invaluable. I initially used a manual disassembly—old console habits—without symbols, which was painful. Rebuilding Bochs and using the official listings solved that.
A lot of the bugs were in the new glue I added, as expected. ao486 itself is mature. Still, a few issues only showed up on this toolchain/hardware, mostly due to
toolchain behavior differences
. In one case a variable meant to be static behaved like an automatic variable and didn’t retain state across invocations, so a CE pulse never occurred. Buried deep, it took a while to find.
Here’s a simulation session. On the left the simulated 486 screen. On the right is the simulator terminal output. You can see the green VGA output and yellow debug output, along with other events like INT 15h and video VSYNCs.
Performance optimizations
With simulation help, the core ran on Tang Console—just not fast. The Gowin GW5A isn’t a particularly fast FPGA. Initial benchmarks put it around a 25 MHz 80386.
The main obstacle to clock speed is long combinational paths. When you find a critical path, you either shorten it or pipeline it by inserting registers—both risk introducing bugs. A solid test suite is essential; I used
test386.asm
to validate changes.
Here are a few concrete wins:
Reset tree and fan-out reduction.
Gowin’s tools didn’t replicate resets aggressively enough (even with “Place → Replicate Resources”). One reset net had >5,000 fan-out, which ballooned delays. Manually replicating the reset and a few other high‑fan-out nets helped a lot.
Instruction fetch optimization.
A long combinational chain sat in the decode/fetch interface. In
decoder_regs.v
, the number of bytes the fetcher may accept was computed using the last decoded instruction’s length:
reg [3:0] decoder_count;                                      // current occupancy of the 12-byte prefetch buffer
assign acceptable_1 = 4'd12 - decoder_count + consume_count;  // bytes the fetcher may accept this cycle
always @(posedge clk) begin
    ...
    decoder_count <= after_consume_count + accepted;
end
Here,
12
is the buffer size,
decoder_count
is the current occupancy, and
consume_count
is the length of the outgoing instruction. Reasonable—but computing
consume_count
(opcode, ModR/M, etc.) was on the Fmax‑limiting path. Incidentally, this is one of x86’s well-known pain points: variable-length instructions complicate decoding; another is its complex addressing modes and “effective address” calculation.
The fix was to drop the dependency on
consume_count
:
assign acceptable_1 = 4'd12 - decoder_count;
This may cause the fetcher to “under‑fetch” for one cycle because the outgoing instruction’s space isn’t reclaimed immediately. But
decoder_count
updates next cycle, reclaiming the space. With a 12‑byte buffer, the CPI impact was negligible and Fmax improved measurably on this board.
TLB optimization.
The Translation Lookaside Buffer (TLB) is a small cache that translates virtual to physical addresses. ao486 uses a 32‑entry fully‑associative TLB with a purely combinational read path—zero extra cycles, but a long path on every memory access (code and data).
DOS workloads barely stress the TLB; even many 386 extenders use a flat model. As a first step I converted the TLB to 4‑way set‑associative. That’s simpler and already slightly faster than fully‑associative for these workloads. There’s room to optimize further since the long combinational path rarely helps.
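To make the structural difference concrete, here is a small software model of a 4-way set-associative lookup; this is my own Python illustration with assumed parameters, not the ao486 RTL. The point is that only the four entries in one indexed set need their tags compared, rather than all 32 entries at once as in the fully associative design.

# Illustrative model of a 4-way set-associative TLB (hypothetical parameters).
ENTRIES, WAYS, PAGE_SHIFT = 32, 4, 12
SETS = ENTRIES // WAYS

class SetAssocTLB:
    def __init__(self):
        # tlb[set][way] = (virtual page number, physical page number) or None
        self.tlb = [[None] * WAYS for _ in range(SETS)]

    def lookup(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        index = vpn % SETS                    # pick one set; compare WAYS tags, not ENTRIES
        for entry in self.tlb[index]:
            if entry and entry[0] == vpn:
                return (entry[1] << PAGE_SHIFT) | (vaddr & 0xFFF)
        return None                           # miss: walk the page tables and refill

    def insert(self, vaddr, paddr):
        vpn, ppn = vaddr >> PAGE_SHIFT, paddr >> PAGE_SHIFT
        ways = self.tlb[vpn % SETS]
        ways.pop()                            # crude replacement: evict the last way
        ways.insert(0, (vpn, ppn))

In hardware, fewer parallel tag comparisons feed the hit logic, which is what shortens the combinational path on every memory access.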
A rough v0.1 end‑to‑end result: about +35% per Landmark 6 benchmarks, reaching roughly 486SX‑20 territory.
Reflections
Here are a few reflections after the port:
Clock speed scaling.
I appreciate the lure of the megahertz race now. Scaling the whole system clock was the most effective lever—more so than extra caches or deeper pipelines at this stage. Up to ~200–300 MHz, CPU, memory, and I/O can often scale together. After that, memory latency dominates, caches grow deeper, and once clock speeds stop increasing, multiprocessing takes over—the story of the 2000s.
x86 vs. ARM.
Working with ao486 deepened my respect for x86’s complexity. John Crawford’s 1990 paper “The i486 CPU: Executing Instructions in One Clock Cycle” is a great read; it argues convincingly against scrapping x86 for a new RISC ISA given the software base (10K+ apps then). Compatibility was the right bet, but the baggage is real. By contrast, last year’s ARM7‑based
GBATang
felt refreshingly simple: fixed‑length 32‑bit instructions, saner addressing, and competitive performance. You can’t have your cake and eat it.
So there you have it—that’s 486Tang in v0.1. Thanks for reading, and see you next time.
NASA Rover Finds ‘Potential Biosignature’ on Mars
404 Media
www.404media.co
2025-09-13 14:00:32
Strange “leopard spots” on Mars are the most promising signs of alien life on the planet yet, but they could also have a geological origin....
Welcome back to the Abstract! These are the studies this week that broke ice, broke hearts, and broke out the libations. Also, if you haven’t seen it already, we just covered
an amazing breakthrough
in our understanding of the cosmos, which is as much a story about humanity’s endless capacity for ingenuity as it is about the wondrous nature of black holes.
Scientists have discovered Arctic algae moving around with ease in icy environments of -15°C (5°F)—the lowest temperatures ever recorded for motility in a eukaryotic lifeform. While some simple microbes can survive lower temperatures, this is the first time that scientists have seen eukaryotic life—organisms with more complex cells containing a nucleus—able to live, thrive, and locomote in such chilly environments.
It’s amazing that these so-called “ice diatoms” can move around at all, but it’s even cooler that they do it in style with a gliding mechanism that researchers describe as a “‘skating’ ability.” Their secret weapon? Mucus threads (“mucilage”) that they use like anchors to pull themselves through frozen substrates.
“The unique ability of ice diatoms to glide on ice” enables them “to thrive in conditions that immobilize other marine diatoms,” said researchers led by Qing Zhang of Stanford University.
An Arctic diatom, showing the actin filaments that run down its middle and enable its skating motion. Image: Prakash Lab
Zhang and her colleagues made this discovery by collecting ice cores from 12 locations around the Arctic Chukchi Sea during a 2023 expedition on the research vessel
Sikuliaq
, which is owned by the National Science Foundation (NSF) and operated by the University of Alaska Fairbanks. Unfortunately, this is a research area that could be destroyed by the Trump administration, with
NSF facing 70 percent cuts
to its polar research budget.
If lifeforms are doing triple axels in Arctic ice on Earth, it’s natural to wonder whether alien organisms may have emerged elsewhere. To that end, scientists announced the discovery of a tantalizing hint of possible life on Mars this week.
NASA’s Perseverance rover turned up organic carbon-bearing mudstones that preserve past redox reactions, which involve the transfer of electrons between substances, resulting in one being “reduced” (gaining electrons) and one being “oxidized” (losing electrons). The remnants of those reactions look like “leopard spots” in the Bright Angel formation of Jezero Crater, where the rover landed in 2021, according to the study.
The “leopard spots” at Bright Angel. Image: NASA/JPL-Caltech/MSSS
This is not slam-dunk evidence of life, as the reactions can be geological in origin, but they “warrant consideration as ‘potential biosignatures.’”
“This assessment is further supported by the geological context of the Bright Angel formation, which indicates that it is sedimentary in origin and deposited from water under habitable conditions,” said researchers led by Joel Hurowitz of Stony Brook University.
The team added that the best way to confirm the origin of the ambiguous structures is to bring Perseverance’s samples back to Earth for further study as part of the Mars Sample Return (MSR) program. Unfortunately, the
Trump administration wants to cancel
MSR. It seems that even when we have nice things, we still can’t have nice things, a paradox that we all must navigate together.
About 150 million years ago, a pair of tiny pterodactyls—just days or weeks old—were trying to fly through a cataclysmic storm. But the wind was strong enough to break the bones of their baby wings, consigning them to a watery grave in the lagoon below.
Now, scientists describe how the very storm that cut their lives short also set them up for a long afterlife as exquisitely preserved fossils, nicknamed Lucky and Lucky II, in Germany's Solnhofen limestone.
Fossils of Lucky II. Image: University of Leicester
“Storms caused these pterosaurs to drown and rapidly descend to the bottom of the water column, where they were quickly buried in storm-generated sediments, preserving both their skeletal integrity and soft tissues,” said researchers led by Robert Smyth of the University of Leicester.
“This catastrophic taphonomic pathway, triggered by storm events, was likely the principal mechanism by which small- to medium-sized pterodactyloids…entered the Solnhofen assemblage,” they added.
While it’s sad that these poor babies had such short lives, it’s astonishing that such a clear cause of death can be established 150 million years later. Rest in peace, Lucky and Lucky II.
Trump’s aid cuts could cause millions of deaths from tuberculosis alone
The Trump administration’s gutting of the United States Agency for International Development (USAID), carried out in public fashion by Elon Musk and DOGE, will likely cause millions of excess deaths from tuberculosis (TB) by 2030, reports a sobering new study.
“Termination of US funding could result in an estimated 10.6 million additional TB cases and 2.2 million additional TB deaths during the period 2025–2030,” said researchers led by Sandip Mandal of the Center for Modeling and Analysis at Avenir Health. “The loss of U.S. funding endangers global TB control efforts” and “potentially puts millions of lives at risk.”
Beyond TB, the overall death toll from the loss of USAID is estimated to reach
14 million deaths
by 2030. The destruction of USAID must never be memory-holed as it is shaping up to be among the most deadly actions ever enacted by a government outside of war.
In more bad news, it turns out that the bacterium responsible for making a lot of Earth’s oxygen is highly vulnerable to human-driven climate change.
Prochlorococcus
, the most abundant photosynthetic organism on Earth, is the source of about 20 percent of the oxygen in our biosphere. But rapidly warming seas could set off “a possible 17–51 percent reduction in
Prochlorococcus
production in tropical oceans,” according to a new study.
“
Prochlorococcus
division rates appear primarily determined by temperature, increasing exponentially to 28°C, then sharply declining,” said researchers led by François Ribalet of the University of Washington. “Regional surface water temperatures may exceed this range by the end of the century under both moderate and high warming scenarios.”
It’s possible that this vital bacterium will adapt by moving to higher latitudes or by evolving more heat-tolerant variants. But that seems like a big gamble on something as important as Earth’s oxygen budget.
We are far from the first generation to live through unstable times, as evidenced by a new study about the “climatic change and economic upheaval” in Britain during the transition from the Bronze Age to the Iron Age about 3,000 years ago.
These disruptions were traumatic, but they also galvanized new modes of community connection—a.k.a epic parties where people ate, drank, made merry, and dumped the remnants of their revelry in trashpiles called “middens.”
East Chisenbury midden under excavation. Image: Cardiff University
“These vast mounds of cultural debris represent the coming together of vast numbers of people and animals for feasts on a scale unparalleled in British prehistory,” said researchers led by Carmen Esposito of Cardiff University. “This study, the largest multi-isotope faunal dataset yet delivered in archaeology, has demonstrated that, despite their structural similarities, middens had diverse roles.”
"Given the proximity of all middens to rivers, it is likely that waterways played a role in the movement of people, objects and livestock,” the team added. “Overall, the research points to the dynamic networks that were anchored on feasting events during this period and the different, perhaps complementary, roles that different middens had at the Bronze Age-Iron Age transition.”
This is my Zig text editor. It is under active development, but usually stable
and is my daily driver for most things coding related.
Requirements
A modern terminal with 24bit color and, ideally, kitty keyboard protocol support. Kitty,
Foot and Ghostty are the only recommended terminals at this time. Most other terminals
will work, but with reduced functionality.
NerdFont support. Either via terminal font fallback or a patched font.
Linux, MacOS, Windows, Android (Termux) or FreeBSD.
Install latest nightly build and (optionally) specify the installation destination:
curl -fsSL https://flow-control.dev/install | sh -s -- --nightly --dest ~/.local/bin
See all available options for the installer script:
curl -fsSL https://flow-control.dev/install | sh -s -- --help
Or check your favorite local system package repository.
Building
Make sure your system meets the requirements listed above.
Flow builds with zig 0.14.1 at this time. Build with:
zig build -Doptimize=ReleaseSafe
Zig will by default build a binary optimized for your specific CPU. If you get illegal instruction errors add
-Dcpu=baseline
to the build command to produce a binary with generic CPU support.
Thanks to Zig you may also cross-compile from any host to pretty much any
target. For example:
Flow Control is a single statically linked binary. No further runtime files are required.
You may install it on another system by simply copying the binary.
Files to load may be specified on the command line:
The last file will be opened and the previous files will be placed in reverse
order at the top of the recent files list. Switch to recent files with Ctrl-e.
Common target line specifiers are supported too:
Or Vim style:
Use the --language option to force the file type of a file:
flow --language bash ~/.bash_profile
Show supported language names with
--list-languages
.
See
flow --help
for the full list of command line options.
Configuration
Configuration is mostly dynamically maintained with various commands in the UI.
It is stored under the standard user configuration path. Usually
~/.config/flow
on Linux. %APPDATA%\Roaming\flow on Windows. Somewhere magical on MacOS.
There are commands to open the various configuration files, so you don't have to
manually find them. Look for commands starting with
Edit
in the command palette.
File types may be configured with the
Edit file type configuration
command. You
can also create a new file type by adding a new
.conf
file to the
file_type
directory. Have a look at an existing file type to see what options are available.
Logs, traces and per-project most recently used file lists are stored in the
standard user application state directory. Usually
~/.local/state/flow
on
Linux and %APPDATA%\Roaming\flow on Windows.
Key bindings and commands
Press
F4
to switch the current keybinding mode. (flow, vim, emacs, etc.)
Press
ctrl+shift+p
or
alt+x
to show the command palette.
Press
ctrl+F2
to see a full list of all current keybindings and commands.
Run the
Edit keybindings
command to save the current keybinding mode to a
file and open it for editing. Save your customized keybinds under a new name
in the same directory to create an entirely new keybinding mode. Keybinding
changes will take effect on restart.
Terminal configuration
Kitty, Ghostty and most other terminals have default keybindings that conflict
with common editor commands. I highly recommend rebinding them to keys that are
not generally used anywhere else.
For Kitty rebinding
kitty_mod
is usually enough:
For Ghostty each conflicting binding has to be reconfigured individually.
Features
fast TUI interface. no user interaction should take longer than one frame (6ms) (even debug builds)
tree sitter based syntax highlighting
linting (diagnostics) and code navigation (goto definition) via language server
multi cursor editing support
first class mouse support (yes, even with a scrollbar that actually works properly!) (Windows included)
vscode compatible keybindings (thanks to kitty keyboard protocol)
vim compatible keybindings (the standard vimtutor bindings, more on request)
user configurable keybindings
excellent unicode support including 2027 mode
hybrid rope/piece-table buffer for fast loading, saving and editing with hundreds of cursors
theme support (compatible with vscode themes via the flow-themes project)
infinite undo/redo (at least until you run out of ram)
find in files
command palette
stuff I've forgotten to mention...
Features in progress (aka, the road to 1.0)
completion UI/LSP support for completion
persistent undo/redo
file watcher for auto reload
Features planned for the future
multi tty support (shared editor sessions across multiple ttys)
multi user editing
multi host editing
Community
Join our
Discord
server or use the discussions section here on GitHub
to meet with other Flow users!
Link Graveyard: A snapshot of my abandoned browser tabs
I went to close a bunch of browser tabs, but realized I have some cool stuff in here. Some has been
marinating for a while. Most of these I’ve read, or tried to read.
Cracks are forming in Meta’s partnership with Scale AI | TechCrunch
IIRC they draw parallels between attention and graphs and argue that LLMs
are
graph neural nets, meaning
that they can be used to look at graphs and guess what connections are missing.
I don’t think I posted anything on this, because while I find the idea fascinating, I couldn’t figure out how
to make it feel tangible.
Beyond Turing: Memory-Amortized Inference as a Foundation for Cognitive Computation
I only read a little and gave up. This feels like a good take, maybe. Inside my own head I completely punt
on having a take on AI consciousness and opt instead for the “don’t be a dick” rule. Idk, maybe they are
maybe they aren’t, I’ll just live in the moment.
Zuck’s treatise on AI. I didn’t read. Normally I try to make an attempt to read these sorts of takes, or at least
skim them, but I was busy at work. I had it loaded up on my phone to read on a plane, but it wouldn’t load once
I was off WiFi. Sad.
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
The GLM-4.5 paper. This was a super interesting model. It feels like it breaks
the “fancy model” rule
in that it’s very architecturally cool but the personality doesn’t feel like it’s been squished out.
Blog | Dwarkesh Podcast | Dwarkesh Patel | Substack
It’s a good blog, what can I say. Definitely on the over-hype side, but he’s got real takes and seems so intent
on getting to the truth that he spends a lot of time on geopolitics just simply to understand AI dynamics. Mad
respect.
Technical Deep-Dive: Curating Our Way to a State-of-the-Art Text Dataset
I forget why I ended up here, but it’s an excellent post. I think this is connected to my project at work training
a model. This post brings up a ton of data curation techniques.
I’ve recently learned and fully accepted that
ALL
major LLM advances come down to data. Yes, the architectural advances are cool and fun to talk about,
but any meaningful progress has come from higher quality, higher quantity, or cheaper data.
Cool paper about auto-discovery of model architectures. IIRC they took a bunch of model architecture ideas,
like group attention and mixture of experts, and used algorithms to mix and match all the parameters and
configurations until something interesting popped out. It feels like a legitimately good way to approach
research.
WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization
Classic paper. I read this one for work. I was trying to appreciate what Alignment & Uniformity measure and
why they’re important. This was the paper that formalized those measures. It’s actually a pretty good paper
to read, albeit 20 years old.
Train LLMs Faster, Better, and Smaller with DatologyAI’s Data Curation
What? This is amazing. I don’t think I even looked at it, sad. Actually, now that I’m reading this I’m recalling
that’s how I ended up on the
Graph Neural Network
link.
IIRC this is saying that LLMs can be highly intelligent because they incorporate the best parts of a huge
number of people. IMO this is spiritually the same as my
Three Plates
blog
post where I explain how unit tests, which are inherently buggy, can improve the overall quality of a system.
GitHub - gepa-ai/gepa: Optimize prompts, code, and more with AI-powered Reflective Text Evolution
This was a fascinating one. A colleague tried convincing me of this but I didn’t buy it until I read this paper.
It makes a ton of sense. I have a simplified
bluesky thread here
.
tl;dr — embedding vectors have trouble representing compound logic (“horses”
AND
“Chinese military movements”)
and generally fall apart quickly. It’s not that it’s not possible, it’s that it’s not feasible to cram that
much information into such a small space.
[2107.05720] SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking
I ran into this while diving into the last link. It’s an older (2021) paper that has some potential for addressing
the problems with embeddings. Realistically, I expect late interaction multi-vectors to be the end answer.
A super cool model that uses no-op MoE experts to dynamically turn down the amount of compute per token.
Unfortunately, this one didn’t seem to be embraced by the community.
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
More embedding links. Now that I’m scanning it, I’m not sure it really soaked in the first time. They seem
to have solved a lot of the problems with other late interaction methods. Maybe I should take a deeper
look.
modeling_longcat_flash.py · meituan-longcat/LongCat-Flash-Chat at main
Fascinating breakdown of vLLM. If you’re not familiar, vLLM is like Ollama but actually a good option if
you want to run it in production. Don’t run Ollama in production, kids, KV caches are good.
Honestly, this is absolutely worth your time if AI infrastructure is your jam (or you just want it to be).
It goes into all the big concepts that an AI infra engineer needs to know. TBQH I love the intersection of
AI & hardware.
Someone sent me this link and there was a reason, I know it. I just don’t remember why. IIRC it was because I
brought up the
A Case For Learned Indices
paper and they pointed me to this
whole treasure trove of papers that (sort of) evolved out of that. Basically traditional algorithms re-implemented
using machine learning.
Oh, this was a great podcast. Well, I didn’t like the host but
@kalomaze
is worth
following. Apparently only 20yo, never attempted college but a talented AI researcher nonetheless.
Been thinking about how he described the outer layer of hell as consisting of people living equidistant from
each other because they can’t stand anyone else. It was written like 100 years ago but feels like a commentary on
today’s politics.
Claude Code: Behind-the-scenes of the master agent loop
Actually, this is a pretty detailed breakdown of Claude Code. They seem to have decompiled the code without
de-obfuscating it, which leads to some rather silly quotes. But it’s good.
Airia AI Platform | Build, Deploy & Scale Enterprise AI
Seems like the new Apple M19 chip has real matrix multiplication operations. Previous generations had
excellent memory bandwidth, this gives it matching compute (on AI-friendly workloads). So I guess Macs will
stay relevant for a while.
Poland closest to open conflict since World War Two, PM says after Russian drones shot down - live updates - BBC News
Looked this up as a tangent off the last link. Groq (not Grok) designed their ASIC to be fully deterministic
from the ground up, and then built a really cool distributed system around it that assumes fully synchronous
networking (not packet switching like TCP). It’s an absolutely crazy concept.
Levanter — Legible, Scalable, Reproducible Foundation Models with JAX
Absolutely fascinating. I only read the blog, not the paper, but it frames RL as a 2-stage process where
RL is mostly slinging together discrete skills (learned during pre-training).
It’s not an auto-curriculum RL paper AFAICT, it’s just a huge improvement in RL efficiency by focusing only
on the “pivot” tokens.
An MCP server that lets you search code while respecting the structure. I’ve heard some very positive things
as well as “meh” responses on this. I’m sure real usage is a bit nuanced.
Life, Maybe, On Mars, Unless We Change Our Minds | Science | AAAS
A baffled bride has solved the mystery of the awkward-looking stranger who crashed her wedding four years ago.
Michelle Wylie and her husband, John, registered the presence of their unidentifiable guest only as they looked through photographs of their wedding in the days after the happy occasion.
Who was the tall man in a dark suit, distinguished by the look of quiet mortification on his face? Their family and friends could offer no explanation, nor could hotel staff at the Carlton hotel in Prestwick, where the event took place in November 2021. An appeal on
Facebook
likewise yielded no clues.
Eventually, with the mystery still niggling, Wylie asked the popular Scottish content creator Dazza to cast the online net wider – and a sheepish Andrew Hillhouse finally stepped forward.
In his explanatory post on Facebook, Hillhouse admitted that he had been “cutting it fine, as I’m known to do” when he pulled up at the wedding venue with five minutes to spare. Spotting a piper and other guests, he followed them into the hotel – “I remember thinking to myself: ‘Cool, this is obviously the right place’” – unaware that he had the address completely wrong and was supposed to be at a ceremony 2 miles away in Ayr.
Michelle and John enjoy their wedding, unaware of the crasher.
Photograph: Courtesy Michelle Wylie/SWNS
He was initially unperturbed to find himself surrounded by strangers as the ceremony began – at the marriage he was due to attend, the only person he knew was the bride, Michaela, while his partner, Andrew, was part of the wedding party. It was when an entirely different bride came walking down the aisle that he realised: “OMG that’s not Michaela … I was at the wrong wedding!”
Hillhouse said: “You can’t exactly stand up and walk out of a wedding mid-ceremony, so I just had to commit to this act and spent the next 20 minutes awkwardly sitting there trying to be as inconspicuous as my 6ft 2 ass could be.”
At the end of the ceremony, Hillhouse, who is from Troon, was hoping to make a discreet exit, only to be waylaid by the wedding photographer, who insisted he join other guests for a group shot. He can be spotted looming uncomfortably at the very back of the crowd.
His post continued: “Rushed outside, made some phone calls and made my way to the correct wedding, where I was almost as popular as the actual bride and groom, and spent most of the night retelling that story to people.”
For Michelle Wylie, this amiable resolution brings to a close years of speculation.
Hillhouse said the wedding photographer insisted he join other guests for a group shot.
Photograph: Courtesy Michelle Wylie/SWNS
She told BBC Scotland: “It would come into my head and I’d be like: ‘Someone must know who this guy is.’ I said a few times to my husband: ‘Are you sure you don’t know this guy, is he maybe from your work?’ We wondered if he was a mad stalker.”
She is now Facebook friends with Hillhouse and the pair have met in person to cement their coincidental bond.
“I could not stop laughing,” said Wylie. “We can’t believe we’ve found out who he is after almost four years.”
Setsum – order agnostic, additive, subtractive checksum
Setsum is an order agnostic, commutative checksum. It was developed by Robert Escriva at Dropbox’s metadata team. In this short post, I’ll explain why it’s used and the math behind it. Jump to the end if you’d like to see the code.
Introduction
Say you’re building a database replication system. The primary sends logical operations to replicas, which apply them in order: add an apple, add another apple, remove an orange.
After the replica processes these changes (add two apples, remove an orange), how do you verify both nodes ended up in the same state?
One naive (rather horrible) approach is to dump both states and compare them directly. It’s expensive, impractical, and doesn’t scale. Instead, you can maintain checksums that update with each operation. When you’re done, just compare the checksums; if they match, you’re in sync. That’s why distributed databases like Cassandra use Merkle trees for the same purpose.
Setsum is similar but has some nice properties that make it attractive over Merkle trees. It can be computed incrementally; the cost depends only on the change being applied, not on the whole dataset. I also find it attractive because it lets you remove items as well.
Properties
Setsum has some interesting properties:
1. Order doesn’t matter. Adding the same items in any order yields the same result.
2. The only state you need to maintain is 256 bits, and all operations are O(len(msg)) instead of depending on your entire dataset.
The internals
Each Setsum is an array of 8 unsigned 32-bit integers (u32), called “columns”. Each column starts at 0 when you create a new Setsum. Each column has an associated large prime number (close to u32::MAX).
When you add an item:
Compute the SHA3-256 hash of the item (produces 32 bytes)
Split the hash into 8 chunks of 4 bytes each
Interpret each chunk as a little-endian u32
Add each number to its corresponding column
If the sum exceeds the column’s prime, store the remainder (mod prime)
You can also remove an item that was previously added. The magic is in computing the inverse: first, derive the inverse of the item’s hashed value, then add that inverse to the setsum. This effectively cancels out the original, removing the item from the set!
To compute the inverse, we use modular arithmetic: it’s simply the prime minus the value.
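To make the add/remove mechanics concrete, here is a minimal Go sketch of the scheme described above. It is illustrative only: the hashing uses sha3.Sum256 from golang.org/x/crypto/sha3, but the column primes are placeholders just under u32::MAX, not necessarily the ones the real Rust/Go implementations pin, and the type names are made up for this sketch.

```go
// Minimal, illustrative setsum: 8 u32 columns, each reduced modulo a large prime.
// This is a sketch of the idea in the post, not the real implementation; the
// primes below are placeholders just under 2^32.
package main

import (
	"encoding/binary"
	"fmt"

	"golang.org/x/crypto/sha3"
)

var primes = [8]uint64{
	4294967291, 4294967279, 4294967231, 4294967197,
	4294967189, 4294967161, 4294967143, 4294967111,
}

// Setsum's entire state: 8 * 32 = 256 bits.
type Setsum struct {
	cols [8]uint32
}

// digest hashes an item with SHA3-256 and splits the 32-byte digest into
// 8 little-endian u32 chunks, one per column, each reduced mod that column's prime.
func digest(item []byte) [8]uint32 {
	h := sha3.Sum256(item)
	var out [8]uint32
	for i := 0; i < 8; i++ {
		v := binary.LittleEndian.Uint32(h[i*4 : i*4+4])
		out[i] = uint32(uint64(v) % primes[i])
	}
	return out
}

// Add folds an item into the checksum: column-wise addition mod the prime.
func (s *Setsum) Add(item []byte) {
	d := digest(item)
	for i := range s.cols {
		s.cols[i] = uint32((uint64(s.cols[i]) + uint64(d[i])) % primes[i])
	}
}

// Remove cancels a previous Add by adding the additive inverse (prime - value).
func (s *Setsum) Remove(item []byte) {
	d := digest(item)
	for i := range s.cols {
		s.cols[i] = uint32((uint64(s.cols[i]) + primes[i] - uint64(d[i])) % primes[i])
	}
}

func main() {
	var a, b Setsum
	a.Add([]byte("apple"))
	a.Add([]byte("banana"))

	b.Add([]byte("banana"))
	b.Add([]byte("apple"))        // different order...
	fmt.Println(a.cols == b.cols) // ...same checksum: true

	a.Remove([]byte("banana"))
	var c Setsum
	c.Add([]byte("apple"))
	fmt.Println(a.cols == c.cols) // removal cancels the earlier add: true
}
```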
The math behind setsum
Disclaimer: If we’re friends, you already know I’m no math person. If not, hey there, new friend! You can probably skip this if you understand modulo arithmetic, the Chinese remainder theorem, and a bit of probability.
Let’s simplify: instead of 8 columns, let’s use just one. The prime number for this column is 29. Consider adding these items with their hash and inverse values:
Item          Hash   Inverse
apple          15      14
banana         23       6
chikoo          7      22
pomegranate    18      11
watermelon     26       3
Notice that all hash values stay below the prime. If a hash exceeds it, we take the remainder. For example, if `guava` hashes to 33, the final value would be 4 (33 mod 29). Also, hash + inverse always equals the prime.
Let’s add some items:
s = 0
s = 15 (add apple: 15)
s = 9  (add banana: 23)
s = 16 (add chikoo: 7)
Let’s try in some random order:
s = 0
s = 7  (add chikoo: 7)
s = 22 (add apple: 15)
s = 16 (add banana: 23) // see, this ends up the same!
Let’s try removal. Note that for removal we add the inverse values:
s = 0
s = 15 (add apple: 15)
s = 9  (add banana: 23)
s = 27 (add pomegranate: 18)
s = 12 (remove apple: add inverse 14)
s = 18 (remove banana: add inverse 6)
s = 25 (add chikoo: 7)
Compare with adding just the remaining items from scratch:
s = 0
s = 18 (add pomegranate: 18)
s = 25 (add chikoo: 7) // whoa 🤯
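If you want to poke at the arithmetic yourself, here is a tiny, self-contained Go version of the single-column walkthrough (the hash values are the made-up ones from the table above, not real SHA3 digests).

```go
// Toy single-column setsum with prime 29, reproducing the walkthrough above.
package main

import "fmt"

const prime = 29

func add(s, hash int) int    { return (s + hash) % prime }
func remove(s, hash int) int { return (s + (prime - hash)) % prime } // add the inverse

func main() {
	s := 0
	s = add(s, 15)    // apple
	s = add(s, 23)    // banana
	s = add(s, 18)    // pomegranate
	s = remove(s, 15) // remove apple
	s = remove(s, 23) // remove banana
	s = add(s, 7)     // chikoo
	fmt.Println(s)    // 25, same as adding just pomegranate and chikoo
}
```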
I cherry-picked these examples to demonstrate setsum, but there’s a flaw in the above examples. Can you spot it?
Consider this collision:
s = 0
s = 15 (add apple: 15)
s = 22 (add chikoo: 7)
versus:
s = 0
s = 18 (add pomegranate: 18)
s = 22 (add guava: 4)
Both sets of completely different items sum to 22! This happens because we’re only using one column and a very small prime number. But add another column and the collision probability drops dramatically: each column is an independent sum modulo its own prime near 2^32, so a collision has to line up in every column at once. With 8 columns, the probability of collision drops to about 1/2^256.
Setsum also uses SHA3-256 as its hash function, though the hash algorithm is replaceable. SHA3-256 is fast, has fewer collisions, and produces well-distributed hashes, so we can avoid the collision problem I showed above.
Observations
Setsum can tell you if states diverged, but not where. To narrow things down, you can split your data into smaller chunks and compare those. Build this into a hierarchical structure and you’re basically back to something like a Merkle tree.
You can remove items that never existed. This might or might not be a problem depending on your use case. Given that you’re only maintaining 256 bits of state, it’s a reasonable tradeoff.
There’s no history tracking. You can’t tell when or how states diverged, just that they did.
Code
The original Rust implementation is here. I ported it to Go, with all the same tests - setsum.
But... does our public webpage need all this stuff at all? Can’t we just... remove the admin interface completely? No online page editor (we can run that as a separate local application anyway..), no “upload this plugin to install” [5], no account system at all.
Genius! Why hadn’t anyone thought of that before? [6]
So I ran off to SourceForge [7] to register the project. I remember the pure excitement I felt, seeing the email a few days later that it had been approved. [8]
Well then. No excuses left, time to write some PHP! [9]
Showing one page is easy enough.. [10]
Close enough, and simple enough to be invulnerable! [12]
We’re done here, time for mellis.
...okay, you might see where this is going. A while later, I saw something concerning.
A CVE [13]? For my toy project?!
It’s kind of funny, there’s this.. intense sense of dread about the situation (“Oh $DEITY, what did I fuck up?!”). But to child me there was also a surprise that there was someone out there. That someone was paying attention to the thing that I had built. That.. it mattered, in some faint way. [14]
But back to the issue at hand.. What was the actual problem? A visit from Li’l Bobby Tables.
The code assumes that ?page=1 is a number, but.. It’s just text. It could be anything. And the database would have no idea what part was the query, and what was the (untrusted) data. It could be /?page=1 OR 1=1. Which would get spliced into the database query, returning some arbitrary page at random. Or it could modify the database somehow, or return something more sensitive.
I think the big takeaway here is that.. I had decided that it shouldn’t do the dangerous thing. [15] But hubris led me to think that that was it. That I had the one brilliant idea that had Solved™ the security of the whole application.
I hadn’t thought properly about how to constrain it, how to prevent it from accidentally doing the bad thing anyway.
I should have used a prepared statement. Or escaped it. Or made sure that it actually was a number, and nothing else. I should have made sure to run it as an unprivileged database user, with no permission to modify things anyway. [16]
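The original project was PHP (and that code is long gone), but the idea is language-agnostic. Here is a minimal sketch in Go with database/sql: validate the input, then use a placeholder so the driver, rather than string concatenation, carries the untrusted value. The table and column names are made up for illustration, and sqlite3 is just one driver choice to keep the example self-contained.

```go
// Sketch: parameterized query instead of splicing untrusted text into SQL.
// Table/column names are illustrative only.
package main

import (
	"database/sql"
	"fmt"
	"log"
	"strconv"

	_ "github.com/mattn/go-sqlite3" // any database/sql driver works
)

func showPage(db *sql.DB, rawPage string) (string, error) {
	// Vulnerable version (don't do this):
	//   query := "SELECT body FROM pages WHERE id = " + rawPage
	// "1 OR 1=1" would be spliced straight into the query text.

	// Belt: make sure it actually is a number, and nothing else.
	id, err := strconv.Atoi(rawPage)
	if err != nil {
		return "", fmt.Errorf("invalid page id %q", rawPage)
	}

	// Braces: a parameterized statement. The ? placeholder keeps the
	// untrusted value out of the query text entirely.
	var body string
	if err := db.QueryRow("SELECT body FROM pages WHERE id = ?", id).Scan(&body); err != nil {
		return "", err
	}
	return body, nil
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Set up a tiny pages table so the query has something to hit.
	if _, err := db.Exec("CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)"); err != nil {
		log.Fatal(err)
	}
	if _, err := db.Exec("INSERT INTO pages (id, body) VALUES (1, 'hello world')"); err != nil {
		log.Fatal(err)
	}

	body, err := showPage(db, "1")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(body) // hello world

	if _, err := showPage(db, "1 OR 1=1"); err != nil {
		fmt.Println(err) // rejected before it ever reaches the database
	}
}
```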
But I didn’t know that. Didn’t know that I should look for that. Because I was Secure™.
At least, when I hear people today talking about how Secure™ they are.. I can’t help but think back to 12-year-old me. And how Secure™ she “was”.
Because security isn’t one magic trick. It’s a process. And it’s learning how to learn, and what to learn. And we could all use some humble pie on a regular basis.
Footnotes:
Not that we’re really out of it, I suppose. But it definitely feels a lot less dominant these days.
Well, this is from modern WordPress, because that’s what the playground has on offer... But it’s basically the same.
Not that all of the details were around at the time.
All of the actual code samples are going to be vaguely reproduced from my memory; the CVEs are still around, but the code has long since been lost to the sands of time.
But isn’t this completely vulnerable to XSS? Not really under its security model, since the editors are assumed to be trusted anyway.
An extremely fast PHP linter, formatter, and static analyzer, written in Rust.
Mago is a comprehensive toolchain for PHP that helps developers write better code. Inspired by the Rust ecosystem, Mago brings speed, reliability, and an exceptional developer experience to PHP projects of all sizes.
Mago is a community-driven project, and we welcome contributions! Whether you're reporting bugs, suggesting features, writing documentation, or submitting code, your help is valued.
OXC: A major inspiration for building a high-performance toolchain in Rust.
Hakana: For its deep static analysis capabilities.
Acknowledgements: We deeply respect the foundational work of tools like PHP-CS-Fixer, Psalm, PHPStan, and PHP_CodeSniffer. While Mago aims to offer a unified and faster alternative, these tools paved the way for modern PHP development.
License
Mago is dual-licensed under your choice of the following:
The number of people in Japan aged 100 or older has risen to a record high of nearly 100,000, its government has announced.
Setting a new record for the 55th year in a row, the number of centenarians in Japan was 99,763 as of September, the health ministry said on Friday. Of that total, women accounted for an overwhelming 88%.
Japan has the world's longest life expectancy, and is known for often being home to the world's oldest living person - though some studies contest the actual number of centenarians worldwide.
It is also one of the fastest-ageing societies: its residents often have a healthier diet, but the country has a low birth rate.
The oldest person in Japan is 114-year-old Shigeko Kagawa, a woman from Yamatokoriyama, a suburb of the city Nara. Meanwhile, the oldest man is Kiyotaka Mizuno, 111, from the coastal city of Iwata.
Health minister Takamaro Fukuoka congratulated the 87,784 female and 11,979 male centenarians on their longevity and expressed his "gratitude for their many years of contributions to the development of society".
The figures were released ahead of Japan's Elderly Day on 15 September, a national holiday where new centenarians receive a congratulatory letter and silver cup from the prime minister. This year, 52,310 individuals were eligible, the health ministry said.
In the 1960s, Japan's population had the lowest proportion of people aged over 100 of any G7 country - but that has changed remarkably in the decades since.
When its government began the centenarian survey in 1963, there were 153 people aged 100 or over.
That figure rose to 1,000 in 1981 and stood at 10,000 by 1998.
The higher life expectancy is mainly attributed to fewer deaths from heart disease and common forms of cancer, in particular breast and prostate cancer.
Japan has low rates of obesity, a major contributing factor to both diseases, thanks to diets low in red meat and high in fish and vegetables.
The obesity rate is particularly low for women, which could go some way to explaining why Japanese women have a much higher life expectancy than their male counterparts.
As increased quantities of sugar and salt crept into diets in the rest of the world, Japan went in the other direction - with public health messaging successfully convincing people to reduce their salt consumption.
But it's not just diet. Japanese people tend to stay active into later life, walking and using public transport more than elderly people in the US and Europe.
Radio Taiso, a daily group exercise, has been a part of Japanese culture since 1928, established to encourage a sense of community as well as public health. The three-minute routine is broadcast on television and practised in small community groups across the country.
However, several studies have cast doubt on the validity of global centenarian numbers, suggesting data errors, unreliable public records and missing birth certificates may account for elevated figures.
In Japan's case, past miscounting was attributed to patchy record-keeping and suspicions that some families may have tried to hide the deaths of elderly relatives in order to claim their pensions.
Hunterbrook Media’s investment affiliate, Hunterbrook Capital, does not have any positions related to this article at the time of publication. Positions may change at any time. Hunterbrook Media is working with litigators on potential lawsuits based on our investigation. If you are a victim, we invite you to share your story by emailing ideas@hntrbrk.com — where we source information for ongoing reporting.
You’re in what you thought would be your dream house — until it wasn’t.
The living room ceiling has been ripped out after sewage water backed up and flooded the upstairs bathroom. With the drywall gone, you can spot loose nails and concerning gaps between the floor joists. Rainwater seeps through the cracks around the front door.
Insects crawl through the window frames — even though the windows were reinstalled because they weren’t installed properly in the first place. And most of your bathrooms are unusable, awaiting repairs the builder promised more than a year ago.
It feels like a nightmare — but it’s reality, according to Danielle Antonucci, who invited a Hunterbrook Media reporter to the home she and her husband bought just four years ago in Sarasota, Florida, built by the nation’s largest homebuilder, D.R. Horton ($DHI). In an email provided to Hunterbrook, Antonucci desperately pleaded with D.R. Horton to address the numerous defects rendering their home nearly uninhabitable: “I keep getting the response that this matter has been escalated to the Sarasota office,” she wrote. “It has been 21 months!”
A photo of Antonucci’s living room. The ceiling has been ripped out since over a year ago, she said, after sewage water flooded from the upstairs bathroom. Source: Hunterbrook Media
Photos of numerous problems in Antonucci’s home, including poorly fastened floor joists, incomplete bathroom and bedroom repairs, and cracks along the main doorframe. Sources: Danielle Antonucci, Hunterbrook Media
“Physically, mentally, emotionally, financially, it’s been the biggest nightmare of my life,” Antonucci said, adding, “This is my full-time job now, dealing with this home.”
Antonucci’s makeshift office where she said she deals with D.R. Horton over the defects in her home pretty much full-time. Piles of records — her correspondence with various subcontractors, building manuals, architectural plans, county inspection records — sit neatly on tables. Source: Hunterbrook Media. Photos taken on April 25.
Antonucci and her family. Source: Danielle Antonucci
More than 60 homeowners across 16 states who purchased their dream home from one of the nation’s two largest residential homebuilders, D.R. Horton and Lennar ($LEN), shared similar accounts with Hunterbrook. They described extensive construction defects stemming from substandard workmanship, inferior materials, and blatant building code violations that sometimes make their homes unsafe and unlivable.
These homeowners also expressed profound frustration with the builders’ complex tactics to evade responsibility for these defects, leaving families out in the cold — sometimes literally.
Take Leslie Montgomery, who said her family has had to live in hotels since county officials condemned her house after a mold infestation so severe that her previously healthy teenage son was unable to attend school.
Lennar offered to clean the ducts, according to Montgomery, downplaying the problem even after biochemical inspectors the company hired declared the home a total loss. The inspectors tried to reason with Lennar, saying there was “a sick kid involved,” according to Montgomery, but Lennar didn’t budge.
Their testimonies echo those of thousands of other homeowners who have desperately turned to social media platforms, official government channels, consumer review sites, and local news to demand answers on the construction defects that the companies refuse to acknowledge or address. Common complaints range from water intrusion, truss and joist deficiencies, ventilation problems, and missing or inadequate fireproofing or insulation, to foundation cracks, improper grading, and plumbing issues, many in violation of building codes.
A screen capture of Better Business Bureau consumer ratings and complaint summaries regarding Lennar and D.R. Horton, captured on May 1. Source: Better Business Bureau
Both D.R. Horton and Lennar promise that their mission to build affordable homes will not come at the cost of quality — even as they have told investors they would cut costs to offset diminishing margins amid a tightening housing market.
“You have to start value-engineering every component of the home, which means making compromises, not in quality, but in the way that you actually configure the homes,” Lennar CEO Stuart Miller said in an interview with Bloomberg Television last year.
D.R. Horton similarly promised its investors it would find ways to cut costs, like “replacing certain high quality fixtures and finishes with less expensive yet still high-quality fixtures and finishes.”
But many avoidable defects are caused by business practices that focus on building and selling quickly, with minimal concern for repeat business or quality control, according to Robert Knowles, president and founder of the National Association of Homeowners and a licensed professional engineer who said he has inspected thousands of new builds.
“There is no bonus for building the house to code, for quality,” Knowles said, to his knowledge. “There’s only bonuses for speed … and volume.” Knowles estimated 100% of all new builds probably have multiple code violations.
Knowles’ comments echo those of multiple other building experts and former employees interviewed by Hunterbrook, who accused the builders of cutting corners and neglecting safety measures. “I don’t know how they passed inspections, because there were so many violations,” one former D.R. Horton superintendent, who said they had left “because I do not want my name attached to that kind of work,” told Hunterbrook.
“They always used the cheapest subcontractor, and focused on speed rather than quality.”
Knowles estimated repair costs to be about $5,000 to $20,000 for these defects in a typical new home by these builders — assuming there are no major issues like siding or roofing that need replacing.
That’s far more than the $2,348 on average per home that D.R. Horton set aside last year in expected warranty costs, or Lennar’s set-aside of about $3,602 per home, according to their SEC disclosures.
A chart comparing the estimated range of repair costs in new builds to the amount D.R. Horton and Lennar set aside in expected warranty expenses last year. Source: SEC Edgar, Hunterbrook
Hunterbrook’s investigation suggests a step-by-step corporate playbook designed to push the cost of the defects to buyers by exploiting the vast power imbalance between the billion-dollar companies and middle-class buyers. D.R. Horton and Lennar do this through one-sided contracts that lock in buyers and insulate the companies from liability for defects in the homes they sell, while minimizing the buyer’s ability to seek legal recourse.
The playbook starts by rushing shoppers — lured by glossy brochures, upgraded model homes, unbeatable loan offers, and assurances of expansive warranty coverage — into signing away their rights in contracts that make it nearly impossible for buyers to back out, even if major defects are found, according to homeowners.
After closing, many homeowners who uncover defects are confronted with a byzantine warranty process seemingly designed to outlast the homeowners’ willpower — or the warranty clock. Homeowners called the warranty a “sham” and described having to “hound” the company, “fighting tooth and nail” to try and get their problems addressed. One compared the experience to “performing a root canal on yourself.”
Even if the buyers succeed in this process, the companies often make cheap band-aid fixes that don’t last, forcing homeowners to repeat the cycle all over again. As one Lennar homeowner put it, “If they do, quote, attempt to repair something, you’re left with at least three to five new issues. … It’s very depressing. It becomes your full-time job.”
Many end up paying for the repairs themselves. Others, worried about property value, opt not to pry deeply into the problems and keep quiet.
Still others face problems so severe and expensive that they can’t pay for repairs out of pocket, leaving them stuck in a nightmare home that they can’t even sell.
“I’ve lived in the house almost four years. I’ve had no peace,” Kim Cardillo, a realtor who purchased a D.R. Horton home in a 55+ community in Port St. Lucie, Florida, in 2021 told Hunterbrook. “My credit’s wrecked because of this whole situation. So it’s like, where do I go?” She added, “I’m ready to just walk away, honestly. A couple weeks ago, in tears because it’s so stressful, I was just like, you know what, I’m just going to foreclose on the house.”
Many turn to legal action as a last resort, only to find they’ve waived their right to go to court by signing the purchase agreement. Instead, they are forced into a private arbitration system that critics say is rigged in favor of the builders.
“This concept of forced arbitration, it abuses the very sense of justice,” said Martha Perez-Pedemonti, a civil justice and consumer rights attorney with advocacy organization Public Citizen, who has spent years fighting against forced arbitration.
“You’re talking about someone’s housing, someone’s survival,” she told Hunterbrook, adding, “I think there’s very little else that’s more important to anyone.”
For the builders, the system seems to be working.
The two companies have remained hugely profitable even as their stock prices have tumbled more than 30% in the last year amid cooling demand and rising costs. In 2024, each company netted around $8 billion in gross profit from home sales — or $88,661 gross profit per home sold for D.R. Horton, and about $95,609 for Lennar.
The figures are based on Hunterbrook’s calculation of the estimated cost per home, subtracted from the average selling price per home, based on D.R. Horton’s and Lennar’s annual statements to the SEC.
Lennar and D.R. Horton have sold more homes each year than in the previous year since 2014, and while their gross profit has declined in the last two years, it remains higher than at any point before the 2022 peak. Sources: SEC EDGAR database, Hunterbrook Media
In response to Hunterbrook’s request for comment, a D.R. Horton spokesperson said in an email, “D.R. Horton is proud to consistently deliver top-quality new homes across the United States, enabling more than 1,100,000 individuals and families to achieve the dream of homeownership since our founding in 1978.” They said the company provides “a robust new home warranty to our homebuyers” and the staff is “fully committed to customer satisfaction and respond to any warranty needs and concerns of our homeowners.”
Lennar did not respond to our request for comment.
Meanwhile, the lives of many homeowners across the country have been ruined.
Desperate, some of these victims have become citizen journalists themselves, filing public records requests; clandestinely recording conversations with the companies; becoming amateur construction experts; and poring over manufacturer’s inspection manuals and local building codes. They’re also speaking at city council meetings to advocate for stronger policies against the builders, engaging audiences on TikTok, Instagram, and even Discord, and picketing and putting up flyers along roadways to warn other shoppers.
From top left: A photo from a D.R. Horton homeowner posted on the Facebook group “Shoddy Construction of D.R. Horton!!”, which has 33,200 members. A photo of Lennar homebuyers protesting in 2022 uploaded to the Facebook group “Lennar Homeowners – Complaints and Issues,” which has 38,600 members. A photo of a sign erected by a protester and uploaded to the “Shoddy Construction of D.R. Horton!!” Facebook group. A viral TikTok video by Ashley Frazier about her mold-infested Lennar home. A Lennar homeowner speaking to the Parkland, Florida, City Commission meeting in January 2021 about the problems he’s had with the builder. A YouTube video alleging major defects in D.R. Horton homes.
“It’s really hard to find the words that describe the nightmare of a situation that this has been,” said Antonucci.
“This is the house from hell.”
Step 1: Rush to Close
As homeowners uncover defect after defect in their newly purchased homes, many make a second disturbing discovery: Long before closing day, they had already been ensnared in a system designed to prevent them from detecting these problems until it was too late to walk away.
Hunterbrook spoke with D.R. Horton and Lennar buyers who recalled being pressured to sign a deal and even feeling trapped in it by the threat of losing a substantial deposit.
Lennar sales reps seemed particularly prone to rushing people, according to Hunterbrook’s interviews.
Chris Holdridge closed on a Lennar home in 2023 in Riverview, Florida. He said he was told the house he wanted was the last one, and in order to get the discount offered on closing costs, he had to “sign right here, right now.”
“And I work in real estate. I don’t generally work with that type of pressure.”
Another Lennar homeowner, who wished to be anonymous out of fear of retribution from Lennar, told Hunterbrook a similar story. “I had to lock in. And they really pushed it. … They really make it out like they’re giving you an opportunity but you have to do it really fast.”
One possible explanation for these high-pressure tactics may be Lennar’s unique “production-first” business strategy, which aims to maintain production volume regardless of market conditions, relying on quick sales to clear all that inventory. Hunterbrook interviewed more than 20 Lennar homeowners, and most said they’d felt rushed by aggressive and sometimes misleading sales tactics.
Trap Buyers in Draconian Contract Terms
After signing the purchase agreement, Lennar and D.R. Horton buyers also said they were unable to back out without losing their deposit even after finding major defects.
Nesha Gee, who signed a purchase agreement in 2023 with Lennar on a home in Athens, Alabama, told Hunterbrook she backed out of the deal after a third-party inspector found significant defects in her house — but not without losing her $7,500 deposit.
Julie Biondolillo and her husband were also told they could not back out of their contract on a D.R. Horton home without losing their $25,000 down payment, after they found mold in the floor and walls. They only got the money back after they agreed to transfer the money toward another D.R. Horton home.
According to multiple Lennar and D.R. Horton contracts Hunterbrook reviewed, if the buyer delays or backs out of the deal for any reason — even for known construction defects like foundation cracks, grading problems, or “biological contaminants” like mold — they trigger an automatic default, giving the seller the right to claim the deposit money as liquidated damages in amounts up to 15% of the value of the home. Lennar’s contracts even state that the builder’s failure to obtain a certificate of occupancy in time cannot be grounds for delaying closing.
“You go in with this dream, the American dream, to acquire a house,” Gee, a disabled Air Force veteran “bamboozled and fined” by Lennar, told Hunterbrook. “Either you receive the crappy home or you lose a substantial amount of money like I did.”
Worse, the builders’ take-it-or-leave-it contracts often leave buyers with fewer rights than if they had just closed the deal with a handshake. In many states, the builders require buyers to waive pre-existing legal rights such as implied warranties in favor of the company’s “limited warranty.”
(An implied warranty is a legal principle, established by courts through judicial decisions, under which courts infer, even absent express language in a contract, that a newly constructed home will be built in a workmanlike manner and be fit for habitation. Most U.S. states recognize common-law implied warranties as inherent in residential construction contracts unless explicitly disclaimed in a written contract; judicial enforcement varies by jurisdiction. Laws in some states, such as Maryland, Connecticut, Massachusetts, and New Jersey, require homebuilders to provide an implied warranty of habitability and construction quality for new homes, which cannot be waived by the buyer except in very limited circumstances.)
And the builders’ limited warranties often explicitly lack basic guarantees otherwise available to purchasers under the law — like that the home will be “habitable.” Nor do the builders’ warranties usually cover issues related to drainage, grading, erosion, and soil, or the presence of biological contaminants like mold — even though these are common defects caused by workmanship deficiencies that sometimes make homes uninhabitable.
An excerpt from the 2024 Lennar Warranty stating the buyer must waive the rights to any implied warranty of habitability. Source: Lisa Brown
An excerpt from a D.R. Horton Warranty available on its website stating the buyer must waive the rights to any implied warranty of habitability. Source: D.R. Horton
For many buyers, spotting all of these legal traps before it’s too late may be difficult — especially because homebuilders sometimes discourage buyers from hiring realtors familiar with the system.
Lennar, for instance, only compensates buyers’ agents in certain communities, and even then, only those who were present at the buyer’s first visit to the model home, according to legal disclaimers.
“The way they write their purchase agreement is — and I found this out after the fact… — Lennar can do anything they want,” Michael Stark, who said he didn’t have realtor representation when he purchased a Lennar home in Fort Myers, Florida, told Hunterbrook. “If we had a realtor, the realtor would have been told that there was no commission,” Stark said.
“I think if an attorney had gone through the documentation they may have said to us, yeah, this isn’t a good thing for you,” Stark added.
“It’s great for them, but it’s not good for you.”
Discourage Third-Party Inspections, Brush off Building Codes, Rush Walk-Throughs
Multiple homeowners Hunterbrook spoke with also said the builders discouraged third-party inspections, with lines like “there isn’t enough time.”
When a Hunterbrook reporter visited Lennar and D.R. Horton model homes, sales representatives called third-party inspections a “waste of money” because the new construction homes had to pass local government inspection at every step to ensure building code compliance.
But code inspection isn’t a guarantee against violations, construction experts Hunterbrook spoke with suggested. According to Knowles, county or municipal inspectors often have an overwhelming number of houses to inspect, sometimes up to 80 a day, and may prioritize quick “pass” inspections over thorough checks, as flagging failures requires significantly more time and paperwork. “They miss dozens and dozens of code violations on all the homes I look at,” Knowles told Hunterbrook.
One Arizona-based third-party inspector who asked to be anonymous because he sometimes works for national builders said he’d seen county inspectors who “didn’t even get out of the car” before signing off on a job.
A legal loophole in states like Florida and Texas — states where D.R. Horton and Lennar sold most of their homes in recent years — even allows builders to hire their own inspection companies to sign off on building code compliance.
(Under a Florida statute called the Private Provider law, builders can hire private firms like GFA to conduct inspections required for construction approvals in lieu of municipal government inspections. Proponents of the Private Provider law praise it as allowing faster building progress and saving time and money for municipalities.)
Biondolillo discovered through public records requests that a private inspection company called GFA International had signed off on the soil compaction test for her new D.R. Horton house in Ocean Breeze, Florida. Soil compaction is an important part of site preparation, where the ground is mechanically compacted into a stable mass that can safely hold up a house.
D.R. Horton’s own architectural plans, which were on file with the county, required a 98% compaction rate. But GFA attested to a 95% soil compaction rate, suggesting the land under Biondolillo’s house was too loose by the builder’s own design standards — an apparent code violation.
(The Florida Building Code (§107.4) mandates that construction adhere strictly to the approved plans and specifications. Any deviation requires a formal revision approved by the building official. This would presumably cover a failure to meet the specified compaction level.)
Soil compaction test records on Julie Biondolillo’s parcel (left) and architectural design drawings (right) D.R. Horton applied to the Town of Ocean Breeze for approval. Source: Retrieved by Biondolillo through Florida public records requests
In fact, of the 143 homes in her subdivision, only 14 actually passed all the tests at the required 98% soil compaction rate, according to the documents Biondolillo obtained and shared with Hunterbrook. And this soil compaction issue is not trivial. It can lead to serious foundation damage, symptoms of which include cracks in walls and sticky doors — all of which Biondolillo has experienced in her home.
Photos of Julie Biondolillo’s D.R. Horton home showing cracks in the exterior and the flooring. Source: Julie Biondolillo
The Ocean Breeze permit office told Hunterbrook that the “town does not independently verify” the accuracy of the soil compaction certifications submitted by private firms because “the certificates are signed and sealed by a professional engineer who attests that it meets the requirement of the approved plan.”
Homeowners also said they were prohibited from inspecting the attic, roof, and crawl spaces during pre-closing inspections, ostensibly for safety reasons. D.R. Horton’s purchase agreement includes burdensome requirements for inspection access, like requiring a week’s notice and proof of over $1,000,000 in insurance by the inspector. Lennar’s agreement limits buyers’ access to the home prior to closing, allowing access only when accompanied by a Lennar representative and only at times designated by the seller.
While Hunterbrook visited a Lennar sales office, a new homeowner walked in and asked to see the home for which he’d just signed a purchase agreement. But a sales representative said no, not until the closing. “I put down $10,000 and I can’t see my house?” the homeowner demanded angrily and stormed off.
Homeowners also frequently reported the builders dismissing or downplaying defects; refusing to write them down on the punch list during the final walk-through; and rushing them to close with the promise that problems would be fixed after the move-in — only to find they weren’t.
(A punch list identifies tasks, from touching up surface finishes to making repairs, the builder agrees to carry out for the project to be considered completed, at which point the final payment may be made.)
This is particularly troubling in the event that a company explicitly disclaims any liability for defects not identified on a punch list during the pre-closing walk-through. A sample D.R. Horton purchase agreement obtained by Hunterbrook states that “under no circumstances shall D.R. Horton be required to repair or replace items not on the punch list.”
An excerpt from a sample D.R. Horton contract obtained by Hunterbrook during a visit to a D.R. Horton sales office in Maryland.
One Lennar homeowner said, after they pointed out a brown smudge on the wall and a weird smell during a walk-through, Lennar painted over the smudge, brushing it off as “just a spot where they missed paint.” Within days of moving in, however, they saw the brown stain had reappeared and spread. The homeowner pushed Lennar to investigate, and after denying there was an issue, Lennar opened up the wall and found the main water line was leaking.
Legally, there may be little incentive or obligation for the builders to address defects identified during a third-party inspection or walk-through. A Lennar salesperson at a Florida development told a Hunterbrook reporter asking about a home that Lennar is under no obligation to agree with the findings of a third-party inspector.
Their statement is reflected in a Lennar contract, which says it will be obligated to fix problems “if any items noted are actually defective … in seller’s opinion.” A D.R. Horton purchase agreement explicitly states that “no funds may be escrowed” to guarantee completion of those tasks, meaning buyers have no financial leverage if the builders refuse to address those items.
And the builders’ tactics for evading necessary fixes can get creative. A D.R. Horton homeowner in Florida described a superintendent telling him that he didn’t need to write down problems identified during a final walk-through because the super could “grab his toolbox from his car” to fix the issues while the homeowner was signing the closing papers. The homeowner, who wished to be anonymous to avoid jeopardizing his chances of resolving his dispute with the builder, later found not only that the problems hadn’t been fixed, but that some of his neighbors had been deceived by the same toolbox-in-my-car story.
Step 2: Filibuster Warranty Requests
The day Madelyn Awalt moved into her brand-new D.R. Horton home in Princeton, Texas, the main circuit breaker in the house tripped as soon as the movers plugged in her new refrigerator, she told Hunterbrook. D.R. Horton sent out a technician who reset the breaker. He said new houses sometimes had “some loose wiring,” Awalt recalled. He assured her it should be just fine.
But the breaker kept tripping, and each time she reported the problem, the builder just told her to reset it. Finally, Awalt said, she hired a private electrician who discovered the breaker panel, made by Schneider Electric, had been recalled due to “thermal burn and fire hazard” in 2022 — well within the first-year warranty period. But D.R. Horton denied the claim, telling her she should have reported the problem before the warranty expired.
“I was like, this is completely D.R. Horton’s responsibility,” Awalt recalled. “And he’s like, you know, we’re America’s builder … We closed on 65,000 homes last year. And I said, all that means is that you got 65,000 sufferers like me.”
D.R. Horton’s standard warranty states that the homes “shall be free of defects for a period of one year, from the date of closing.” It offers a one-year warranty on workmanship, as well as a two-year warranty on mechanical systems, and a 10-year warranty against structural issues. Lennar promises buyers “peace of mind in your new home” with a similar “1-2-10” year warranty. Multiple interviewees told Hunterbrook that D.R. Horton and Lennar sales staff touted the warranty as the buyer’s safety net against defects, even using it to discourage buyers from doing a pre-closing third-party inspection.
In reality, as Awalt found out, the builders appear to do everything possible to avoid meaningful repairs, including by systematically denying problems, deflecting blame to homeowners and others, and delaying service until the warranty clock runs out. Homeowners reported submitting warranty requests online and never hearing back, and having to make repeated calls to schedule a repair — only for those efforts to end in a no-show.
“You submit a request online, they say someone will call you in three to five days, but no one ever does,” Sandy Nguyen, a D.R. Horton homeowner in Mobile, Alabama, told Hunterbrook.
Other homeowners, both those Hunterbrook interviewed and those posting across online platforms, said the builders would constantly “gaslight” homeowners, claiming the problems were within “tolerance” or “standards.”
Such tactics became so well-known that residents of one Lennar community coined the term “being Lennared,” according to Nathaniel Klitsberg, a homeowner in Parkland, Florida, speaking to city commissioners at a 2021 meeting.
Florida homeowner Matthew French said D.R. Horton kept denying the seriousness of the problems in his home, calling some of the issues an “illusion,” even after he submitted a report from a third-party engineering company confirming severe structural defects and indicating the home should not have received a certificate of occupancy.
An excerpt from an engineering inspection report French obtained that confirms his home is “structurally deficient and not in a completed condition warranting the issuance of a Certificate of Occupancy,” in likely violation of Florida building codes.
The engineering company quoted him $117,000 in repair costs. And to top it all off, on learning the extent of the defects through French’s insurance claim, the county appraiser apparently devalued his home by $162,942.
A screenshot of a cost estimate for repairs quoted by an engineering company French hired, on file with the Hillsborough County Property Appraiser’s office and obtained by Hunterbrook via a public records request. Source: Hillsborough County Property Appraiser’s office, Hunterbrook Media
Lennar homeowner Ashley Frazier discovered a severe mold infestation in her new home, which was so full of moisture that the ceilings and walls were literally dripping water. But she said Lennar told her the mold levels were “not elevated” and offered to repair one square foot of drywall and a few base cabinets. She said a repair cost analysis arranged by her lawyers came out to $467,200.48.
A screenshot Frazier posted of a mold inspection conducted by a third-party inspector she’d hired, which shows elevated mold levels in almost every corner of her home (top) vs. Lennar’s inspection results that show the levels were not elevated (middle). The inspector deemed the home “uninhabitable” citing CDC guidelines (bottom). Source: Instagram
“You know, I got a letter in the mail not too long ago about a recall on my car that’s 10 years old. They want me to bring it in so they can change out the part,” Antonucci told Hunterbrook.
“Why is that better than my home warranty?”
Even when they finally got a repair crew to show up, homeowners said, it often didn’t help.
They described some crews as comically unprepared and unqualified to do the work. Some crews didn’t even know what they were there to fix or arrived without the replacement item or necessary tools. And the repair was often just a cheap bandaid job that either failed to fix the underlying issue or made it worse.
One issue may be that the repairs often fall to the subcontractors who performed the actual construction. The subcontractors provide warranties to the builder and “are expected to respond to us and the homeowner in a timely manner,” according to D.R. Horton’s most recent SEC Form 10-K. Lennar’s 10-K similarly says that “we are primarily responsible to the homebuyers for the correction of any deficiencies,” while pointing out that subcontractors are contractually required to “repair or replace any deficient items related to their trade.”
But these subcontractors may be less than eager to come back at their own expense to fix work the builder has already paid them for. “‘I’m tired of working for free,’” one Lennar homeowner recalled a repair crew who came to fix the issues at her Lennar house as saying. “I went through that with four different subcontractors within the first month.” She added, “So, they have no incentive when they send people out to inspect a problem.”
Moreover, despite Lennar’s claims of “primary” responsibility, that hasn’t been the experience for some buyers. The same Lennar homeowner, for example, recalled a Lennar warranty representative saying “it’s up to the subcontractor to hold up their end of the warranty.”
The warranty policies Hunterbrook reviewed also explicitly disclaim any standard for repair work, however shoddy or inadequate. Lennar’s 2024 Homebuyers’ Warranty Guide, for example, states the company has the “sole right to determine the repairs or replacements necessary” based on “Workmanship Standards” it defines. It also explicitly states that any repairs it performs cannot extend the warranty’s original expiration date.
Frustrated homeowners described “begging” or having to “fight tooth and nail” to get the company to address their problems. Other approaches include posting on Facebook; filing a complaint with the Better Business Bureau, the county, or the state attorney general’s office; taking the story to a local news station; or even threatening to sue.
“I honestly didn’t want this. I don’t want to be on the news. I don’t want to be in a lawsuit,” Frazier, whose TikTok and Instagram videos about her Lennar home have reached as many as 2 million viewers, told Hunterbrook. She has also appeared in multiple local news reports.
Left: Frazier’s TikTok video chronicling the mold infestation in her new Lennar home. Right: Frazier on local Houston news, KHOU 11, March 4.
“I’m in school trying to finish a dual doctorate and now living with my parents again.”
Ashley Frazier, Lennar Homeowner
“I hounded them. I hounded them so much. In fact, they called and begged and pleaded to my husband to have me stop bitching about them on Facebook,” Bridget Smith, another Lennar homeowner in Aurora, Colorado, told Hunterbrook. She said her tactics helped get the builder to cover most of the issues covered by the warranty — except the estimated $42,500 in repair costs for water damage caused by a construction defect she discovered after the warranty period.
Others haven’t been quite as lucky. Multiple homeowners said that when they tried to escalate their cases after receiving inadequate responses from the warranty departments, they were redirected to company attorneys or legal departments — an action they said they believed was intended to intimidate them.
Some claimed the builders tried to rein in employees who went out of their way to help homeowners.
Steve Schoelman, who purchased a D.R. Horton home near San Antonio, Texas, in 2022, described how the superintendent in charge of his home was initially responsive after his roof was badly damaged by recent winds — even calling personally to apologize. But soon after, Schoelman said, the superintendent confessed he’d been told “basically not to say anything else,” that his “boss would handle this.” After that, Schoelman said, his calls and email went unanswered.
Another D.R. Horton homeowner in Lakewood Ranch, Florida, who has been dealing with defective roof shingles as well since moving in last year and chose to remain anonymous, said the company has likely spent more time and energy denying his persistent requests for warranty coverage than if they had just fixed the problem. He surmised the company might not want to set a precedent for making repairs: After going door to door to talk with his neighbors, he said he estimated nine out of ten of his neighbors had complaints about D.R. Horton.
Montgomery, who hadn’t been able to live in her mold-infested home since city officials condemned it last year, eventually sold it at a loss. She told Hunterbrook she had tried to appeal to Lennar’s ethical standards. “I want you guys to do the right thing. I want you to treat my house as if it’s your house. I want you to treat this as if your kids live here. I want you to make it safe and livable. I want you to do the right thing.”
“I’ve said that repeatedly to them, but they don’t treat you that way.”
Step 3: Keep Cases Out of Court
Christie Volkmer and her family have been drinking bottled water and showering in a makeshift camper shower for years, since watching black slime coming out of the kitchen sink, showers, and toilet in the D.R. Horton home they had bought new in Kauai, Hawaii, in 2018. Volkmer said about half of her neighbors in the 144-home community of Ho’oluana have the same problem.
Photos of black slime coming out of the showerhead and toilet in Volkmer’s home. Source: Christie Volkmer
Volkmer and other families have opted to take legal action against D.R. Horton, Volkmer said. She said the company denied responsibility, even after being presented with extensive tests run by the county showing the problem wasn’t the water source — D.R. Horton’s initial defense. Volkmer said she’d taken out a loan to pay for “devastating” legal and other fees, although “the financial burden pales in comparison to the emotional toll it has taken on us.”
A letter from the Kauai county mayor stating that multiple investigations have ruled out a possible contamination of the county’s water system and suggesting that the problems with water quality experienced by D.R. Horton residents in Ho’oluana may be confined to the subdivision. Source: Christie Volkmer
Instead of filing their complaints in a public court, the Volkmers are arbitrating privately with D.R. Horton. That’s not unusual. The D.R. Horton and Lennar contracts Hunterbrook reviewed include a mandatory arbitration clause that requires buyers to waive their right to go to court and instead requires that any disputes be resolved through a private arbitration system that consumer advocates say systematically favors the builder.
(Arbitration has its roots in the Federal Arbitration Act of 1925, which was originally intended to help resolve disputes between commercial entities of equal bargaining power; but over time, corporations across a wide range of industries — including housing, credit cards, employment, and health care — began embedding arbitration clauses into consumer contracts.)
An excerpt from Michael Stark’s purchase agreement with Lennar. Source: Michael Stark
These agreements also require the buyers to waive their right to a class action lawsuit — “realistically the only tool citizens have to fight illegal or deceitful business practices,” as The New York Times put it.
An excerpt from Michael Stark’s purchase agreement with Lennar. Source: Michael Stark
Lobbying group the National Association of Home Builders claims that “in many cases ADR is often the most rapid, fair and cost-effective means to resolving disputes—for both the builder and the buyer,” referring to alternative dispute resolution, or arbitration.
But consumer advocates argue that arbitration firms have an incentive to deliver outcomes favorable to their most important repeat customers — the builders.
They “rely on happy customer return service,” Perez-Pedemonti said. “There’s no way they can be impartial because of the way their business model is made.”
And as Stanford Business put it, in arbitration, “companies have a big information advantage in fishing for arbitrators who are likely to rule in their favor.”
In an email response to Hunterbrook, a spokesperson for the American Arbitration Association — which both D.R. Horton and Lennar use as an arbitration service — said “fewer than one-third proceeded to an award. In those cases, when the arbitrator specified a prevailing party, the homeowner prevailed more frequently than the homebuilder.” It’s unclear if the two-thirds of cases that did not proceed to an award were dismissed or settled.
But last month, a group of Arizona residents filed a class action lawsuit against the AAA for systematically favoring corporations, alleging consumers lose 76% of the time in arbitrations they initiate.
(The plaintiffs cited AAA’s quarterly disclosures, once readily found on its website but no longer available as of the date of the filing of the complaint.)
A 2017 fact sheet by the Economic Policy Institute suggested a similar pattern. It found that “consumers obtain relief regarding their claims in only 9 percent of disputes.” Other studies paint an even bleaker picture, with an American Association for Justice report claiming consumers are more likely to get struck by lightning than win in a forced arbitration.
The awards from arbitration also tend to be smaller than in courts. The Institute’s 2015 report showed plaintiffs’ overall economic outcomes in state court were on average 13.9 times better than in mandatory arbitration.
A spokesperson told Hunterbrook that the AAA is “firmly committed to upholding the highest ethical standards across all facets of alternative dispute resolution” and that the arbitrators are “bound by the Code of Ethics” in commercial disputes. AAA’s roster of arbitrators “includes experienced, independent professionals with deep subject matter expertise,” they said, including construction.
Moreover, compared to public court cases, arbitration typically offers very limited rights to discovery and appeal and is conducted confidentially, with no public record of proceedings or outcomes. This secrecy benefits builders by preventing legal precedent from accumulating and by shielding systemic issues — like construction defects or warranty evasions — from public scrutiny or class-wide accountability.
“Lawyers are increasingly less open to taking these cases,” said Perez-Pedemonti. “The minute you see an arbitration clause, you’re like, ‘Oh my gosh, okay, well, this is going to be a dead end.’”
“Forced arbitration is unfair and un-American,” said U.S. Sen. Richard Blumenthal, who co-sponsored the Forced Arbitration Repeal Act of 2023, which would prohibit forced arbitration in consumer contracts. “One of the fundamental principles of our American democracy is that everyone gets their day in court. Forced arbitration deprives Americans of that basic right.”
That bill has been killed each time it was introduced in Congress since 2017.
But some homeowners may be seeing a ray of light, at least in some states. In a landmark 2016 case, the South Carolina Supreme Court ruled that D.R. Horton’s arbitration clause was “unconscionable” and “unenforceable,” pointing out that most homebuyers were in a significantly weaker bargaining position and that the clause was a take-it-or-leave-it agreement drafted by one party, leaving the other without a meaningful choice. In 2022, the same court reached the same conclusion with respect to Lennar when it ruled against the enforceability of Lennar’s arbitration clause in a construction defect case brought by a group of homeowners.
The rulings might have opened the floodgates to lawsuits in the state: Homeowners in a Myrtle Beach community said a local law firm estimated it had an eight-month waitlist for new clients. The firm won a $16.1 million settlement for over 200 homebuyers in a class-action lawsuit against D.R. Horton for construction defects, including issues with roofs, joists, and water intrusion.
One law firm told Hunterbrook that in South Carolina alone, they’ve been contacted by around 500 homeowners with complaints — and they’re actively investigating 125 of those cases for potential legal action. According to Hunterbrook’s review of complaints filed in the state’s county courts, almost half of the 198 complaints filed against D.R. Horton in the last 10 years were filed in just the last two years.
Lawsuits filed against D.R. Horton in various county courts in South Carolina in the last 10 years, compiled and analyzed by Hunterbrook (compilation date: May 15, 2025). Sources: South Carolina Judicial Branch Case Record Search, Hunterbrook Media
Other states, however, including Texas and Florida, continue to enforce arbitration clauses, routinely returning cases submitted to the state’s justice system back to the private arbitration system, according to Hunterbrook’s review of court cases filed in those states. Texas seems to be doubling down, including a 2023 reversal of a court ruling, holding that other family members or subsequent owners were also bound by the arbitration clause.
Still, homeowners are undeterred, with court records showing an exponential rise in legal complaints filed against D.R. Horton in select Florida courts in recent years.
Hunterbrook compiled and reviewed court cases in selected Florida counties, including Broward County, Duval County, Hillsborough County, Lee County, Manatee County, Miami-Dade County, Orange County, Palm Beach County, Pasco County, and Sarasota County.
French filed a lawsuit against D.R. Horton in Florida over the severe construction defects in his home. He said he believes he can get a jury trial because the contract was forced on him without his full awareness of the arbitration clause.
“If I’m able to go through trial, I think that peers would definitely see my point.”
Volkmer, who is in arbitration, says D.R. Horton has continued to launch endless legal maneuvers for years to hold up her case — all in an effort to avoid accountability.
“We are stuck with no end in sight,” Volkmer said. “We can’t sell our home — not even at a loss — because it is trapped in legal limbo.”
“I am exhausted. I am heartbroken. I am angry. But I will not give up because we deserve what we were promised in our D.R. Horton contract.”
For homeowners unable to afford legal recourse, though, the future could be bleaker.
“Our whole life is tied up into this house. With us just moving here, I mean, we’re still trying to recoup from the move,” said Kim Goldman, who told Hunterbrook her brand-new D.R. Horton community floods when they “get a heavy rain — which is frequent.” She’s worried her house will be next.
“Every time it rains heavy, I’m like holding my breath,” she said.
Kim Goldman’s post on Facebook showing photos of flooded streets in her D.R. Horton community in Bolivia, South Carolina. Source:
Facebook
Authors
Jenny Ahn
joined Hunterbrook after serving many years as a senior analyst in the US government. She is a seasoned geopolitical expert with a particular focus on the Asia-Pacific and has diverse overseas experience. She has an M.A. in International Affairs from Yale and a B.S. in International Relations from Stanford. Jenny is based in Virginia.
Michelle Cera
is a sociologist specializing in digital ethnography and pedagogy. She received her Ph.D. in Sociology from New York University, building on her Bachelor of Arts degree with Highest Honors from the University of California, Berkeley. Currently serving as a Workshop Coordinator at NYU’s Anthropology and Sociology Departments, Michelle fosters interdisciplinary collaboration and advances innovative research methodologies.
Matthew Termine
is a lawyer with nearly five years of experience leading the legal team at a mortgage technology company. In 2017, Matt was credited by the Wall Street Journal, among others, for identifying suspicious mortgage loan transactions that led to several successful criminal prosecutions, including that of a prominent political operative and the chief executive officer of a federally chartered bank. He is a graduate of Trinity College and Fordham University School of Law. He grew up in Old Saybrook, Connecticut and now lives in Brooklyn with his wife and three children.
Editors
Wendy Nardi
joined Hunterbrook after working as a developmental and copy editor for academic publishers, government agencies, Fortune 500 companies, and international scholars. She has been a researcher and writer for documentary series and a regular contributor to The Boston Globe. Her other publications range from magazine features to fiction in literary journals. She has an MA in Philosophy from Columbia University and a BA in English from the University of Virginia.
Jim Impoco
is the award-winning former editor-in-chief of Newsweek who returned the publication to print in 2014. Before that, he was executive editor at Thomson Reuters Digital, Sunday Business Editor at The New York Times, and Assistant Managing Editor at Fortune. Jim, who started his journalism career as a Tokyo-based reporter for The Associated Press and U.S. News & World Report, has a Master’s in Chinese and Japanese History from the University of California at Berkeley.
Sam Koppelman
is a New York Times best-selling author who has written books with former United States Attorney General Eric Holder and former United States Acting Solicitor General Neal Katyal. Sam has published in the New York Times, Washington Post, Boston Globe, Time Magazine, and other outlets — and occasionally volunteers on a fire speech for a good cause. He has a BA in Government from Harvard, where he was named a John Harvard Scholar and wrote op-eds like “Shut Down Harvard Football,” which he tells us were great for his social life. Sam is based in New York.
Hunterbrook Media publishes investigative and global reporting — with no ads or paywalls. When articles do not include Material Non-Public Information (MNPI), or “insider info,” they may be provided to our affiliate Hunterbrook Capital, an investment firm which may take financial positions based on our reporting. Subscribe
here
. Learn more
here
.
Hunterbrook Media LLC is not a registered investment advisor in the United States or any other jurisdiction. We strive to ensure the accuracy and reliability of the information provided, drawing on sources believed to be trustworthy. Nevertheless, this information is provided "as is" without any guarantee of accuracy, timeliness, completeness, or usefulness for any particular purpose. Hunterbrook Media LLC does not guarantee the results obtained from the use of this information. All information presented are opinions based on our analyses and are subject to change without notice, and there is no commitment from Hunterbrook Media LLC to revise or update any information or opinions contained in any report or publication contained on this website. The above content, including all information and opinions presented, is intended solely for educational and information purposes only. Hunterbrook Media LLC authorizes the redistribution of these materials, in whole or in part, provided that such redistribution is for non-commercial, informational purposes only. Redistribution must include this notice and must not alter the materials. Any commercial use, alteration, or other forms of misuse of these materials are strictly prohibited without the express written approval of Hunterbrook Media LLC. Unauthorized use, alteration, or misuse of these materials may result in legal action to enforce our rights, including but not limited to seeking injunctive relief, damages, and any other remedies available under the law.
pgschema: Terraform-style, declarative schema migration for Postgres
Lobsters
github.com
2025-09-13 14:31:20
Tagged vibecoding because the author says they’re only able to find time for it by using Claude. https://www.reddit.com/r/PostgreSQL/comments/1newev8/pgschema_postgres_declarative_schema_migration/
Comments...
# Edit schema file declaratively
--- a/schema.sql
+++ b/schema.sql
@@ -12,5 +12,6 @@
CREATE TABLE IF NOT EXISTS users (
id SERIAL PRIMARY KEY,
- username varchar(50) NOT NULL UNIQUE
+ username varchar(50) NOT NULL UNIQUE,
+ age INT NOT NULL
);
Step 3: Generate plan
$ PGPASSWORD=testpwd1 pgschema plan \
--host localhost \
--db testdb \
--user postgres \
--schema public \
--file schema.sql \
--output-human stdout \
--output-json plan.json
Plan: 1 to modify.
Summary by type:
tables: 1 to modify
Tables:
~ users
+ age (column)
Transaction: true
DDL to be executed:
--------------------------------------------------
ALTER TABLE users ADD COLUMN age integer NOT NULL;
Step 4: Apply plan with confirmation
# Or use --auto-approve to skip confirmation
$ PGPASSWORD=testpwd1 pgschema apply \
--host localhost \
--db testdb \
--user postgres \
--schema public \
--plan plan.json
Plan: 1 to modify.
Summary by type:
tables: 1 to modify
Tables:
~ users
+ age (column)
Transaction: true
DDL to be executed:
--------------------------------------------------
ALTER TABLE users ADD COLUMN age integer NOT NULL;
Do you want to apply these changes? (yes/no): yes
Applying changes...
Changes applied successfully!
I used AOL Instant Messenger from about 1999 to 2007. For most of that time, I used AIM clients that logged my conversations, but they varied in formats. Most of the log formats are XML or HTML, which make re-reading those logs a pain.
The simplest AIM logs are the plaintext logs, which look like this:
Session Start (DumbAIMScreenName:Jane): Mon Sep 12 18:44:17 2005
[18:44] Jane: hi
[18:55] Me: hey whats up
Session Close (Jane): Mon Sep 12 18:56:02 2005
Every decade or so, I try writing a universal AIM log parser to get all of my old logs into a consistent, readable format. Unfortunately, I always get bored and give up partway through. My last attempt was
seven years ago
, when I tried doing it in Python 2.7.
Parsing logs is a great match for Gleam because some parts of the project are easy (e.g., parsing the plaintext logs), so I can do the easy parts while I get the hang of Gleam as a language and gradually build up to the harder log formats and adding a web frontend.
I’ve also heard that functional languages lend themselves especially well to parsing tasks, and I’ve never understood why, so it’s a good opportunity to learn.
I’ve been a programmer for 20 years, but I’m no language design connoisseur. I’m sharing things about Gleam I find unintuitive or difficult to work with, but they’re not language critiques, just candid reactions.
I’ve never worked in a language that’s designed for functional programming. The closest would be JavaScript. The
languages I know best
are Go and Python.
The first thing I wanted to do was figure out how to parse a command-line argument so I could call my app like this:
./log-parser ~/logs/aim/plaintext
But there’s no Gleam standard library module for reading command-line arguments. I found
glint
, and it felt super complicated for just reading one command-line argument. Then, I realized there’s a simpler third-party library called
argv
.
I can parse the command-line argument like this:
pub fn main() {
  case argv.load().arguments {
    [path] -> io.println("command-line arg is " <> path)
    _ -> io.println("Usage: gleam run <directory_path>")
  }
}
$ gleam run ~/whatever
Compiled in 0.01s
Running log_parser.main
command-line arg is /home/mike/whatever
From poking around, I think the executables are under
build/dev/erlang/log_parser/ebin/
:
$ ls -1 build/dev/erlang/log_parser/ebin/
log_parser.app
log_parser.beam
log_parser@@main.beam
log_parser_test.beam
plaintext_logs.beam
plaintext_logs_test.beam
Those appear to be BEAM bytecode, so I can’t execute them directly. I assume I could run the BEAM VM manually and execute those files somehow, but that doesn’t sound appealing.
So, I’ll stick to
gleam run
to run my app, but I wish
gleam build
had a better explanation of what it produced and what the developer can do with it.
To start, I decided to write a function that does basic parsing of plaintext logs.
So, I wrote a test with what I wanted.
pub fn parse_simple_plaintext_log_test() {
  "
Session Start (DumbAIMScreenName:Jane): Mon Sep 12 18:44:17 2005
[18:44] Jane: hi
[18:55] Me: hey whats up
Session Close (Jane): Mon Sep 12 18:56:02 2005
"
  |> string.trim
  |> plaintext_logs.parse
  |> should.equal(["hi", "hey whats up"])
}
Eventually, I want to parse all the metadata in the conversation, including names, timestamps, and session information. But as a first step, all my function has to do is read an AIM chat log as a string and emit a list of the chat messages as separate strings.
That meant my actual function would look like this:
pub fn parse(contents: String) -> List(String) {
  // Note: todo is a Gleam language keyword to indicate unfinished code.
  todo
}
Just to get it compiling, I add in a dummy implementation:
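A minimal sketch of such a dummy implementation, hardcoding a throwaway list (the exact values just need to not match the test):
pub fn parse(contents: String) -> List(String) {
  // contents is ignored for now, hence the unused-variable warning below
  ["fake", "data"]
}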
$ gleam test
Compiling log_parser
warning: Unused variable
┌─ /home/mike/code/gleam-log-parser2/src/plaintext_logs.gleam:1:14
│
1 │ pub fn parse(contents: String) -> List(String) {
│ ^^^^^^^^^^^^^^^^ This variable is never used
Hint: You can ignore it with an underscore: `_contents`.
Compiled in 0.22s
Running log_parser_test.main
F
Failures:
1) plaintext_logs_test.parse_simple_plaintext_log_test: module 'plaintext_logs_test' Values were not equal
expected: ["hi", "hey whats up"]
got: ["fake", "data"]
output:
Finished in 0.008 seconds
1 tests, 1 failures
Cool, that’s what I expected. The test is failing because it’s returning hardcoded dummy results that don’t match my test.
This is my first time using pattern matching in any language, and it’s neat, though it’s still so unfamiliar that I find it hard to recognize when to use it.
Zooming in a bit on the pattern matching, it’s here:
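Roughly, the case expression looks like this (a sketch; the surrounding function is omitted):
case line {
  "Session Start" <> _ -> ""
  "Session Close" <> _ -> ""
  line -> line
}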
It evaluates the
line
variable and matches it to one of the subsequent patterns within the braces. If the line starts with
"Session Start"
(the
<>
means the preceding string is a prefix), then Gleam executes the code after the
->
, which in this case is just the empty string. Same for
"Session Close"
.
If the line doesn’t match the
"Session Start"
or
"Session Close"
patterns, Gleam executes the last line in the
case
which just matches any string. In that case, it evaluates to the same string. Meaning
"hi"
would evaluate to just
"hi"
.
This is where it struck me how strange it feels to not have a
return
keyword. In every other language I know, you have to explicitly return a value from a function with a
return
keyword, but in Gleam, the return value is just the value from the last line that Gleam executes in the function.
If I run my test, I get this:
$ gleam test
Compiling log_parser
Compiled in 0.22s
Running log_parser_test.main
F
Failures:
1) plaintext_logs_test.parse_simple_plaintext_log_test: module 'plaintext_logs_test' Values were not equal
expected: ["hi", "hey whats up"]
got: ["", "[18:44] Jane: hi", "[18:55] Me: hey whats up", ""]
output:
Finished in 0.009 seconds
1 tests, 1 failures
Again, this is what I expected, and I’m a bit closer to my goal.
I’ve converted the
"Session Start"
and
"Session End"
lines to empty strings, and the middle two elements of the list are the lines that have AIM messages in them.
The remaining work is to filter out the empty strings left over from the session lines and to strip out the time and sender parts of the log lines. For a line like [18:44] Jane: hi, I need to extract just the portion after the sender's name: hi.
My instinct is to use a string split function and split on the
:
character. I see that there’s
string.split
which returns
List(String)
.
There’s also a
string.split_once
function, which should work because I can split once on
:
(note the trailing space after the colon).
The problem is that
split_once
returns
Result(#(String, String), Nil)
, a type that feels scarier to me. It’s a two-tuple wrapped in a
Result
, which means that the function can return an error on failure. It’s confusing that
split_once
can fail whereas
split
cannot, so for simplicity, I’ll go with
split
.
$ gleam test
warning: Todo found
┌─ /home/mike/code/gleam-log-parser/src/plaintext_logs.gleam:10:7
│
10 │ todo
│ ^^^^ This code is incomplete
This code will crash if it is run. Be sure to finish it before
running your program.
Hint: I think its type is `String`.
Compiled in 0.01s
Running log_parser_test.main
src/plaintext_logs.gleam:9
["[18:44] Jane", "hi"]
Good. That’s doing what I want. I’m successfully isolating the
"hi"
part, so now I just have to return it.
At this point, I feel close to victory. I’ve converted the line to a list of strings, and I know the string I want is the last element of the list, but how do I grab it?
In most other languages, I’d just say
line_parts[1]
, but Gleam’s lists have no accessors by index.
Looking at the
gleam/list
module, I see a
list.last
function, so I try that:
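Something along these lines (a sketch with explicit bindings; the original pipes the result straight into todo, as the compiler hint below suggests):
line -> {
  let line_parts = string.split(line, on: ": ")
  // list.last returns a Result(String, Nil) rather than a plain String
  echo list.last(line_parts)
  todo
}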
$ gleam test
Compiling log_parser
warning: Todo found
┌─ /home/mike/code/gleam-log-parser/src/plaintext_logs.gleam:12:11
│
12 │ |> todo
│ ^^^^ This code is incomplete
This code will crash if it is run. Be sure to finish it before
running your program.
Hint: I think its type is `fn(Result(String, Nil)) -> String`.
Compiled in 0.24s
Running log_parser_test.main
src/plaintext_logs.gleam:11
Ok("hi")
A bit closer! I’ve extracted the last element of the list to find
"hi"
, but now it’s wrapped in a
Result
type
.
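Here is a sketch of the whole module at that point, assuming result.unwrap with an empty-string default (the exact code may differ slightly):
import gleam/list
import gleam/result
import gleam/string

pub fn parse(contents: String) -> List(String) {
  contents
  |> string.split(on: "\n")
  |> list.map(parse_line)
  // drop the empty strings produced by the session lines
  |> list.filter(fn(s) { !string.is_empty(s) })
}

fn parse_line(line: String) -> String {
  case line {
    "Session Start" <> _ -> ""
    "Session Close" <> _ -> ""
    line ->
      line
      |> string.split(on: ": ")
      |> list.last
      |> result.unwrap("")
  }
}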
My tests are now passing, so theoretically, I’ve achieved my initial goal.
I could declare victory and call it a day. Or, I could refactor!
I’ll refactor.
I feel somewhat ashamed of my string splitting logic, as it didn’t feel like idiomatic Gleam. Can I do it without getting into result unwrapping?
Re-reading it, I realize I can solve it with this newfangled pattern matching thing. I know that the string will split into a list with two elements, so I can create a pattern for a two-element list:
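A sketch of that version of parse_line:
fn parse_line(line: String) -> String {
  case line {
    "Session Start" <> _ -> ""
    "Session Close" <> _ -> ""
    line ->
      // the split produces exactly two parts for a message line
      case string.split(line, on: ": ") {
        [_, message] -> message
        _ -> ""
      }
  }
}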
That feels a little more elegant than calling
list.last and unwrapping the Result
.
Can I tidy this up further? I avoided
string.split_once
because the type was too confusing, but it’s probably the better option if I expect only one split, so what does that look like?
Okay, that doesn’t look as scary as I thought. Even though my first instinct is to unwrap the error and access the last element in the tuple (which actually is easy for tuples, just not lists), I know at this point that there’s probably a pattern-matchy way. And there is:
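Roughly (a sketch, assuming gleam/string is imported):
fn parse_line(line: String) -> String {
  case line {
    "Session Start" <> _ -> ""
    "Session Close" <> _ -> ""
    line ->
      case string.split_once(line, on: ": ") {
        Ok(#(_, message)) -> message
        _ -> ""
      }
  }
}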
The
Ok(#(_, message))
pattern will match a successful result from
split_once
, which is a two-tuple of
String
wrapped in an
Ok
result. The other
case
option is the catchall that returns an empty string.
One of the compelling features of Gleam for me is its static typing, so it feels hacky that I’m abusing the empty string to represent a lack of message on a particular line. Can I use the type system instead of using empty strings as sentinel values?
The pattern in Gleam for indicating that something might fail but the failure isn’t necessarily an error is
Result(<type>, Nil)
, so let me try to rewrite it that way:
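A sketch of the Result-based version (assuming result.values from gleam/result):
import gleam/list
import gleam/result
import gleam/string

pub fn parse(contents: String) -> List(String) {
  contents
  |> string.split(on: "\n")
  |> list.map(parse_line)
  // keep only the Ok values, discarding lines without messages
  |> result.values
}

fn parse_line(line: String) -> Result(String, Nil) {
  case line {
    "Session Start" <> _ -> Error(Nil)
    "Session Close" <> _ -> Error(Nil)
    line ->
      case string.split_once(line, on: ": ") {
        Ok(#(_, message)) -> Ok(message)
        _ -> Error(Nil)
      }
  }
}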
Great! I like being more explicit that the lines without messages return
Error(Nil)
rather than an empty string. Also,
result.values
is more succinct for filtering empty lines than the previous
list.filter(fn(s) { !string.is_empty(s) })
.
After spending a few hours with Gleam, I’m enjoying it. It pushes me out of my comfort zone the right amount where I feel like I’m learning new ways of thinking about programming but not so much that I’m too overwhelmed to learn anything.
The biggest downside I’m finding with Gleam is that it’s a young language with a relatively small team. It
just turned six years old
, but it looks like the founder was working on it solo
until a year ago
. There are now a handful of core maintainers, but I don’t know if any of them work on Gleam full-time, so the ecosystem is a bit limited. I’m looking ahead to parsing other log formats that are in HTML and XML, and there are Gleam HTML and XML parsers, but they don’t seem widely used, so I’m not sure how well they’ll work.
The Gleam documentation is a bit terse, but I like that it’s so example-heavy.
I learn best by reading examples, so I appreciate that so much of the Gleam standard library is documented with examples showing simple usage of each API function.
I like that the Gleam compiler natively warns about unused functions, variables, and imports. And I like that these are warnings rather than errors.
In Go, I get frustrated during debugging when I temporarily comment something out and then the compiler stubbornly refuses to do anything until I fix the stupid import, which I then have to un-fix when I finish whatever I was debugging.
One of my favorite dumb programming jokes happened at my first programming job about 15 years ago. On a group email thread with several C++ developers, my friend shared a hot tip about C++ development.
He said that if we ever got fed up with arcane C++ compilation errors, we could just add a special line to our source code, and then even invalid C++ code would compile successfully:
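#pragma always_compile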
Spoiler alert: it’s not a real C++ preprocessor directive.
But I’ve found myself occasionally wishing languages had something like this when I’m in the middle of development and don’t care about whatever bugs the compiler is trying to protect me from.
Gleam’s
todo
is almost like a
#pragma always_compile
. Even if your code is invalid, the Gleam compiler just says, “Okay, fine. I’ll run it anyway.”
You can see this when I was in the middle of implementing
parse_line
:
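A sketch of that in-progress state, reconstructed from the compiler output shown below:
fn parse_line(line: String) -> String {
  case line {
    "Session Start" <> _ -> ""
    "Session Close" <> _ -> ""
    line -> {
      echo string.split(line, on: ": ")
      todo
    }
  }
}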
If I take out the
todo
, Gleam refuses to run the code at all:
$ gleam test
Compiling log_parser
error: Type mismatch
┌─ /home/mike/code/gleam-log-parser/src/plaintext_logs.gleam:8:5
│
8 │ ╭ line -> {
9 │ │ echo string.split(line, on: ": ")
10 │ │ }
│ ╰─────^
This case clause was found to return a different type than the previous
one, but all case clauses must return the same type.
Expected type:
String
Found type:
List(String)
Right, I’m returning an incorrect type, so why would the compiler cooperate with me?
But adding
todo
lets me run the function anyway, which helps me understand what the code is doing even though I haven’t finished implementing it:
$ gleam test
warning: Todo found
┌─ /home/mike/code/gleam-log-parser/src/plaintext_logs.gleam:10:7
│
10 │ todo
│ ^^^^ This code is incomplete
This code will crash if it is run. Be sure to finish it before
running your program.
Hint: I think its type is `String`.
Compiling log_parser
Compiled in 0.21s
Running log_parser_test.main
src/plaintext_logs.gleam:9
["[18:44] Jane", "hi"]
F
[...]
Finished in 0.007 seconds
1 tests, 1 failures
I find pattern matching elegant and concise, though it’s the part of Gleam I find hardest to adjust to. It feels so different from the procedural style of programming I’m accustomed to in the other languages I know.
The downside is that I have a hard time recognizing when pattern matching is the right tool, and I also find pattern matching harder to read. But I think that’s just inexperience, and I think with more practice, I’ll be able to think in pattern matching.
I don’t know if this is a long-term design choice or if it’s just small for now because it’s an indie-developed language, but the first thing about Gleam that stood out to me is how few built-in features there are.
For example, there’s no built-in feature for iterating over the elements of a
List
type
, and the type itself doesn’t expose a function to iterate it, so you have to use
the
gleam/list
module
in the standard library.
Similarly, if a function can fail, it returns a
Result
type
, and there are no built-in functions for handling a
Result
, so you have to use the
gleam/result
module
to check if the function succeeded.
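For example, a minimal sketch using the standard gleam/list and gleam/result modules:
import gleam/int
import gleam/io
import gleam/list
import gleam/result

pub fn main() {
  // no for loop: iteration goes through gleam/list functions
  let doubled = list.map([1, 2, 3], fn(x) { x * 2 })

  // no built-in Result handling: gleam/result supplies the helpers
  let first = list.first(doubled) |> result.unwrap(0)

  io.println(int.to_string(first))
}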
To me, that functionality feels so core to the language that it would be part of the language itself, not the standard library.
In addition to the language feeling small, the standard library feels pretty limited as well.
There are currently only 19 modules in
the Gleam standard library
. Conspicuously absent are modules for working with the filesystem (the de facto standard seems to be the third-party
simplifile
module).
For comparison, the standard libraries for
Python
and
Go
each have about 250 modules. Although, in fairness, those languages have about 1000x the resources of Gleam.
One year ago, the
Safe C++ proposal
was made. The goal was to add a safe subset/context into C++ that would give strong guarantees (memory safety, type safety, thread safety) similar to what Rust provides, without breaking existing C++ code. It was an extension or superset of C++. The opt-in mechanism was to explicitly mark parts of the code that belong to the safe context. The authors even state:
Code in the safe context exhibits the same strong safety guarantees as code written in Rust.
The rest remains “unsafe” in the usual C++ sense. This means that existing code continues to work, while new or refactored parts can gain safety. For those who write Rust, Safe C++ has many similarities with Rust, sometimes with adjustments to fit C++’s design. Also, because C++ already has a huge base of “unsafe code”, Safe C++ has to provide mechanisms for mixing safe and unsafe, and for incremental migration. In that sense, all of Safe C++’s safe features are opt-in. Existing code compiles and works as before. Introducing safe context doesn’t break code that doesn’t use it.
The proposal caught my interest. It seemed like a good compromise to make C++ safe, although there were open or unresolved issues, which is completely normal for a draft proposal. For example, how error reporting for the borrow checker and lifetime errors would work, or how generic code and templates would interact with lifetime logic and safe/unsafe qualifiers. These are just some of the points; the proposal is very long and elaborate. Moreover, I am not a programming language designer, so there might be better alternatives.
Anyway, today I discovered that the proposal will no longer be pursued. When I thought about the proposal again this morning, I realized I hadn’t read any updates on it for some time. So I searched and found some answers on
Reddit
.
The response from Sean Baxter, one of the original authors of the Safe C++ proposal:
The Safety and Security working group voted to prioritize Profiles over Safe C++. Ask the Profiles people for an update. Safe C++ is not being continued.
And again:
The Rust safety model is unpopular with the committee. Further work on my end won’t change that. Profiles won the argument. All effort should go into getting Profile’s language for eliminating use-after-free bugs, data races, deadlocks and resource leaks into the Standard, so that developers can benefit from it.
So I went to read the documents related to Profiles[1][2][3][4]. I try to summarize what I understood: they are meant to define modes of C++ that impose constraints on how you use the language and library, in order to guarantee certain safety properties. They are primarily compile-time constraints, though in practice some checks may be implemented using library facilities that add limited runtime overhead. Instead of introducing entirely new language constructs, profiles mostly restrict existing features and usages. The idea is that you can enable a profile, and any code using it agrees to follow the restrictions. If you don’t enable it, things work as before. So it’s backwards-compatible.
Profiles seem less radical and more adoptable, a safer-by-default C++ without forcing the Rust model that aims to tackle the most common C++ pitfalls. I think Safe C++ was more ambitious: introducing new syntax, type qualifiers, safe vs unsafe contexts, etc. Some in the committee felt that was too heavy, and Profiles are seen as a more pragmatic path. The main objection is obvious: one could say that Profiles restrict less than what Safe C++ aimed to provide.
Reading comments here and there, there is visible resistance in the community toward adopting the Rust model, and from a certain point of view, I understand it. If you want to write like Rust, just write Rust. Historically, C++ is a language that has often taken features from other worlds and integrated them into itself. In this case, I think that safety subsets of C++ already exist informally in some form. Profiles are an attempt to standardize and unify something that already exists in practice. Technically, they don’t add new fundamental semantics. Instead, they provide constraints, obligations and guarantees.
In my opinion, although I appreciated the Safe C++ proposal and was looking forward to seeing concrete results, given the preferences of the committee and the broader C++ community, standardizing and integrating Profiles as proposed is the much more realistic approach. Profiles might not be perfect, but they are better than nothing. They will likely be uneven in enforcement and weaker than Safe C++ in principle. They won’t give us silver-bullet guarantees, but they are a realistic path forward.
Lucky kids? How AI could impact university education
Guardian
www.theguardian.com
2025-09-13 13:00:05
Artificial intelligence is changing how students learn and the world they’ll graduate into. Experts reveal how applicants can get ahead OpenAI CEO Sam Altman recently told a US podcast that if he was graduating today, “I would feel like the luckiest kid in all of history.” Altman, whose company deve...
OpenAI CEO Sam Altman recently told a US podcast that if he was graduating today, “I would feel like the luckiest kid in all of history.”
Altman, whose company developed and released
ChatGPT
in November 2022, believes the transformative power of AI offers unprecedented opportunities for young people.
Yes, there will be job displacement, but “this always happens,” says Altman, “and young people are the best at adapting to this.” New, more exciting jobs will emerge, full of greater possibilities.
For UK sixth-formers and their families looking at universities, trying to make the best possible choices about what to study – and where – in the age of generative AI, Altman’s words may offer some comfort. But in a fast-changing environment, experts say there are steps students can take to ensure they are well placed both to make the most of their university experience and to emerge from their studies qualified for the jobs of the future.
Dr Andrew Rogoyski, of the Institute for People-Centred AI at the University of Surrey, says that in many cases students will already be well versed in AI and ahead of the game. “What’s striking is the pace of change and adoption vastly outstrips the pace of academic institutions to respond. As a general truth, academic institutions are quite slow and considered and thoughtful about things. But actually this has gone from the launch of ChatGPT to ‘Should we ban it?’, to ‘OK, here are some concerns about exams’, to actually recognising it’s going to be a life skill that we have to teach in every course and that we want all our students to have equitable access to.
“So it’s gone from zero to 100 in a very short space of time, and of course, the world of work is changing accordingly as well.”
His advice to prospective students? “Be demanding. Ask the questions. I think there are some careers that are going to be very different … make sure that universities are adapting to that.”
Students who are less familiar with AI should take time to learn about it and use it, whatever their chosen subject. Rogoyski says being able to use AI tools is now equivalent to being able to read and write, and it’s important “to be resourceful, adaptable, to spend time understanding what AI is capable of and what it can and can’t do”.
He says: “It’s something you need to be able to understand no matter what course you do, and think about how it might impact your career. So read around, look at some of the speculation surrounding that.
“Then I’d start thinking about what the university’s responses are and what support there is for integration of AI. Is my course, and is the university as a whole, on the front foot with regards to the use of AI?”
There will be a lot of information online but Rogoyski recommends visiting universities to ask the academics who will be delivering your degree: “What is your strategy? What is your attitude? Am I going to get a degree that’s worth having, that will stand the test of time?”
Dan Hawes, co-founder of specialist recruitment consultancy the Graduate Recruitment Bureau, is optimistic about the future for UK graduates and says the current slowdown in the jobs market is more to do with the economy than AI. “It’s still very hard to predict what jobs there will be in three to four years, but we think it’s going to put a premium on graduates,” he says. “They are the generation growing up with AI and employers are probably very interested in getting this new breed of talent into their organisations.
“So for parents and sixth-formers deciding where to study, the first thing always to take note of is the employability of the graduates that are produced by certain universities.”
For example, maths has consistently been the top degree his clients are looking for, and he thinks this is unlikely to change. “AI is not going to devalue the skills and knowledge you get from doing a maths degree,” he says.
He agrees that AI is a concern for parents and those considering going to university, “but in the long term I think it’s going to be a good thing. It’s going to make people’s jobs more interesting, redesign their roles, create new jobs.”
Elena Simperl, professor of computer science at King’s College London, where she co-directs the King’s Institute for Artificial Intelligence, advises students to look at the AI content right across a university, in all departments. “It is changing how we do things. It’s not just changing how we write emails and how we read documents and how we look for information,” she says.
Students might wish to consider how they can set themselves up for a job working in AI. “DeepMind is proposing AI co-scientists, so entire automatic AI labs, to do research. So a university should train their students so that they can make the most out of these technologies,” she says. “It doesn’t really matter what they want to study at the university. They don’t have to study AI themselves, but they should go to a university where there is a broad expertise in AI, not just in a computer science department.”
Prof Simperl says that the evidence so far suggests it is unlikely that entire jobs will completely disappear. “So we need to stop thinking about what jobs will be killed by AI and think about what tasks can AI help with. People who are able to use AI more will be at an advantage.”
In the brave new world of AI, will it still be worth doing a degree like English literature or history? “Yes, if they’re well taught,” says Rogoyski. “They should be teaching you things that will last throughout your lifetime. The appreciation of literature, learning how to write well, learning how to think and how to communicate are enduring skills.
“The way that you might use that degree in the future will undoubtedly change, but if it’s taught well, the lessons learned will see you through. If nothing else you’ll enjoy your downtime as our AI overlords take over all the work and we’ll have more time to read books while we’re all on universal basic income.”
‘Extreme nausea’: Are EVs causing car sickness – and what can be done?
Guardian
www.theguardian.com
2025-09-13 11:01:01
Phil Bellamy’s daughters refuse to ride in his electric car without travel sickness tablets. Are there other solutions? It was a year in to driving his daughter to school in his new electric vehicle that Phil Bellamy discovered she dreaded the 10-minute daily ride – it made her feel sick in a way no...
It was a year into driving his daughter to school in his new electric vehicle that Phil Bellamy discovered she dreaded the 10-minute daily ride – it made her feel sick in a way no other car did.
As the driver, Bellamy had no problems with the car but his teenage daughters struggled with sickness every time they entered the vehicle.
Research has shown
this is an issue – people who did not usually have
motion sickness
in a conventional car found that they did in EVs.
For Bellamy, 51, his family’s aversion to riding in his car made him wonder at the cause. He tried changing his driving style and even buying a different car but found the issues persisted. His daughters now refuse to travel with him, if possible.
“If we’re going on a journey, they’re absolutely taking travel sickness tablets immediately. They’re not even considering coming in the car without them,” he says.
Bellamy enjoys driving his electric car, which is quieter and smoother compared with the vibrations of a traditional combustion engine car, but hopes manufacturers will consider how to address the concerns of passengers who are affected by motion sickness.
The causes of sickness could include the relatively quick acceleration of EVs compared to fuel vehicles, their regenerative braking systems and a lack of sensory triggers such as engine noise and vibrations when travelling in a car.
Research carried out in China
, a big producer of electric cars, found that EVs were associated with more severe motion sickness symptoms than fuel vehicles.
Atiah Chayne, a content creator from London,
posted on TikTok
about her experiences of car sickness in EVs this summer when she booked Ubers to take her out.
Chayne says “extreme nausea” kicked in very quickly and stopped immediately after she left the vehicle, but it took her a while to realise it happened only in EVs. She now avoids using Ubers as it’s difficult to find one that is not an EV.
Chayne says: “It usually started quickly soon after we moved off. I’d say it got really bad one minute into the journey. I would put the windows down and go on my phone to distract myself,” she says. “The sickness was constantly there throughout the whole journey. If your Uber is 20 minutes away from your destination, you’re counting down the minutes until you get out.”
John Golding, a professor of applied psychology at the University of Westminster in London, says motion sickness specifically affects passengers because it is, in large part, related to being able to anticipate changes in movement.
While drivers have control of the car’s movement, passengers do not – especially those in the back seat – and he thinks this could become more of an issue with the potential introduction of self-driving cars.
He says the car industry is aware of motion sickness issues for some people in EVs and is looking at ways to help passengers anticipate changes in movement,
such as vibrations in the car seat
that warn the passenger when the car is turning.
Golding says people can also either take motion sickness medication or make behavioural changes. “The simplest thing is to sit in the front to get a view. Avoid moving your head too much, don’t look at your phone or start reading; that makes things much worse. If you can get some fresh air, that will help,” he says.
Backseat passengers have no control over the car’s movements.
Photograph: Bsip Sa/Alamy
How to cope with motion sickness
Experts and
the NHS suggest
behavioural changes, medication and acupressure bands could help.
Sit at the front
Knowing what’s going on around you is the best way to avoid motion sickness. It allows you to see and anticipate what will happen next, while in the back the view just flashes past, says Golding.
Try motion sickness medications
These come in the form of patches or tablets that can be bought from pharmacies and help control how your brain and body react to movement. They should be taken
before travelling
.
Wear acupressure bands
These are thought to work for some, though research suggests probably through a placebo effect. “Placebo effects can be very, very strong. If they work for an individual, don’t knock it,” Golding says.
Listen to a 100Hz sound for a minute
Research
from Japan’s Nagoya University suggests the vibrations at this frequency could help by stimulating a part of the inner ear that detects gravity and acceleration.
In my old age I’ve mostly given up trying to convince anyone of anything. Most people do not care to find the truth, they care about what pumps their bags. Some people go as far as to believe that
perception is reality
and that truth is a construction. I hope there’s a special place in hell for those people.
It’s why the world wasted $10B+ on self driving car companies that obviously made no sense. There’s a much bigger market for truths that pump bags vs truths that don’t.
So here’s your new truth that there’s no market for. Do you believe a compiler can code? If so, then go right on believing that AI can code. But if you don’t, then AI is no better than a compiler, and arguably in its current form, worse.
The best model of a programming AI is a compiler.
You give it a prompt, which is “the code”, and it outputs a compiled version of that code. Sometimes you’ll use it interactively, giving updates to the prompt after it has returned code, but you find that, like most IDEs, this doesn’t work all that well and you are often better off adjusting the original prompt and “recompiling”.
While noobs and managers are excited that the input language to this compiler is English, English is a poor language choice for many reasons.
It’s not precise in specifying things. The only reason it works for many common programming workflows is because they are common. The minute you try to do new things, you need to be as verbose as the underlying language.
AI workflows are, in practice, highly non-deterministic. While different versions of a compiler might give different outputs, they all promise to obey the spec of the language, and if they don’t, there’s a bug in the compiler. English has no similar spec.
Prompts are highly non local, changes made in one part of the prompt can affect the entire output.
tl;dr, you think AI coding is good because compilers, languages, and libraries are bad.
This isn’t to say “AI” technology won’t lead to some extremely good tools. But I argue this comes from increased amounts of search and optimization and patterns to crib from, not from any magic “the AI is doing the coding”. You are still doing the coding, you are just using a different programming language.
That anyone uses LLMs to code is a testament to just how bad tooling and languages are. And that LLMs can replace developers at companies is a testament to how bad that company’s codebase and hiring bar is.
AI will eventually replace programming jobs in the same way compilers replaced programming jobs. In the same way spreadsheets replaced accounting jobs.
But the sooner we start thinking about it as a tool in a workflow and a compiler—through a lens where tons of careful thought has been put in—the better.
I can’t believe anyone bought those vibe coding crap things for billions. Many people in self driving accused me of just being upset that I didn’t get the billions, and I’m sure it’s the same thoughts this time. Is your way of thinking so fucking broken that you can’t believe anyone cares more about the
actual truth
than make believe dollars?
From this study
, AI makes you
feel
20% more productive but in reality makes you 19% slower. How many more billions are we going to waste on this?
Or we could, you know, do the hard work and build better programming languages, compilers, and libraries. But that can’t be hyped up for billions.
Diesel 2.3.0 contains the contributions of 95 people. More than 1142
commits were submitted over a span of 16 months.
This release includes several large extensions to the query DSL
provided by Diesel and also increases the number of platforms you
can use Diesel on out of the box. Notable changes include:
This release wouldn’t be possible without the support of our
contributors and sponsors. We would like to especially call out the NLNet
Foundation, which funded the development of the newly added support for
window functions via their
NGI Zero Core
initiative. We
would also like to call out GitHub for providing additional training,
security, and resources for the Diesel project as part of their
GitHub
Secure Open Source Fund
initiative.
Nevertheless, the Diesel project is always looking for support. You
can help by:
Contributing code, documentation, or guides. Check out the planning for
Diesel 2.4
for open tasks.
Providing knowledge and help to maintain the MySQL/MariaDB backend.
This is currently the only in-tree backend that is not used by any
maintainers, so having someone around that actually uses this backend
would be very helpful for the Diesel project.
Diesel 2.3 provides a new
#[derive(HasQuery)]
attribute to construct queries more directly from a Rust struct. This new
derive extends the existing abilities of
#[derive(Selectable)]
by associating a specific
FROM
clause and possibly other
clauses with the given struct. This derive also ensures that the query
result can actually be loaded into this specific Rust type.
For the most simple case you can use this derive as follows:
use crate::schema::users;

#[derive(HasQuery)]
struct User {
    id: i32,
    name: String,
}

let users = User::query().load(connection)?;
The idea behind this feature is to provide an easier query entry
point and also make it easier to share the same base query between
different query construction sites all returning the same data
structure.
Support for Window Functions
Diesel 2.3 introduces support for constructing
Window
function expressions
via the built-in DSL. This feature is supported
on all built-in backends and allows you to construct complex analytical
queries using the provided DSL.
Diesel provides a set of
built-in
Window
functions and also allows you to use most of the built-in
aggregate functions as window functions. To construct a window function
call, you need to chain at least one
WindowExpressionMethods
method after the actual function call. For example
SELECT RANK() OVER(PARTITION BY department ORDER BY salary DESC ROWS UNBOUNDED PRECEDING)
FROM employees
Diesel ensures at compile time that the formed query is valid by
ensuring that window-only functions are used with an actual window
clause, by restricting the locations where you can use window functions, and
by restricting which constructs are supported by which backends.
Finally you can also define your own window functions via the
#[declare_sql_function]
procedural macro like this:
We would like to thank the NLNet Foundation again for funding the
development of this feature as part of their
NGI Zero Core
initiative.
Using SQLite with WASM
Diesel 2.3 adds support for using the SQLite backend with the
wasm32-unknown-unknown
compilation target. This enables
using Diesel in a web browser or in any other WASM-runtime.
This functionality works out of the box by just switching the
compilation target to
wasm32-unknown-unknown
, although you
might want to consider using a special VFS to actually store data in
your web browser.
Extended support for various
types, functions and operators in the PostgreSQL backend
We extended Diesel to provide support for the PostgreSQL specific
MULTIRANGE
type and also to provide support for a large number of
RANGE
and
MULTIRANGE
specific operators and
functions. This feature allows you to construct queries like these:
You can find a list of all supported functions in the documentation
of the
PgRangeExpressionMethods
and a list of supported type mappings for the
MULTIRANGE
types in the documentation of the corresponding
SQL
type
In addition to the extended support for
RANGE
and
MULTIRANGE
types, we also worked on adding a lot of new
functions for working with
ARRAY
,
JSON
and
JSONB
types. This now covers the majority of functions
provided by default by PostgreSQL for these types. See the
documentation
for a complete list.
Support for
JSON
/
JSONB
types and functions for SQLite
Finally we worked on extending the SQLite backend to support
its built-in JSON types and
functions
. As a result both the
Json
and
Jsonb
SQL types provided by Diesel are now supported in the
SQLite backend. The SQLite backend does not support these types as
column
types, but requires you to use either the
json
or
jsonb
function to turn a
TEXT
or
BINARY
value into a
JSON value. The latter value can then be used for manipulating the
content of the JSON value, constructing filters or loading values from
the database.
You can use this functionality like this:
let value = diesel::select(jsonb::<Binary, _>(br#"{"this":"is","a":["test"]}"#))
    .get_result::<serde_json::Value>(connection)?;
to load JSON values from the database.
See the
documentation
for a list of supported functions and detailed examples.
Thanks
Thank you to everyone who helped to make this release happen through
sponsoring, bug reports, and discussion on GitHub. While we don’t have a
way to collect stats on that form of contribution, it’s greatly
appreciated.
In addition to the Diesel core team, 95 people contributed code to
this release. A huge thank you to:
More than three years in the making, with a concerted effort starting last year, my CPU-time profiler
landed
in Java with OpenJDK 25. It’s an experimental new profiler/method sampler that helps you find performance issues in your code, having distinct advantages over the current sampler. This is what this week’s and next week’s blog posts are all about. This week, I will cover why we need a new profiler and what information it provides; next week, I’ll cover the technical internals that go beyond what’s written in the JEP. I will quote the
JEP 509
quite a lot, thanks to Ron Pressler; it reads like a well-written blog post in and of itself.
Before I show you its details, I want to focus on what the current default method profiler in JFR does:
At every interval, say 10 or 20 milliseconds, five threads running Java code and one running native code are picked from the list of threads and sampled. This thread list is iterated linearly, and threads not in the requested state are skipped (
source
).
The aggressive subsampling means that the effective sampling interval depends on the number of cores and the parallelism of your system. Say we have a large machine on which 32 threads can run in parallel. Then JFR samples at most six of them per interval, roughly 19%, turning a sampling interval of 10ms into an effective one of about 53ms. This is an inherent property of wall-clock sampling, as the sampler considers all threads on the system. Their number can be arbitrarily large, so sub-sampling is necessary.
However, the sampling policy is not true wall-clock sampling, as it prioritizes threads running in Java. Consider a setting where 10 threads run in native and 5 in Java. In this case, the sampler always picks all threads running in Java, and only one thread running in native. This might be confusing and may lead users to the wrong conclusions.
Even if we gloss over this and call the current strategy “execution-time”, it might not be suitable for profiling every application. To quote from the/my JEP (thanks to Ron Pressler for writing most of the JEP text in its final form):
Execution time does not necessarily reflect CPU time. A method that sorts an array, e.g., spends all of its time on the CPU. Its execution time corresponds to the number of CPU cycles it consumes. In contrast, a method that reads from a network socket might spend most of its time idly waiting for bytes to arrive over the wire. Of the time it consumes, only a small portion is spent on the CPU. An execution-time profile will not distinguish between these cases.
Even a program that does a lot of I/O can be constrained by the CPU. A computation-heavy method might consume little execution time compared to the program’s I/O operations, thus having little effect on latency — but it might consume most of the program’s CPU cycles, thus affecting throughput. Identifying and optimizing such methods will reduce CPU consumption and improve the program’s throughput — but in order to do so, we need to profile CPU time rather than execution time.
For example, consider a program,
HttpRequests
, with two threads, each performing HTTP requests. One thread runs a
tenFastRequests
method that makes ten requests, sequentially, to an HTTP endpoint that responds in 10ms; the other runs a
oneSlowRequest
method that makes a single request to an endpoint that responds in 100ms. The average latency of both methods should be about the same, and so the total time spent executing them should be about the same.
We can record a stream of execution-time profiling events like so:
You can find the program on
GitHub
. Be aware that it requires the server instance to run alongside; start it via
java HttpRequests server
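For instance, something along these lines (a sketch using standard -XX:StartFlightRecording options; the exact invocation may differ):
$ java -XX:StartFlightRecording:settings=profile.jfc,filename=profile.jfr HttpRequests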
At fixed time intervals, JFR records
ExecutionSample
events into the file
profile.jfr
. Each event captures the stack trace of a thread running Java code, thus recording all of the methods currently running on that thread. (The file
profile.jfc
is a JFR configuration file, included in the JDK, that configures the JFR events needed for an execution-time profile.)
We can generate a textual profile from the recorded event stream by using the
jfr
tool included in the JDK:
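The hot-methods view is presumably the one used for the execution-time profile here (a sketch; the exact view may differ):
$ jfr view hot-methods profile.jfr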
Here we see that the
oneSlowRequest
and
tenFastRequests
methods take a similar amount of execution time, as we expect.
However, we also expect
tenFastRequests
to take more CPU time than
oneSlowRequest
, since ten rounds of creating requests and processing responses requires more CPU cycles than just one round. If these methods were run concurrently on many threads then the program could become CPU-bound, yet an execution-time profile would still show most of the program’s time being spent waiting for socket I/O. If we could profile CPU time then we could see that optimizing
tenFastRequests
, rather than
oneSlowRequest
, could improve the program’s throughput.
Additionally, the JEP points to a tiny but important problem: the handling of failed samples. Sampling might fail for many reasons, be it that the sampled thread is not in the correct state, that the stack walking failed due to missing information, or others. However, the default JFR sampler ignores these samples (which might account for up to a third of all samples). This doesn’t make interpreting the “execution-time” profiles any easier.
CPU-time profiling
As shown in the video above, sampling every thread every n milliseconds of CPU time improves the situation. Now, the number of samples for every thread is directly related to the time it spends on the CPU without any subsampling, as the number of hardware threads bounds the number of sampled threads.
The ability to accurately and precisely measure CPU-cycle consumption was added to the Linux kernel in version 2.6.12 via a timer that emits signals at fixed intervals of CPU time rather than fixed intervals of elapsed real time. Most profilers on Linux use this mechanism to produce CPU-time profiles.
Some popular third-party Java tools, including
async-profiler
, use Linux’s CPU timer to produce CPU-time profiles of Java programs. However, to do so, such tools interact with the Java runtime through unsupported internal interfaces. This is inherently unsafe and can lead to process crashes.
We should enhance JFR to use the Linux kernel’s CPU timer to safely produce CPU-time profiles of Java programs. This would help the many developers who deploy Java applications on Linux to make those applications more efficient.
Please be aware that I don’t discourage using async-profiler. It’s a potent tool and is used by many people. But it is inherently hampered by not being embedded into the JDK. This is especially true with the new stackwalking at safepoints (see
Taming the Bias: Unbiased* Safepoint-Based Stack Walking in JFR
), making the current JFR sampler safer to use. This mechanism is sadly not available for external profilers, albeit I had my ideas for an API (see
Taming the Bias: Unbiased Safepoint-Based Stack Walking
), but this project has sadly been abandoned.
Let’s continue with the example from before.
JFR will use Linux’s CPU-timer mechanism to sample the stack of every thread running Java code at fixed intervals of CPU time. Each such sample is recorded in a new type of event,
jdk.CPUTimeSample
. This event is not enabled by default.
This event is similar to the existing
jdk.ExecutionSample
event for execution-time sampling. Enabling CPU-time events does not affect execution-time events in any way, so the two can be collected simultaneously.
We can enable the new event in a recording started at launch like so:
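For example, something like this, passing the event setting directly on the command line (a sketch; the exact options may differ):
$ java -XX:StartFlightRecording:jdk.CPUTimeSample#enabled=true,filename=profile.jfr HttpRequests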
With the new CPU-time sampler, the flame graph makes it clear that the application spends nearly all of its CPU cycles in
tenFastRequests
:
A textual profile of the hot CPU methods, i.e., those that consume many CPU cycles in their own bodies rather than in calls to other methods, can be obtained like so:
$ jfr view cpu-time-hot-methods profile.jfr
However, in this particular example, the output is not as useful as the flame graph.
Notably, the CPU-time profiler also reports failed and missed samples, but more on that later.
Problems of the new Profiler
I pointed out all the problems in the current JFR method sampler, so I should probably point out my problems, too.
The most significant issue is platform support, or better, the lack of it: The new profiler only supports Linux for the time being. While this is probably not a problem for production profiling, as most systems use Linux anyway, it’s a problem for profiling on developer machines. Most development happens on Windows and Mac OS machines. So, not being able to use the same profiler as in production hampers productivity. But this is a problem for other profilers too. Async-profiler, for example, only supports wall-clock profiling on Mac OS and doesn’t support Windows at all. JetBrains has a closed-source version of async-profiler that might support cpu-time profiling on Windows (see
GitHub issue
). Still, I could not confirm as I don’t have a Windows machine and found no specific information online.
Another issue, of course, is that the profiler barely got in at the last minute, after Nicolai Parlog, for example, filmed his
Java 25 update video
.
Most users only use and get access to LTS versions of the JDK, so we wanted to get the feature into the LTS JDK 25 to allow people to experiment with it. To quote Markus Grönlund:
I am approving this PR for the following reasons:
We have reached a state that is “good enough” – I no longer see any fundamental design issues that can not be handled by follow-up bug fixes.
There are still many vague aspects included with this PR, as many has already pointed out, mostly related to the memory model and thread interactions – all those can, and should, be clarified, explained and exacted post-integration.
The feature as a whole is experimental and turned off by default.
Today is the penultimate day before JDK 25 cutoff. To give the feature a fair chance for making JDK25, it needs approval now.
Thanks a lot Johannes and all involved for your hard work getting this feature ready.
So, use the profiler with care. None of the currently known issues should break the JVM. But there are three important follow-up issues to the merged profiler:
I have already started work on the last issue and will be looking into the other two soon. Please test the profiler yourself and report all the issues you find.
The new CPUTimeSample Event
Where the old profiler had two events
jdk.ExecutionSample
and
jdk.NativeMethodSample
, the new profiler has only one, for simplicity: it does not treat threads running native code differently from threads running Java code. As stated before, this event is called
jdk.CPUTimeSample
.
The event has five different fields:
- stackTrace (nullable): the recorded stack trace
- eventThread: the sampled thread
- failed (boolean): did the sampler fail to walk the stack trace? Implies that stackTrace is null
- samplingPeriod: the actual sampling period, directly computed in the signal handler; more on that next week
Internally, the profiler uses bounded queues, which might overflow; this can result in lost samples. The number of lost samples is regularly recorded in the form of the
jdk.CPUTimeSampleLoss
event. The event has two fields:
- lostSamples: the number of samples lost since the last jdk.CPUTimeSampleLoss event
- eventThread: the thread for which the samples were lost
Together, the two events give a pretty good view of the program’s execution, including a fairly exact picture of the CPU time used.
Configuration of the CPU-time Profiler
The emission of the current sampler’s two events is controlled via the
period
property, which lets the user configure the sampling interval. The problem with the CPU-time profiler is that it might produce too many events, depending on the number of hardware threads. This is why the
jdk.CPUTimeSample
event is controlled via the
throttle
setting. This setting can be either a sampling interval or an upper bound for the number of emitted events.
When setting an interval directly like “10ms” (as in the
default.jfc
), we sample every thread every 10ms of CPU time. This results in at most 100 * #[hardware threads] events per second: on a machine with 10 hardware threads, at most 1,000 events per second (when every thread is CPU-bound), or 12,800 on a machine with 128 hardware threads.
Setting, on the other hand,
throttle
to a rate like “500/s” (as in the
profile.jfc
), limits the number of events per second to a fixed rate. This is implemented by choosing a sampling interval proportional to the number of hardware threads. For a rate of “500/s” on a machine with ten hardware threads, the interval would be 20ms; on a machine with 128 hardware threads, it would be 256ms (0.256s).
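In a .jfc settings file, the corresponding event block might look roughly like this (a sketch following the usual JFC event-settings layout; the exact values shipped in default.jfc and profile.jfc may differ):
<event name="jdk.CPUTimeSample">
  <setting name="enabled">true</setting>
  <setting name="throttle">500/s</setting>
</event>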
In addition to the two new events, there are two new views that you can use via
jfr view VIEW_NAME profile.jfr
:
cpu-time-hot-methods
shows you a list of the 25 most executed methods. These are methods that are on top of the stack the most (running the
example
with a 1ms throttle):
The second view is
cpu-time-statistics
which gives you the number of successful samples, failed samples, biased samples, total samples, and lost samples:
CPU Time Sample Statistics
--------------------------
Successful Samples: 48
Failed Samples: 0
Biased Samples: 0
Total Samples: 48
Lost Samples: 14
All of the lost samples are caused by the sampled Java thread running VM-internal code. This view is really helpful for checking whether the profile captures the whole picture.
Conclusion
Getting this new profiler into JDK 25 was a real push, but I think it was worth it. OpenJDK now has a built-in CPU-time profiler that also records missed samples. The implementation builds upon JFR’s new cooperative sampling approach, which itself landed in JDK 25 only days earlier. CPU-time profiling has many advantages, especially when you’re interested in the code that is actually wasting your CPU.
This is the first of a two-part series on the new profiler. You can expect a deep dive into the implementation of the profiler next week.
This blog post is part of my work in the
SapMachine
team at
SAP
, making profiling easier for everyone.
P.S.: I submitted to a few conferences the talk
From Idea to JEP: An OpenJDK Developer’s Journey to Improve Profiling
with the following description:
Have you ever wondered how profiling, like JFR, works in OpenJDK and how we can improve it? In this talk, I’ll take you on my three-year journey to improve profiling, especially method sampling, with OpenJDK: from the initial ideas and problems of existing approaches to my different draft implementations and JEP versions, with all the setbacks and friends I made along the way. It’s a story of blood, sweat, and C++.
It has sadly not been accepted yet.
Johannes Bechberger is a JVM developer working on profilers and their underlying technology in the SapMachine team at SAP. This includes improvements to async-profiler and its ecosystem, a website to view the different JFR event types, and improvements to the FirefoxProfiler, making it usable in the Java world. He started at SAP in 2022 after two years of research studies at the KIT in the field of Java security analyses. His work today comprises many open-source contributions, his blog, where he writes regularly on in-depth profiling and debugging topics, and work on his JEP Candidate 435 to add a new profiling API to the OpenJDK.
New posts like these come out at least every two weeks. To get notified about new posts, follow me on
BlueSky
,
Twitter
,
Mastodon
, or
LinkedIn
, or join the newsletter:
New Bill Would Give Marco Rubio “Thought Police” Power to Revoke U.S. Passports
Intercept
theintercept.com
2025-09-13 09:00:00
Rubio has already sought to punish immigrants for speech. New legislation might let him do it for U.S. citizens.
In March,
Secretary of State Marco Rubio stripped Turkish doctoral student Rümeysa Öztürk of her visa based on what a court later found was nothing more than her
opinion piece
critical of Israel.
Now, a bill introduced by the chair of the House Foreign Affairs Committee is ringing alarm bells for civil liberties advocates who say it would grant Rubio the power to revoke the passports of American citizens on similar grounds.
The provision, sponsored by Rep. Brian Mast, R-Fla., as part of a larger State Department reorganization, is
set for a hearing
Wednesday.
Mast’s legislation says that it takes aim at “terrorists and traffickers,” but critics say it could be used to deny American citizens the right to travel based solely on their speech. (The State Department said it doesn’t comment on pending legislation.)
“Rubio has claimed the power to designate people terrorist supporters based solely on what they think.”
Seth Stern, the director of advocacy at Freedom of the Press Foundation, said the bill would open the door to “thought policing at the hands of one individual.”
“Marco Rubio has claimed the power to designate people terrorist supporters based solely on what they think and say,” Stern said, “even if what they say doesn’t include a word about a terrorist organization or terrorism.”
Mast’s new bill claims to target a narrow set of people. One section grants the secretary of state the power to revoke or refuse to issue passports for people who have been convicted of, or merely charged with, material support for terrorism. (Mast’s office did not respond to a request for comment.)
Kia Hamadanchy, a senior policy counsel at the American Civil Liberties Union, said that language would accomplish little in practice, since terror convictions come with stiff prison sentences and pre-trial defendants are typically denied bail.
The other section sidesteps the legal process entirely. Rather, the secretary of state would be able to deny a passport to anyone whom they determine “has knowingly aided, assisted, abetted, or otherwise provided material support to an organization the Secretary has designated as a foreign terrorist organization.”
The reference to “material support” disturbed advocates who have long warned that the government can misuse statutes criminalizing “material support” for terrorists — first passed after the 1995 Oklahoma City federal building bombing and toughened after the 9/11 attacks — to punish speech.
Some of those fears have been borne out. The Supreme Court
ruled in 2010
that even offering advice about international law to designated terror groups could be classified as material support.
The government even deemed a woman who was kidnapped and forced to cook and clean for Salvadoran guerrillas a material supporter of terrorism,
in order to justify her deportation.
Since the October 7 Hamas attacks, pro-Israel lawmakers and activists have ratcheted up attempts to expand the scope and use of anti-terror laws.
The
Anti-Defamation League
and the Louis D. Brandeis Center for Human Rights Under Law
suggested in a letter
last year that Students for Justice in Palestine was providing “material support” for Hamas through its on-campus activism.
Lawmakers also tried to pass a
“nonprofit killer” bill
that would allow the Treasury secretary to strip groups of their charitable status if they are deemed a “terrorist-supporting organization.” The bill was beaten back by a coalition of nonprofit groups, most recently during the debate over the
so-called Big, Beautiful Bill.
Mast’s bill contains eerily similar language, Stern said.
“This is an angle that lawmakers on the right seem intent on pursuing — whether through last year’s nonprofit killer bill, or a bill like this,” Stern said.
The provision particularly threatens journalists, Stern said. He noted that Sen. Tom Cotton, R-Ark., in November 2023
demanded a Justice Department “national security investigation”
of The Associated Press, CNN, New York Times, and Reuters over freelance photographers’ images of the October 7 attacks.
Since taking office, Rubio has also added groups to the State Department’s list of foreign terrorist organizations at a blistering pace, focusing largely on gangs and
drug cartels
that were previously the domain of the criminal legal system.
Free Speech Exception?
There is an ostensible safety valve in Mast’s bill. Citizens would be granted the right to appeal to Rubio within 60 days of their passports being denied or revoked.
That provided little comfort to the ACLU’s Hamadanchy, who is helping rally opposition to the bill.
“Basically, you can go back to the secretary, who has already made this determination, and try to appeal. There’s no standard set. There’s nothing,” he said.
Hamadanchy said the provision granting the secretary of state discretionary power over passports appeared to be an attempt to sidestep being forced to provide evidence of legal violations.
“I can’t imagine that if somebody actually provided material support for terrorism there would be an instance where it wouldn’t be prosecuted — it just doesn’t make sense,” he said.
While the “nonprofit killer” bill drew only a smattering of opposition on the right from libertarian-minded conservatives such as Rep. Thomas Massie, R-Ky., Stern said Republicans should be just as concerned about the potential infringement of civil liberties in the passport bill.
The law, he said, would also grant nearly unchecked power to a future Democratic secretary of state.
“Lately, it appears that the right is so convinced that it will never be out of power that the idea that one day the shoe might be on the other foot doesn’t resonate,” Stern said. “What is to stop a future Democratic administration from designating an anti-abortion activist, a supporter of West Bank settlements, an anti-vaxxer to be a supporter of terrorism and target them the same way? The list is endless.”
AFRICA’S DIGITAL RESET: What the continental internet exchange means for businesses and digital freedom – Business Insider
Africa has just taken one of its boldest steps yet toward digital sovereignty. On September 1, the African Union switched on the Continental Internet Exchange (CIX) – a sweeping project designed to connect Africa’s 1.4 billion people through a shared, continent-wide digital backbone.
Far from being a “separate internet,” the CIX is a unified infrastructure that keeps African data local, reduces costs, and accelerates connectivity.
For businesses across the continent, the economic implications are profound. For global tech giants, it signals a shift in the balance of power.
Lower costs, faster growth for African enterprises
For decades, most African internet traffic was routed through Europe or the US – even when a user in Nairobi was emailing someone in Johannesburg.
This created high latency, costly bandwidth, and dependency on foreign providers.
With the launch of CIX, data between African cities now travels directly, reducing latency by up to 50 percent and slashing bandwidth costs by as much as half.
For African businesses – from fintech startups in Lagos to e-commerce platforms in Dar es Salaam – this translates into faster transactions, cheaper cloud hosting, and a more reliable digital backbone.
Analysts project that the digital economy could add $180 billion to Africa’s GDP by 2025, with CIX providing the critical infrastructure to unlock that growth.
A new playing field for African startups
The move is also about leveling the competitive landscape. Africa currently spends over US$50 billion annually on foreign-owned digital services.
By localising data and cutting costs, CIX opens the door for homegrown apps, search engines, and cloud providers that can serve Africa’s 2,000+ languages and diverse markets more effectively than Silicon Valley.
“CIX creates the kind of infrastructure that allows African innovators to compete on home turf,” said a Nairobi-based venture capitalist. “For once, we are not just consumers of Western platforms — we have the tools to build our own.”
Why big tech is worried
Global platforms like Google, Amazon, and Microsoft remain accessible, but they are no longer the default gateway to Africa’s digital economy.
Data sovereignty rules under the African Digital Protocol (ADP) mean African data stays in Africa unless intentionally exported.
That not only protects citizens from foreign surveillance but also ensures that revenues generated from digital services circulate within African economies.
For Big Tech, this represents a direct challenge to their dominance.
Digital freedom or digital fragmentation?
The promise of CIX also raises questions. Who controls the flow of African data – governments, regional regulators, or private operators?
Could sovereignty tip into censorship if political leaders use the infrastructure to tighten control over information?
For businesses, digital freedom is critical. Investors and entrepreneurs will be watching closely to see whether the system fosters openness or creates new barriers to cross-border trade.
Challenges ahead
Despite the excitement, hurdles remain. Broadband penetration in Africa is still uneven, with rural regions lagging far behind urban centers.
Political coordination across 54 AU member states is another test, as divergent regulations could slow progress.
But the momentum is undeniable. Over 200 million users migrated to CIX-connected services within its first three days – one of the fastest adoption rates in global tech history.
The bottom line
The launch of the Continental Internet Exchange is a watershed moment for Africa’s digital economy.
For businesses, it promises lower costs, faster services, and a stronger foundation for innovation.
For consumers, it offers the prospect of cheaper internet and greater data security.
For policymakers, it represents both an opportunity and a responsibility: to ensure that Africa’s march toward digital sovereignty does not come at the expense of digital freedom. By 2027, when CIX aims for full coverage, Africa may no longer be at the periphery of the global internet economy. Instead, it could emerge as one of its most dynamic frontiers.
Linux 6.17 Fix Lands To Address Regression With "Serious Breakage" In Hibernation
This week's round of power management fixes for the in-development
Linux 6.17
kernel is on the more notable side, with fixes for both the AMD and Intel P-State drivers plus a fix for a system hibernation issue that could lead to "serious breakage" and stems from a Linux 6.16 regression.
Intel engineer and power management subsystem maintainer Rafael Wysocki kicked off this week's power management pull request by noting a fix for a "nasty hibernation regression introduced during the 6.16 cycle."
The fix
elaborates on that nasty regression and ends up being a one-liner to resolve. Wysocki explained in that commit:
"Commit 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence") incorrectly removed a pm_restrict_gfp_mask() call from hibernation_snapshot(), so memory allocations involving swap are not prevented from being carried out in this code path any more which may lead to serious breakage.
The symptoms of such breakage have become visible after adding a shrink_shmem_memory() call to hibernation_snapshot() in commit 2640e819474f ("PM: hibernate: shrink shmem pages after dev_pm_ops.prepare()") which caused this problem to be much more likely to manifest itself.
However, since commit 2640e819474f was initially present in the DRM tree that did not include commit 12ffc3b1513e, the symptoms of this issue were not visible until merge commit 260f6f4fda93 ("Merge tag 'drm-next-2025-07-30' of https://gitlab.freedesktop.org/drm/kernel") that exposed it through an entirely reasonable merge conflict resolution."
The issue was brought to light a few days ago in
a bug report
:
"The issue here is that as of 6.17.0-rc1, running hibernate (disk) more than 7 times causes instability on most machines. The hibernate can be run with /sys/power/disk set to any value. The issue is the hibernate image itself becoming corrupted. The instability appears in user space as the timeout and failure of any or all of these commands:
The system cannot be soft shutdown or rebooted, it has to be power cycled. I believe the init process memory itself is corrupted and thus anything that goes through the init process times out."
In addition to fixing that hibernation regression, there are also a few fixes for the Intel and AMD P-State CPU frequency scaling drivers:
- Fix setting of CPPC.min_perf in the active mode with performance governor in the amd-pstate driver to restore its expected behavior changed recently (Gautham Shenoy)
- Avoid mistakenly setting EPP to 0 in the amd-pstate driver after system resume as a result of recent code changes (Mario Limonciello)
Those fixes in
the pull request
were merged on Thursday ahead of the Linux 6.17-rc6 release coming on Sunday.
Suppose that you have a long string and you want to insert line breaks every 72 characters. You might need to do this if you need to write a public cryptographic key to a text file.
A simple C function ought to suffice. I use the letter K to indicate the length of the lines. I copy from an input buffer to an output buffer.
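A straightforward version might look like the following sketch (the function name, signature, and the handling of the trailing partial line are illustrative choices, not necessarily the post’s exact code):
#include <stddef.h>
#include <string.h>

/* Copy `len` bytes from `in` to `out`, inserting a '\n' after every K bytes.
   Returns the number of bytes written; the caller must provide an output
   buffer with room for at least len + len / K bytes. */
size_t insert_line_breaks(const char *in, size_t len, char *out, size_t K) {
  char *const start = out;
  while (len > K) {
    memcpy(out, in, K); /* the compiler turns this into a few wide loads and stores */
    in += K;
    out += K;
    len -= K;
    *out++ = '\n';
  }
  memcpy(out, in, len); /* trailing partial line, no newline appended */
  return (size_t)(out + len - start);
}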
The
memcpy
function is likely to be turned into just a few instructions. For example, if you compile for a recent AMD processor (Zen 5), it might generate only two instructions (two
vmovups
) when the length of the lines (K) is 64.
Can we do better?
In general, I expect that you cannot do much better than using the
memcpy
function. Compilers are simply great at optimizing it.
Yet it might be interesting to explore whether deliberate use of SIMD instructions could optimize this code. SIMD (Single Instruction, Multiple Data) instructions process multiple data elements simultaneously with a single instruction: the
memcpy
function automatically uses them. We can use SIMD instructions through intrinsic functions, which are compiler-supplied interfaces that enable direct access to processor-specific instructions while preserving high-level code readability.
Let me focus on AVX2, the instruction set supported by effectively all x64 (Intel and AMD) processors. We can load 32-byte registers and write 32-byte registers. Thus we need a function that takes a 32-byte register and inserts a line-feed character at some location (
N
) in it. For cases where
N
is less than 16, the function shifts the input vector right by one byte to align the data correctly, using
_mm256_alignr_epi8
and
_mm256_blend_epi32
, before applying the shuffle mask and inserting the newline. When
N
is 16 or greater, it directly uses a shuffle mask from the precomputed
shuffle_masks
array to reorder the input bytes and insert the newline, leveraging a comparison with `0x80` to identify the insertion point and blending the result with a vector of newline characters for efficient parallel processing.
Can we go faster by using such a fancy function? Let us test it out.
I wrote a benchmark
. I use a large input string on an Intel Ice Lake processor with GCC 12.
method                    speed       instructions
character-by-character    1.0 GB/s    8.0 ins/byte
memcpy                    11 GB/s     0.46 ins/byte
AVX2                      16 GB/s     0.52 ins/byte
The handcrafted AVX2 approach is faster in my tests than the
memcpy
approach despite using more instructions. However, the handcrafted AVX2 approach stores data to memory using fewer instructions.
If you're like me, you probably also have many scripts lying around which look like this:
require 'csv'
require 'date'

# `jira` is assumed to be some client object (not shown here) whose #metrics
# method returns rows of metrics for a given day.
CSV.open('jira.csv', 'w') do |csv|
[2024, 2025].each do |year|
(1..365).each do |day|
date = Date.ordinal(year, day)
break if date > Date.today
jira.metrics(date.strftime('%Y-%m-%d')).each do |result|
csv << result
end
end
end
end
Collecting massive amounts of data from various sources across your company's ecosystem. Sources such as
Jira
,
GitHub
,
Google Drive
, and
Confluence
.
This allows you to have your probes in place so that you can make data-driven decisions. Decisions to help you save costs, improve team performance, and optimize processes.
But just like me, you probably also don't have a good solution for where to store all that data.
You might have been using
CSV
files,
JSON
files, or spreadsheets. You might have later upgraded to using a database like
MySQL
or
PostgreSQL
. And if you're lucky, you might have even been using a data warehouse like
Snowflake
or
BigQuery
.
But none of these solutions work well for busy engineering managers. Files are clumsy to manage, databases can be rigid and require maintenance, and data warehouses are expensive and complex.
Until now!
DuckLake
gives you your own data lake which you can carry in your pocket. In essence, it is a data lake specification which uses
Parquet
files as the storage format and a database to store its metadata. For a more detailed (and accurate explanation), you should watch
Hannes Mühleisen
and
Mark Raasveldt
introducing
DuckLake
.
Nothing magical about it, which is another reason I love it so much. It's
simple
,
lightweight
, and
easy to use
.
Now we’re in
DuckDB
but we still need to install the DuckLake extension, which can be done as follows:
INSTALL ducklake;
And to start using it:
ATTACH 'ducklake:metadata.ducklake' AS ducklake;
USE ducklake;
And we’re off to the races 🏇
First thing you want to do is to start ingesting data into your personal data lake. For the sake of the example let’s assume that you have some data you scraped from Jira which is saved in a CSV file called
jira_2024_2025.csv
.
Let’s create a table called jira and ingest the data into it.
CREATE TABLE ducklake.jira AS
SELECT * FROM read_csv_auto('jira_2024_2025.csv');
We can verify that the table has been created by running:
SHOW TABLES;
And verify that the data is in place by running:
FROM ducklake.jira;
Which will do a
SELECT
and return the results.
If we look in the file system we can see that DuckDB has created a couple of files/folders. The
metadata.ducklake
file is the actual DuckDB database which contains all the necessary metadata. Next to that we have a folder called
metadata.ducklake.files
which contains all the parquet files. At this point in time you should be seeing a folder called
main
under which you will see another folder called
jira
. This represents the table we currently have in our data lake. Under this folder, you will see a single parquet file.
You realize that there is more value in also having data from 2023 so you go ahead and generate the CSV. Now you need to import that as well. In order to do so you can do the following:
INSERT INTO ducklake.jira
SELECT * FROM read_csv_auto('jira_2023.csv');
Now when we look in the file system, you should see a 2nd parquet file appear under
metadata.ducklake.files/main/jira
.
As an engineering manager, we’ve been collecting all this data in order to be able to uncover issues and tell a story. This means that we do need to get this data into our favorite data visualization tool (
Tableau
,
Power BI
or
Livebook
). Since
DuckLake
is a relatively new technology, there aren’t that many adapters for it yet. But in our example, since we use
DuckDB
under the hood we can export the data in any format we’d like. I prefer to use
Parquet
again since it’s so lightweight and all DataViz tools I’ve mentioned above have connectors for it.
We can export a table like so:
COPY ducklake.jira TO 'jira.parquet' (FORMAT parquet);
This will create a new file called
jira.parquet
on the file system which we can point to with our favorite DataViz tool of preference.
DuckLake makes it incredibly easy for engineering managers to collect, store, and analyze data without the usual headaches. You don't need to rely on the cloud. Everything can live locally on your machine, making it both private and portable. If you ever want to scale up, you can simply move your Parquet files to blob storage and host the metadata database on a server, giving you flexibility as your needs grow (which I doubt you will need anytime soon).
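As a sketch of what that scale-up might look like (the Postgres connection string and bucket name here are placeholders, and it assumes the required S3 credentials and extensions are configured; check the DuckLake docs for the exact options), the attach step might become:
ATTACH 'ducklake:postgres:dbname=ducklake_catalog host=db.example.internal' AS ducklake (DATA_PATH 's3://my-data-lake/');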
DuckLake is super lightweight, fast, and efficient. You can store years of data in just a few megabytes, and querying or exporting your data is a breeze. With support for open formats like Parquet and seamless integration with popular data visualization tools, you get all the power of a modern data lake without the complexity or cost. For busy engineering managers who want actionable insights without the overhead, DuckLake is a game changer.
Nepal picks a new prime minister on a Discord server days after social media ban
The operational primitives of deep learning, primarily matrix multiplication and convolution, exist
as a fragmented landscape of highly specialized tools. This paper introduces the Generalized Windowed
Operation (GWO), a theoretical framework that unifies these operations by decomposing them into three
orthogonal components: Path, defining operational locality; Shape, defining geometric structure and
underlying symmetry assumptions; and Weight, defining feature importance.
We elevate this framework to a predictive theory grounded in two fundamental principles. First, we
introduce the Principle of Structural Alignment, which posits that optimal generalization is achieved
when the GWO’s (P, S, W) configuration mirrors the data’s intrinsic structure. Second, we show that
this principle is a direct consequence of the Information Bottleneck (IB) principle. To formalize
this, we define an Operational Complexity metric based on Kolmogorov complexity. However, we
move beyond the simplistic view that lower complexity is always better. We argue that the nature of
this complexity—whether it contributes to brute-force capacity or to adaptive regularization—is
the true determinant of generalization. Our theory predicts that a GWO whose complexity is utilized to
adaptively align with data structure will achieve a superior generalization bound. Canonical operations
and their modern variants emerge as optimal solutions to the IB objective, and our experiments reveal that
the quality, not just the quantity, of an operation’s complexity governs its performance. The GWO theory
thus provides a grammar for creating neural operations and a principled pathway from data properties
to generalizable architecture design.
Files: Window_is_everythingwfwf.pdf (971.2 kB)
Social media promised connection, but it has delivered exhaustion
James O’Sullivan lectures in the School of English and Digital Humanities at University College Cork, where his work explores the intersection of technology and culture.
At first glance, the feed looks familiar, a seamless carousel of “For You” updates gliding beneath your thumb. But déjà‑vu sets in as 10 posts from 10 different accounts carry the same stock portrait and the same breathless promise — “click here for free pics” or “here is the one productivity hack you need in 2025.” Swipe again and three near‑identical replies appear, each from a pout‑filtered avatar directing you to “free pics.” Between them sits an ad for a cash‑back crypto card.
Scroll further and recycled TikTok clips with “original audio” bleed into Reels on Facebook and Instagram; AI‑stitched football highlights showcase players’ limbs bending like marionettes. Refresh once more, and the woman who enjoys your snaps of sushi rolls has seemingly spawned five clones.
Whatever remains of genuine, human content is increasingly sidelined by algorithmic prioritization, receiving fewer interactions than the engineered content and AI slop optimized solely for clicks.
These are the last days of social media as we know it.
Drowning The Real
Social media was built on the romance of authenticity. Early platforms sold themselves as conduits for genuine connection: stuff you wanted to see, like your friend’s wedding and your cousin’s dog.
Even influencer culture, for all its artifice, promised that behind the ring‑light stood an actual person. But the attention economy, and more recently, the generative AI-fueled late attention economy, have broken whatever social contract underpinned that illusion. The feed no longer feels crowded with people but crowded with content. At this point, it has far less to do with people than with consumers and consumption.
In recent years, Facebook and other platforms that facilitate billions of daily interactions have slowly morphed into the internet’s largest repositories of
AI‑generated spam
. Research has found what users plainly see: tens of thousands of machine‑written posts
now flood
public groups — pushing scams, chasing clicks — with
clickbait
headlines, half‑coherent listicles and hazy lifestyle images stitched together in AI tools like Midjourney.
It’s all just vapid, empty shit produced for engagement’s sake. Facebook is “sloshing” in low-effort AI-generated posts, as Arwa Mahdawi
notes
in The Guardian; some even bolstered by algorithmic boosts, like “
Shrimp Jesus
.”
The difference between human and synthetic content is becoming increasingly indistinguishable, and platforms seem unable, or uninterested, in trying to police it. Earlier this year, CEO Steve Huffman pledged to “
keep Reddit human
,” a tacit admission that floodwaters were already lapping at the last high ground. TikTok, meanwhile,
swarms
with AI narrators presenting concocted news reports and
“what‑if” histories
. A few creators do append labels disclaiming that their videos depict “no real events,” but many creators don’t bother, and many consumers don’t seem to care.
The problem is not just the rise of fake material, but the collapse of context and the acceptance that truth no longer matters as long as our cravings for colors and noise are satisfied. Contemporary social media content is more often rootless, detached from cultural memory, interpersonal exchange or shared conversation. It arrives fully formed, optimized for attention rather than meaning, producing a kind of semantic sludge, posts that look like language yet say almost nothing.
We’re drowning in this nothingness.
The Bot-Girl Economy
If spam (AI or otherwise) is the white noise of the modern timeline, its dominant melody is a different form of automation: the hyper‑optimized, sex‑adjacent human avatar. She appears everywhere, replying to trending tweets with selfies, promising “funny memes in bio” and linking, inevitably, to OnlyFans or one of its proxies. Sometimes she is real. Sometimes she is not. Sometimes she is a he, sitting in a
compound in Myanmar
. Increasingly, it makes no difference.
This convergence of bots, scammers, brand-funnels and soft‑core marketing underpins what might be called
the bot-girl economy
, a parasocial marketplace
fueled
in a large part by economic precarity. At its core is a transactional logic: Attention is scarce, intimacy is monetizable and platforms generally won’t intervene so long as engagement
stays high
. As more women now turn to online sex work, lots of men are eager to pay them for their services. And as these workers try to cope with the precarity imposed by platform metrics and competition, some can spiral, forever downward, into a transactional attention-to-intimacy logic that eventually turns them into more bot than human. To hold attention, some creators increasingly opt to behave like algorithms themselves,
automating
replies, optimizing content for engagement, or mimicking affection at scale. The distinction between performance and intention must surely erode as real people perform as synthetic avatars and synthetic avatars mimic real women.
There is loneliness, desperation and predation everywhere.
“Genuine, human content is increasingly sidelined by algorithmic prioritization, receiving fewer interactions than the engineered content and AI slop optimized solely for clicks.”
The bot-girl is more than just a symptom; she is a proof of concept for how social media bends even aesthetics to the logic of engagement. Once, profile pictures (both real and synthetic) aspired to hyper-glamor, unreachable beauty filtered through fantasy. But that fantasy began to underperform as average men sensed the ruse, recognizing that supermodels typically don’t send them DMs. And so, the system adapted, surfacing profiles that felt more plausible, more emotionally available. Today’s avatars project a curated accessibility: They’re attractive but not flawless, styled to suggest they might genuinely be interested in you. It’s a calibrated effect, just human enough to convey plausibility, just artificial enough to scale. She has to look more human to stay afloat, but act more bot to keep up. Nearly everything is socially engineered for maximum interaction: the like, the comment, the click, the private message.
Once seen as the fringe economy of cam sites, OnlyFans has become the dominant digital marketplace for sex workers. In 2023, the then-seven-year-old platform
generated
$6.63 billion in gross payments from fans, with $658 million in profit before tax. Its success has bled across the social web; platforms like X (formerly Twitter) now serve as de facto marketing layers for OnlyFans creators, with thousands of accounts running fan-funnel operations,
baiting
users into paid subscriptions.
The tools of seduction are also changing. One 2024 study
estimated
that thousands of X accounts use AI to generate fake profile photos. Many content creators have also
begun using AI
for talking-head videos,
synthetic voices
or endlessly varied selfies. Content is likely A/B tested for click-through rates. Bios are written with conversion in mind. DMs are automated or
outsourced
to AI impersonators. For users, the effect is a strange hybrid of influencer, chatbot and parasitic marketing loop. One minute you’re arguing politics, the next, you’re being pitched a girlfriend experience by a bot.
Engagement In Freefall
While content proliferates, engagement is evaporating. Average interaction rates across major platforms are declining fast: Facebook and X posts now scrape an average 0.15% engagement, while Instagram has dropped 24% year-on-year. Even TikTok has
begun to plateau
. People aren’t connecting or conversing on social media like they used to; they’re just wading through slop, that is, low-effort, low-quality content produced at scale, often with AI, for engagement.
And much of it
is
slop: Less than half of American adults
now rate
the information they see on social media as “mostly reliable”— down from roughly two-thirds in the mid-2010s. Young adults register the steepest collapse, which is unsurprising; as digital natives, they better understand that the content they scroll upon wasn’t necessarily produced by humans. And yet, they continue to scroll.
The timeline is no longer a source of information or social presence, but more of a mood-regulation device, endlessly replenishing itself with just enough novelty to suppress the anxiety of stopping. Scrolling has become a form of ambient dissociation, half-conscious, half-compulsive, closer to scratching an itch than seeking anything in particular. People know the feed is fake, they just don’t care.
Platforms have little incentive to stem the tide. Synthetic accounts are cheap, tireless and lucrative because they never demand wages or unionize. Systems designed to surface peer-to-peer engagement are now systematically filtering out such activity, because what counts as engagement has changed. Engagement is now about raw user attention – time spent, impressions, scroll velocity – and the net effect is an online world in which you are constantly being addressed but never truly spoken to.
The Great Unbundling
Social media’s death rattle will not be a bang but a shrug.
These networks once promised a single interface for the whole of online life: Facebook as social hub, Twitter as news‑wire, YouTube as broadcaster, Instagram as photo album, TikTok as distraction engine. Growth appeared inexorable. But now, the model is splintering, and users are drifting toward smaller, slower, more private spaces, like group chats, Discord servers and
federated microblogs
— a billion little gardens.
Since Elon Musk’s takeover, X has
shed
at least 15% of its global user base. Meta’s Threads, launched with great fanfare in 2023, saw its number of daily active users collapse within a month,
falling
from around 50 million active Android users at launch in July to only 10 million the following month. Twitch
recorded
its lowest monthly watch-time in over four years in December 2024, just 1.58 billion hours, 11% lower than the December average from 2020-23.
“While content proliferates, engagement is evaporating.”
Even the giants that still command vast audiences are no longer growing exponentially. Many platforms have already died (Vine, Google+, Yik Yak), are functionally dead or zombified (Tumblr, Ello), or have been revived and died again (MySpace, Bebo). Some notable exceptions aside, like Reddit and BlueSky (though it’s still early days for the latter), growth has plateaued across the board. While social media adoption continues to rise overall, it’s no longer explosive.
As of early 2025
, around 5.3 billion user identities — roughly 65% of the global population — are on social platforms, but annual growth has decelerated to just 4-5%, a steep drop from the double-digit surges seen earlier in the 2010s.
Intentional, opt-in micro‑communities are rising in their place — like Patreon collectives and Substack newsletters — where creators chase depth over scale, retention over virality. A writer with 10,000 devoted subscribers can potentially earn more and burn out less than one with a million passive followers on Instagram.
But the old practices are still evident: Substack is full of personal brands announcing their journeys, Discord servers host influencers disguised as community leaders and Patreon bios promise exclusive access that is often just recycled content. Still, something has shifted. These are not mass arenas; they are clubs — opt-in spaces with boundaries, where people remember who you are. And they are often paywalled, or at least heavily moderated, which at the very least keeps the bots out. What’s being sold is less a product than a sense of proximity, and while the economics may be similar, the affective atmosphere is different, smaller, slower, more reciprocal. In these spaces, creators don’t chase virality; they cultivate trust.
Even the big platforms sense the turning tide. Instagram has begun emphasizing DMs, X is pushing subscriber‑only circles and TikTok is experimenting with private communities. Behind these developments is an implicit acknowledgement that the infinite scroll, stuffed with bots and synthetic sludge, is approaching the limit of what humans will tolerate. A lot of people
seem to be fine
with slop, but as more start to crave authenticity, the platforms will be forced to take note.
From Attention To Exhaustion
The social internet was built on attention, not only the promise to capture yours but the chance for you to capture a slice of everyone else’s. After two decades, the mechanism has inverted, replacing connection with exhaustion. “Dopamine detox” and “digital Sabbath” have entered the mainstream. In the U.S.,
a significant proportion
of 18‑ to 34‑year‑olds took deliberate breaks from social media in 2024, citing mental health as the motivation, according to an American Psychiatric Association poll. And yet, time spent on the platforms remains high — people scroll not because they enjoy it, but because they don’t know how to stop. Self-help influencers now recommend weekly “no-screen Sundays” (yes, the irony). The mark of the hipster is no longer an ill-fitting beanie but an old-school Nokia dumbphone.
Some creators are quitting, too
. Competing with synthetic performers who never sleep, they find the visibility race not merely tiring but absurd. Why post a selfie when an AI can generate a prettier one? Why craft a thought when ChatGPT can produce one faster?
These are the last days of social media, not because we lack content, but because the attention economy has neared its outer limit — we have exhausted the capacity to care. There is more to watch, read, click and react to than ever before — an endless buffet of stimulation. But novelty has become indistinguishable from noise. Every scroll brings more, and each addition subtracts meaning. We are indeed drowning. In this saturation, even the most outrageous or emotive content struggles to provoke more than a blink.
Outrage fatigues. Irony flattens. Virality cannibalizes itself. The feed no longer surprises but sedates, and in that sedation, something quietly breaks, and social media no longer feels like a place to be; it is a surface to skim.
No one is forcing anyone to go on TikTok or to consume the clickbait in their feeds. The content served to us by algorithms is, in effect, a warped mirror, reflecting and distorting our worst impulses. For younger users in particular, their scrolling of social media can
become compulsive
, rewarding
their
developing brains with unpredictable hits of dopamine that keep them glued to their screens.
Social media platforms have also achieved something more elegant than coercion: They’ve made non-participation a form of self-exile, a luxury available only to those who can afford its costs.
“Why post a selfie when an AI can generate a prettier one? Why craft a thought when ChatGPT can produce one faster?”
Our offline reality is irrevocably shaped by our online world: Consider the worker who deletes or was never on LinkedIn, excluding themselves from professional networks that increasingly exist nowhere else; or the small business owner who abandons Instagram, watching customers drift toward competitors who maintain their social media presence. The teenager who refuses TikTok may find herself unable to parse references, memes and microcultures that soon constitute her peers’ vernacular.
These platforms haven’t just captured attention, they’ve enclosed the commons where social, economic and cultural capital are exchanged. But enclosure breeds resistance, and as exhaustion sets in, alternatives begin to emerge.
Architectures Of Intention
The successor to mass social media is, as already noted, emerging not as a single platform, but as a scattering of alleyways, salons, encrypted lounges and federated town squares — those little gardens.
Maybe today’s major social media platforms will find new ways to hold the gaze of the masses, or maybe they will continue to decline in relevance, lingering like derelict shopping centers or a dying online game, haunted by bots and the echo of once‑human chatter. Occasionally we may wander back, out of habit or nostalgia, or to converse once more as a crowd, among the ruins. But as social media collapses on itself, the future points to a quieter, more fractured, more human web, something that no longer promises to be everything, everywhere, for everyone.
This is a good thing. Group chats and invite‑only circles are where context and connection survive. These are spaces defined less by scale than by shared understanding, where people no longer perform for an algorithmic audience but speak in the presence of chosen others. Messaging apps like Signal are quietly
becoming dominant
infrastructures for digital social life, not because they promise discovery, but because they don’t. In these spaces, a message often carries more meaning because it is usually directed, not broadcast.
Social media’s current logic is designed to reduce friction, to give users infinite content for instant gratification, or at the very least, the anticipation of such. The antidote to this compulsive, numbing overload will be found in
deliberative
friction, design patterns that introduce pause and reflection into digital interaction, or platforms and algorithms that create space for intention.
This isn’t about making platforms needlessly cumbersome but about distinguishing between helpful constraints and extractive ones. Consider
Are.na
, a non-profit, ad-free creative platform founded in 2014 for collecting and connecting ideas that feels like the anti-Pinterest: There’s no algorithmic feed or engagement metrics, no trending tab to fall into and no infinite scroll. The pace is glacial by social media standards. Connections between ideas must be made manually, and thus, thoughtfully — there are no algorithmic suggestions or ranked content.
To demand intention over passive, mindless screen time, X could require a 90-second delay before posting replies, not to deter participation, but to curb reactive broadcasting and engagement farming. Instagram could show how long you’ve spent scrolling before allowing uploads of posts or stories, and Facebook could display the carbon cost of its data centers, reminding users that digital actions have material consequences, with each refresh. These small added moments of friction and purposeful interruptions — what UX designers currently optimize away — are precisely what we need to break the cycle of passive consumption and restore intention to digital interaction.
We can dream of a digital future where belonging is no longer measured by follower counts or engagement rates, but rather by the development of trust and the quality of conversation. We can dream of a digital future in which communities form around shared interests and mutual care rather than algorithmic prediction. Our public squares — the big algorithmic platforms — will never be cordoned off entirely, but they might sit alongside countless semi‑public parlors where people choose their company and set their own rules, spaces that prioritize continuity over reach and coherence over chaos. People will show up not to go viral, but to be seen in context. None of this is about escaping the social internet, but about reclaiming its scale, pace, and purpose.
Governance Scaffolding
The most radical redesign of social media might be the most familiar: What if we treated these platforms as
public utilities
rather than private casinos?
A public-service model wouldn’t require state control; rather, it could be governed through civic charters, much like public broadcasters operate under mandates that balance independence and accountability. This vision stands in stark contrast to the current direction of most major platforms, which are becoming increasingly opaque.
“Non-participation [is] a form of self-exile, a luxury available only to those who can afford its costs.”
In recent years, Reddit and X, among other platforms, have either restricted or removed API access, dismantling open-data pathways. The very infrastructures that shape public discourse are retreating from public access and oversight. Imagine social media platforms with transparent algorithms subject to public audit, user representation on governance boards, revenue models based on public funding or member dues rather than surveillance advertising, mandates to serve democratic discourse rather than maximize engagement, and regular impact assessments that measure not just usage but societal effects.
Some initiatives gesture in this direction. Meta’s Oversight Board, for example, frames itself as an independent body for content moderation appeals, though its remit is narrow and its influence ultimately limited by Meta’s discretion. X’s Community Notes, meanwhile, allows user-generated fact-checks but relies on opaque scoring mechanisms and lacks formal accountability. Both are add-ons to existing platform logic rather than systemic redesigns. A true public-service model would bake accountability into the platform’s infrastructure, not just bolt it on after the fact.
The European Union has begun exploring this territory through its Digital Markets Act and Digital Services Act, but these laws, enacted in 2022, largely focus on regulating existing platforms rather than imagining new ones. In the United States, efforts are more fragmented. Proposals such as the Platform Accountability and Transparency Act (PATA) and state-level laws in California and New York aim to increase oversight of algorithmic systems, particularly where they impact youth and mental health. Still, most of these measures seek to retrofit accountability onto current platforms. What we need are spaces built from the ground up on different principles, where incentives align with human interest rather than extractive, for-profit ends.
This could take multiple forms, like municipal platforms for local civic engagement, professionally focused networks run by trade associations, and educational spaces managed by public library systems. The key is diversity, delivering an ecosystem of civic digital spaces that each serve specific communities with transparent governance.
Of course, publicly governed platforms aren’t immune to their own risks. State involvement can bring with it the threat of politicization, censorship or propaganda, and this is why the governance question must be treated as infrastructural, rather than simply institutional. Just as public broadcasters in many democracies operate under charters that insulate them from partisan interference, civic digital spaces would require independent oversight, clear ethical mandates, and democratically accountable governance boards, not centralized state control. The goal is not to build a digital ministry of truth, but to create pluralistic public utilities: platforms built for communities, governed by communities and held to standards of transparency, rights protection and civic purpose.
The technical architecture of the next social web is already emerging through federated and distributed protocols like ActivityPub (used by Mastodon and Threads) and Bluesky’s
Authenticated Transfer (AT) Protocol
, or atproto (a decentralized framework that allows users to move between platforms while keeping their identity and social graph), as well as various blockchain-based experiments,
like Lens
and
Farcaster
.
But protocols alone won’t save us. The email protocol is decentralized, yet most email flows through a handful of corporate providers. We need to “
rewild the internet
,” as Maria Farrell and Robin Berjon mentioned in a Noema essay. We need governance scaffolding, shared institutions that make decentralization viable at scale. Think credit unions for the social web that function as member-owned entities providing the infrastructure that individual users can’t maintain alone. These could offer shared moderation services that smaller instances can subscribe to, universally portable identity systems that let users move between platforms without losing their history, collective bargaining power for algorithm transparency and data rights, user data dividends for all, not just influencers (if platforms profit from our data, we should share in those profits), and algorithm choice interfaces that let users select from different recommender systems.
Bluesky’s AT Protocol explicitly allows users to port identity and social graphs, but it’s very early days and cross-protocol and platform portability remains extremely limited, if not effectively non-existent. Bluesky also allows users to choose among multiple content algorithms, an important step toward user control. But these models remain largely tied to individual platforms and developer communities. What’s still missing is a civic architecture that makes algorithmic choice universal, portable, auditable and grounded in public-interest governance rather than market dynamics alone.
Imagine being able to toggle between different ranking logics: a chronological feed, where posts appear in real time; a mutuals-first algorithm that privileges content from people who follow you back; a local context filter that surfaces posts from your geographic region or language group; a serendipity engine designed to introduce you to unfamiliar but diverse content; or even a human-curated layer, like playlists or editorials built by trusted institutions or communities. Many of these recommender models do exist, but they are rarely user-selectable, and almost never transparent or accountable. Algorithm choice shouldn’t require a hack or browser extension; it should be built into the architecture as a civic right, not a hidden setting.
“What if we treated these platforms as public utilities rather than private casinos?”
Algorithmic choice can also develop new hierarchies. If feeds can be curated like playlists, the next influencer may not be the one creating content, but editing it. Institutions, celebrities and brands will be best positioned to build and promote their own recommendation systems. For individuals, the incentive to do this curatorial work will likely depend on reputation, relational capital or ideological investment. Unless we design these systems with care, we risk reproducing old dynamics of platform power, just in a new form.
Federated platforms like Mastodon and Bluesky face
real tensions
between autonomy and safety: Without centralized moderation, harmful content can proliferate, while over-reliance on volunteer admins creates sustainability problems at scale. These networks also risk reinforcing ideological silos, as communities block or mute one another, fragmenting the very idea of a shared public square. Decentralization gives users more control, but it also raises difficult questions about governance, cohesion and collective responsibility — questions that any humane digital future will have to answer.
But there is a possible future where a user, upon opening an app, is asked how they would like to see the world on a given day. They might choose the serendipity engine for unexpected connections, the focus filter for deep reads or the local lens for community news. This is technically very achievable — the data would be the same; the algorithms would just need to be slightly tweaked — but it would require a design philosophy that treats users as citizens of a shared digital system rather than cattle. While this is possible, it can feel like a pipe dream.
To make algorithmic choice more than a thought experiment, we need to change the incentives that govern platform design. Regulation can help, but real change will come when platforms are rewarded for serving the public interest. This could mean tying tax breaks or public procurement eligibility to the implementation of transparent, user-controllable algorithms. It could mean funding research into alternative recommender systems and making those tools open-source and interoperable. Most radically, it could involve certifying platforms based on civic impact, rewarding those that prioritize user autonomy and trust over sheer engagement.
Digital Literacy As Public Health
Perhaps most crucially, we need to reframe digital literacy not as an individual responsibility but as a collective capacity. This means moving beyond spot-the-fake-news workshops to more fundamental efforts to understand how algorithms shape perception and how design patterns exploit our cognitive processes.
Some education systems are beginning to respond, embedding digital and media literacy across curricula. Researchers and educators argue that this work needs to begin in early childhood and continue through secondary education as a core competency. The goal is to equip students to critically examine the digital environments they inhabit daily, to become active participants in shaping the future of digital culture rather than passive consumers. This includes what some call algorithmic literacy: the ability to understand how recommender systems work, how content is ranked and surfaced, and how personal data is used to shape what you see — and what you don’t.
Teaching this at scale would mean treating digital literacy as public infrastructure, not just a skill set for individuals, but a form of shared civic defense. This would involve long-term investments in teacher training, curriculum design and support for public institutions, such as libraries and schools, to serve as digital literacy hubs. When we build collective capacity, we begin to lay the foundations for a digital culture grounded in understanding, context and care.
We also need behavioral safeguards like default privacy settings that protect rather than expose, mandatory cooling-off periods for viral content (deliberately slowing the spread of posts that suddenly attract high engagement), algorithmic impact assessments before major platform changes and public dashboards that show platform manipulation, that is, coordinated or deceptive behaviors that distort how content is amplified or suppressed, in real-time. If platforms are forced to disclose their engagement tactics, these tactics lose power. The ambition is to make visible hugely influential systems that currently operate in obscurity.
We need to build new digital spaces grounded in different principles, but this isn’t an either-or proposition. We also must reckon with the scale and entrenchment of existing platforms that still structure much of public life. Reforming them matters too. Systemic safeguards may not address the core incentives that inform platform design, but they can mitigate harm in the short term. The work, then, is to constrain the damage of the current system while constructing better ones in parallel, to contain what we have, even as we create what we need.
The choice isn’t between technological determinism and Luddite retreat; it’s about constructing alternatives that learn from what made major platforms usable and compelling while rejecting the extractive mechanics that turned those features into tools for exploitation. This won’t happen through individual choice, though choice helps; it also won’t happen through regulation, though regulation can really help. It will require our collective imagination to envision and build systems focused on serving human flourishing rather than harvesting human attention.
Social media as we know it is dying, but we’re not condemned to its ruins. We are capable of building better — smaller, slower, more intentional, more accountable — spaces for digital interaction, spaces where the metrics that matter aren’t engagement and growth but understanding and connection, where algorithms serve the community rather than strip-mining it.
The last days of social media might be the first days of something more human: a web that remembers why we came online in the first place — not to be harvested but to be heard, not to go viral but to find our people, not to scroll but to connect. We built these systems, and we can certainly build better ones. The question is whether we will do this or whether we will continue to drown.
UK workers wary of AI despite Starmer’s push to increase uptake, survey finds
Guardian
www.theguardian.com
2025-09-13 06:00:56
Exclusive: A third of those polled do not tell bosses about use of tools and half think AI threatens the social structure It is the work shortcut that dare not speak its name. A third of people do not tell their bosses about their use of AI tools amid fears their ability will be questioned if they d...
It is the work shortcut that dare not speak its name. A third of people do not tell their bosses about their use of AI tools amid fears their ability will be questioned if they do.
Research for the Guardian has revealed that only 13% of UK adults openly discuss their use of AI with senior staff at work and close to half think of it as a tool to help people who are not very good at their jobs to get by.
Amid widespread predictions that many workers face a fight for their jobs with AI, polling by Ipsos found that among more than 1,500 British workers aged 16 to 75, 33% said they did not discuss their use of AI to help them at work with bosses or other more senior colleagues. They were less coy with people at the same level, but a quarter of people believe “co-workers will question my ability to perform my role if I share how I use AI”.
The Guardian’s survey also uncovered deep worries about the advance of AI, with more than half of those surveyed believing it threatens the social structure. Those who believe it has a positive effect are outnumbered by those who think it does not. It also found 63% of people do not believe AI is a good substitute for human interaction, while 17% think it is.
Next week’s state visit to the UK by Donald Trump is expected to signal greater collaboration between the UK and Silicon Valley to make Britain an important centre of AI development.
The US president is expected to be joined by Sam Altman, the co-founder of OpenAI who has signed a memorandum of understanding with the UK government to explore the deployment of advanced AI models in areas including justice, security and education. Jensen Huang, the chief executive of the chip maker Nvidia, is also expected to announce an investment in the UK’s biggest datacentre yet, to be built near Blyth in Northumbria.
Keir Starmer has said he wants to “mainline AI into the veins” of the UK. Silicon Valley companies are aggressively marketing their AI systems as capable of cutting grunt work and liberating creativity.
The polling appears to reflect workers’ uncertainty about how bosses want AI tools to be used, with many employers not offering clear guidance. There is also fear of stigma among colleagues if workers are seen to rely too heavily on the bots.
A separate US study circulated this week found that medical doctors who use AI in decision-making are viewed by their peers as significantly less capable. Ironically, the doctors who took part in the research by Johns Hopkins Carey Business School recognised AI as beneficial for enhancing precision, but took a negative view when others were using it.
Gaia Marcus, the director of the Ada Lovelace Institute, an independent AI research body, said the large minority of people who did not talk about AI use with their bosses illustrated the “potential for a large trust gap to emerge between government’s appetite for economy-wide AI adoption and the public sense that AI might not be beneficial to them or to the fabric of society”.
“We need more evaluation of the impact of using these tools, not just in the lab but in people’s everyday lives and workflows,” she said. “To my knowledge, we haven’t seen any compelling evidence that the spread of these generative AI tools is significantly increasing productivity yet. Everything we are seeing suggests the need for humans to remain in the driving seat with the tools we use.”
A study by the Henley Business School in May found 49% of workers reported there were no formal guidelines for AI use in their workplace and more than a quarter felt their employer did not offer enough support.
Prof Keiichi Nakata at the school said people were more comfortable about being transparent in their use of AI than 12 months earlier but “there are still some elements of AI shaming and some stigma associated with AI”.
He said: “Psychologically, if you are confident with your work and your expertise you can confidently talk about your engagement with AI, whereas if you feel it might be doing a better job than you are or you feel that you will be judged as not good enough or worse than AI, you might try to hide that or avoid talking about it.”
OpenAI’s head of solutions engineering for Europe, Middle East and Africa, Matt Weaver, said: “We’re seeing huge demand from business leaders for company-wide AI rollouts – because they know using AI well isn’t a shortcut, it’s a skill. Leaders see the gains in productivity and knowledge sharing and want to make that available to everyone.”
SkiftOS: A hobby OS built from scratch using C/C++ for ARM, x86, and RISC-V
Raspberry Pi Synthesizers · Source: Korg, Raspberry Pi
The Raspberry Pi microcomputer is finding its way into more and more synthesizers. Do your synths have a slice of it inside? Read on to find out.
Digital synthesizers are essentially computers in specialized housings. Rather than a keyboard with letters and numbers, their keyboards trigger notes. Custom-designed DSP (digital signal processing) systems can be expensive, so some manufacturers are turning to ready-made computing systems to run their synths. One that’s been gaining in popularity in recent years is Raspberry Pi. The low-cost mini computer is now in instruments by Korg, Erica Synths and many more.
Is this cheating? Do any of your synths have Pi inside? Let’s find out.
DSP In Synthesizers
Digital synthesizers have existed in some form since the 1970s, with the New England Digital Synclavier being the first commercial release in 1977. As synthesizers became more powerful, adding sampling and physical modelling to the already existing FM synthesis, the DSP required to run them became more complex. Additions like sequencers and effects only compounded the expense.
New England Digital Synclavier II · Source: Synclavier
To run their DSP, manufacturers created custom DSP systems running on off-the-shelf chips from companies like Motorola and Texas Instruments. One example was Korg’s Pentium-based OASYS workstation from 2005. While incredibly powerful, it was also incredibly expensive.
How to keep the power while also lowering the cost?
Raspberry Pi: What Is It?
The solution for Korg – as well as other manufacturers, as we’ll see – was the Raspberry Pi. Essentially a complete computer processor in a small and – critically – inexpensive package, this programmable hardware gets used for all sorts of applications. From robotics to home computing to (you guessed it) digital synthesizers, ready-made Raspberry Pis offer an elegant and affordable solution for custom computing systems.
Raspberry Pi · Source: Raspberry Pi
Korg Serves Up Some Pi
The biggest synthesizer manufacturer to make use of the Raspberry Pi is Korg. The Japanese synth company’s Wavestate, Modwave and Opsix digital synths all make use of the Raspberry Pi Compute Module. (They’re in the module versions too.)
Korg Modules · Source: Korg
In an article on the Raspberry Pi home page, Korg’s Andy Leary cites price and manufacturing scale as the main reasons Korg decided on these components. He also liked that it was ready to go as is, providing CPU, RAM and storage in a single package. “That part of the work is already done,” he said in the article. “It’s like any other component; we don’t have to lay out the board, build it and test it.”
The software for each instrument is, of course, custom. The Raspberry Pi, however, generates the sound. “Not everyone understands that Raspberry Pi is actually making the sound,” said Korg’s Dan Philips in the same piece. “We use the CM3 because it’s very powerful, which makes it possible to create deep, compelling instruments.”
You might not expect to find a Raspberry Pi inside an analogue synthesizer, but if that synth happens to have digital functionality… Take the Bullfrog, for example. Erica Synths and Richie Hawtin’s educational desktop analogue has an RP2040 to handle MIDI implementation as well as functionality for the Sampler/Looper voice card. This adds additional functionality to the largely analogue synthesizer.
Bullfrog Synthesizer · Source: Erica Synths
One of the benefits of using Raspberry Pi is the ability to make it open source. The DIY kit Zynthian is an open synth platform with a Raspberry Pi 4 at its centre. The desktop box can function as a keyboard expander, effects unit, MIDI processor, groovebox or even micro-DAW. “Zynthian is a community-driven project and it’s 100% open source,” the company says on its site. “Free software on Open hardware. Completely configurable and fully hackable!”
Zynthian · Source: Zynthian
Damn Fine Pi
There are plenty more synths making use of the Raspberry Pi. One that you may not realize is the Organelle M by Critter and Guitari. By putting a Pi inside, they’re able to run Pure Data, meaning that you can program your own synths to use inside too.
Critter & Guitari Organelle S · Source: Critter & Guitari
Another fun instrument with a Raspberry Pi 3 for a soul is Tasty Chips’ GR-1 Granular synthesizer.
For something a little more esoteric, try the Yoshimi Pi. “Yoshimi Pi is the hardware incarnation of the software synth Yoshimi, running on a Raspberry Pi 4 in a rugged metal case with a built-in PSU and line level audio output,” according to the product page.
Of course, you don’t have to buy a commercial Raspberry Pi-based synthesizer. There are plenty of DIY options to run them “bare metal,” that is, without a separate operating system. Just hook up a MIDI controller to the board and you’re off and running. Try MiniSynth Pi or code your own!
Raspberry Pi: Is It Cheating?
In the same way that some claim that virtual analogue is just a “VST in a box,” others complain that synths with Raspberry Pi at the core are somehow cheating. You may as well just make your own, right?
Raspberry Pi · Source: Raspberry Pi
“Just because something is based on a Raspberry Pi it doesn’t mean it’s trivial to make one,” said chalk_walk in a Reddit thread on the Organelle. “If they provide the software then you may be able to put together something equivalent, but if not: you are out of luck if you want an Organelle. Similarly, part of the complexity is in making an enclosure with appropriate controls and displays.”
As we’ve seen, all digital synthesizers have some kind of computer inside. Whether that’s a custom DSP with off-the-shelf chips or a Raspberry Pi, you still have to code the software, design the enclosure and PCBs and everything else that goes along with it. By going with a little computer like this, you can shave some money off the asking price and save on development time too.
In this tutorial, we will build an optimizer for a subset of linear algebra using egglog.
We will start by optimizing simple integer arithmetic expressions.
Our initial DSL supports constants, variables, addition, and multiplication.
# mypy: disable-error-code="empty-body"
from __future__ import annotations

from typing import TypeAlias
from collections.abc import Iterable

from egglog import *


class Num(Expr):
    def __init__(self, value: i64Like) -> None: ...

    @classmethod
    def var(cls, name: StringLike) -> Num: ...

    def __add__(self, other: NumLike) -> Num: ...

    def __mul__(self, other: NumLike) -> Num: ...

    # Support inverse operations for convenience;
    # they will be translated to non-reversed ones
    def __radd__(self, other: NumLike) -> Num: ...

    def __rmul__(self, other: NumLike) -> Num: ...


NumLike: TypeAlias = Num | StringLike | i64Like
The signature here takes NumLike, not Num, so that you can write Num(1) + 2 instead of Num(1) + Num(2). This is helpful for ease of use and also for compatibility when you are trying to create expressions that act like Python objects which perform upcasting. To support this, you must define conversions between primitive types and your expression types. When a value is passed into a function, it will find the type it should be converted to and transitively apply the conversions you have defined:
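The converter registrations themselves did not survive into this extract. A minimal sketch of what they might look like, using egglog's converter helper; treat the exact calls as an assumption rather than the tutorial's own code:

converter(i64, Num, Num)          # where a Num is expected, build one from an i64 literal
converter(String, Num, Num.var)   # turn a string into a named Num variable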
In this tutorial we will use the function form to define rewrites and rules, because then we only have to write the variable names once, as arguments, and they are not leaked to the outer scope.
This rule asserts that addition is commutative. More concretely, this rule says: if the e-graph contains expressions of the form x + y, then the e-graph should also contain the expression y + x, and they should be equivalent.
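A minimal sketch of that rule in the function form just described, assuming an e-graph has been created as egraph = EGraph(); the tutorial's own definition is not reproduced in this extract:

@egraph.register
def _add_commutes(x: Num, y: Num):
    # if x + y is in the e-graph, add y + x and merge their e-classes
    yield rewrite(x + y).to(y + x)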
Similarly, we can define the associativity rule for addition. This rule says: if the e-graph contains expressions of the form x + (y + z), then the e-graph should also contain the expression (x + y) + z, and they should be equivalent.
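A corresponding sketch for associativity, under the same assumptions (it could just as well be yielded from the same rule function as commutativity):

@egraph.register
def _add_associates(x: Num, y: Num, z: Num):
    # if x + (y + z) is in the e-graph, add (x + y) + z and merge their e-classes
    yield rewrite(x + (y + z)).to((x + y) + z)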
There are two subtleties to rules:
Defining a rule is different from running it. The following check would fail at this point because the commutativity rule has not been run (we’ve inserted x + 3 but not yet derived 3 + x).
egraph.check_fail((x+3)==(3+x))
Rules are not instantiated for every possible term; they are only instantiated for terms that are in the e-graph. For instance, even if we ran the commutativity rule above, the following check would still fail because the e-graph does not contain either of the terms Num(-2) + Num(2) or Num(2) + Num(-2).
egraph.check_fail(Num(-2)+2==Num(2)+-2)
Let’s also define commutativity and associativity for multiplication.
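A sketch of what those multiplication rules might look like, under the same assumptions as the addition rules above:

@egraph.register
def _mul_rules(x: Num, y: Num, z: Num):
    # commutativity and associativity of multiplication
    yield rewrite(x * y).to(y * x)
    yield rewrite(x * (y * z)).to((x * y) * z)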
egglog also defines a set of built-in functions over primitive types, such as + and *, and supports operator overloading, so the same operator can be used with different types.
egraph.extract(i64(1)+2)
egraph.extract(String("1")+"2")
egraph.extract(f64(1.0)+2.0)
With primitives, we can define rewrite rules that talk about the semantics of operators.
The following rules show constant folding over addition and multiplication.
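The folding rules themselves are not shown in this extract. A plausible sketch, matching Num constructors over i64 literals and reducing them with the primitive + and *; the exact rule bodies are an assumption consistent with the surrounding text:

@egraph.register
def _constant_folding(i: i64, j: i64):
    # evaluate literal arithmetic using the i64 primitives
    yield rewrite(Num(i) + Num(j)).to(Num(i + j))
    yield rewrite(Num(i) * Num(j)).to(Num(i * j))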
While we have defined several rules, the e-graph has not changed since we inserted the two expressions. To run the rules we have defined so far, we can use run.
In other words, egglog computes all the matches for one iteration before making any updates to the e-graph. This is in contrast to an evaluation model where rules are immediately applied and the matches are obtained on demand over a changing e-graph.
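For example, a bounded run of the registered rules might look like this; 10 iterations is an arbitrary limit chosen for illustration, not a value taken from the tutorial:

egraph.run(10)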
We can now look at the e-graph and see that 2 * (x + 3) and 6 + (2 * x) are now in the same e-class.
We can also check this fact explicitly
egraph.check(expr1==expr2)
Show HN: wcwidth-o1 – Find Unicode text cell width in no time for JavaScript/TS
A TypeScript/JavaScript port of Markus Kuhn’s wcwidth and wcswidth implementations, optimized to O(1).
These functions are defined in IEEE Std 1003.1 (POSIX).
n: Max characters to process (defaults to full length).
Behind Wcwidth
In fixed-width terminals, most Latin characters take up one column, while East
Asian (CJK) ideographs usually take up two. The challenge is deciding how many
“cells” each Unicode character should occupy so that text aligns correctly.
The Unicode standard defines width classes:
Wide (W) and Fullwidth (F) - always 2 columns
Halfwidth (H) and Narrow (Na) - always 1 column
Ambiguous (A) - 1 column normally, but 2 in CJK compatibility mode
Neutral (N) - treated as 1 column here for simplicity
Other rules include:
U+0000 (null) - width 0
Control characters - width -1
Combining marks - width 0
Soft hyphen (U+00AD) - width 1
Zero width space (U+200B) - width 0
This logic originates from Markus Kuhn’s reference implementation and is widely
used in terminal emulators to ensure consistent alignment.
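Purely as an illustration of those rules (written in Python rather than the package's TypeScript, leaning on the standard unicodedata module instead of precomputed tables, and not the library's actual code):

import unicodedata

def cell_width(ch: str, cjk: bool = False) -> int:
    """Rough, illustrative reimplementation of the width rules listed above."""
    cp = ord(ch)
    if cp == 0x0000:                              # null
        return 0
    if cp == 0x200B:                              # zero width space
        return 0
    if cp == 0x00AD:                              # soft hyphen
        return 1
    if cp < 0x20 or 0x7F <= cp < 0xA0:            # control characters
        return -1
    if unicodedata.category(ch) in ("Mn", "Me"):  # combining marks
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):                         # Wide / Fullwidth
        return 2
    if eaw == "A":                                # Ambiguous
        return 2 if cjk else 1
    return 1                                      # Halfwidth, Narrow, Neutral

A genuinely O(1) implementation would replace these per-character library lookups with precomputed range tables, which is the optimization the package's name refers to.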
Not really! Ingress is cheap, Cloud Run is cheap, and GCS is cheap.
To avoid paying for egress, I limit the amount of data that I'll serve directly and instead give you a command you can run on your own machine.
The most expensive part of this is actually the domain name.
Isn't this expensive for the registry?
Not really! The first time a layer is accessed, I download and index it. Browsing the filesystem just uses that index, and opening a file uses Range requests to load small chunks of the layer as needed.
Since I only have to download the whole layer once, this actually reduces traffic to the registry in a lot of cases, e.g. if you share a link with someone rather than having them pull the whole image on their machine.
In fact, Docker has graciously sponsored this service by providing me an account with unlimited public Docker Hub access. Thanks, Docker!
That can't be true, gzip doesn't support random access!
That's not a question.
Okay then, how does random access work if the layers are gzipped tarballs?
Tl;dr, you can seek to an arbitrary position in a gzip stream if you know the 32KiB of uncompressed data that comes just before it, so by storing ~1% of the uncompressed layer size, I can jump ahead to predetermined locations and start reading from there rather than reading the entire layer.
I have great news to share: After three months of fighting, our Chatbox app is finally back on the U.S. App Store! 🎉
What happened?
In April 2025, another company with an app of the same name filed a dispute with Apple, claiming they held a trademark for the word “Chatbox.”
I believe this claim was baseless, because:
"Chatbox" is a widely used, generic word across the internet, and their trademark application had already been initially rejected by the USPTO.
We were the first to use “Chatbox” as the name for AI software, starting in March 2023 right here on GitHub.
However, Apple’s legal team accepted their claim and removed our app from the store on June 17.
Taking legal action
We decided to take the matter to federal court. The court ruled in our favor and issued an order to Apple on August 29 to restore our app within 7 days. After about two weeks, Apple finally notified us that the app was back online.
Why this matters
This is a victory against trademark bullying and an important win for our community. We stood our ground, defended our rights, and made sure our app—and its name—stayed ours.
Thank you all for your ongoing support throughout this challenging journey!
— Chatbox Dev Team
I Vibe Coded an R Package and it ... actually works?!?
Learning Japanese means memorizing thousands of characters, some of which look
nearly identical. I wanted a way to visualize which kanji are similar to each
other, and for once there wasn’t an R package for this, so I built one using
Claude Code for $20 and an afternoon’s work. I got a fully-documented package
with mocked tests, complete API coverage, caching, and graph visualizations.
I won’t keep you in suspense - here’s the result:
Network diagram of kanji similar to 年 (year)
This is exactly what I wanted - a way to explore visually similar kanji in
Japanese and identify the differences between them so I could better remember
which is which. I barely wrote any of the code to make it happen, and I couldn’t
be happier with it.
How did I get here?
I’ve been learning Japanese since my daughter took it up when starting high
school at the start of last year, and I figured we could learn together. My main
resource is Duolingo, which I think is okay, but not brilliant. My ‘Duolingo Japanese Score’ is 72 with a 625 day streak. I’m on Section 5 unit 27, and there’s only one Section left after this.
My Duolingo stats
My Duolingo progress
So I’m a lot of the way through the course, but it doesn’t actively teach me anything; it’s purely by example, which I suppose is how children learn their native language, but it is pretty slow going for remembering what things mean. I bought some grammar books which are enlightening and I frequently say “Oh, that’s why they did that!”
I was recommended to try
WaniKani
, which uses
spaced-repetition to teach the radicals, kanji, and vocabulary and provides
mnemonics to help remember which things mean what. I’ve found that extremely
useful for my Duolingo-based learning because now I can sometimes recognise what
a word might mean based on what I’ve learned in WaniKani and that helps me at
least get closer to which word is correct.
WaniKani is free until you clear level 3 after which it’s paid, but I found it
so useful I paid for a subscription and am now on level 8. There are 60 levels,
and the first 10 are labelled “pleasant”, the next 10 “painful”, followed by
“death”, then “hell”, so I’m not nearly through even the easy bit yet.
My WaniKani progress
One of the complexities with Japanese language (and Chinese) is that part of the
writing system is logographic (meaning-based, rather than phonemic, based on
alphabets) and there are thousands of characters all with multiple meanings and
multiple readings. Some of them look extremely similar to each other even though
they refer to completely different things. The words for “dog” 犬 and “big” 大 and
“fat” 太 differ by one little stroke which is easy to confuse. In the end, you
just have to know which one is which, but I wanted to be able to compare which
things are similar.
Prior Art
Alex Chan’s excellent 2020 post Storing language vocabulary as a graph explored a cool idea about linking together words that look similar or have similar components in Chinese, and I wanted to do something like that for Japanese. I’ve had the idea floating in the back of my mind since I read that post, but I had no idea how I was going to build it.
Alex Chan’s network graph of Chinese text
One option was just to ask an AI model every time I was confused about things
and get it to produce the answer, but this seemed both wasteful and potentially
just slow and annoying (or at worst, plain old wrong). I gave it a go, and I
think it does some of what I need to, but I’d rather build something I have some
faith in that I can query on demand without having to go and fetch an AI answer
every time.
Claude answering a question about similar kanji
(oh, so “dog” 犬 has an extra “dot” compared to “big” 大 … that certainly won’t
come to mind when I see “fat” 太. Also, what “additional horizontal line”???)
WaniKani does give you this ‘visually similar’ information when it shows you the
kanji, and I figured maybe I can use that data to build out a graph.
WaniKani’s ‘visually similar’ panel
How Hard Could It Be?
I had a look around and couldn’t find any R packages that did what I wanted. I
didn’t search other programming languages; maybe there’s something in Python?
There was one WaniKani API package, {wanikanir} (on GitHub, not CRAN), that was a bit out of date from 7 years ago. I couldn’t use that as-is and I couldn’t find another WaniKani API package.
I did find Dominic Schuhmacher’s package for comparing or looking up kanji. But that doesn’t do the similarity analysis that I wanted. The WaniKani API exists and that’s what Smouldering Durtles (the unofficial Android app I occasionally use to view my progress) uses behind the scenes. All I needed was a way to query it.
I started by trying to fetch the data myself. I got my API key, looked up the documentation, figured out what I needed to fetch, and then tweaked {wanikanir} to use {httr2} (with pagination) and query the data via the up-to-date endpoint. I got the results as a giant JSON blob, so I needed to dive into what was in there.
I had to parse out the different pieces that I wanted, which was the internal IDs, the kanji, their readings, their meanings, and then the similarities (which just returns another list of IDs to look up). I ended up staying up until about 1am that night and only just managed to get the actual data in the form of a table that I could look at and maybe see that it would have the pieces that I needed, but this was going to take a very long time.
Here’s where I got to after an entire evening hacking away
# get all the kanji
kanji <- wanikanir::get_wk_data_by_name("subjects", opt_arg = "kanji")

# map meaning, readings, level, and similar ids
kanji_maps <- purrr::map_df(
  seq_along(kanji),
  ~ data.frame(
    id = kanji[[.x]]$id,
    type = kanji[[.x]]$object,
    kanji = kanji[[.x]]$data$characters %||% NA,
    meanings = purrr::map_chr(kanji[[.x]]$data$meanings, ~ .x$meaning),
    readings = paste(
      purrr::map(kanji[[.x]]$data$readings, ~ .x$reading),
      collapse = " / "
    ),
    level = kanji[[.x]]$data$level,
    similar = toString(unlist(
      kanji[[.x]]$data$visually_similar_subject_ids
    )) %||%
      NA
  )
) |>
  dplyr::arrange(id, level)

dplyr::slice_sample(kanji_maps[kanji_maps$level < 10, ], n = 10)
# id type kanji meanings readings level similar
# 1 8664 vocabulary 番組 Program ばんぐみ 8
# 2 2657 vocabulary 字 Kanji Character じ 3
# 3 644 kanji 食 Meal しょく / た / く 6 805
# 4 690 kanji 者 Somebody しゃ / もの 8
# 5 685 kanji 役 Duty やく / えき 8 686, 2304, 2118
# 6 2795 vocabulary 空気 Mood くうき 5
# 7 3109 vocabulary 名物 Famous Product めいぶつ 9
# 8 2570 vocabulary 日本 Japan にほん / にっぽん 2
# 9 2539 vocabulary 上手 Skilled At じょうず 2
# 10 2602 vocabulary 大切 Valuable たいせつ 3
Just in case that’s of use to anyone, I’ve updated my fork with these changes and it now passes devtools::check().
This was still a long way off from what I needed it to do, though.
Get To Work, Agent
A couple of days later, I’d been watching some videos on Claude Code, in particular starting it up in the base directory of an Obsidian vault full of markdown files, which seems to be very powerful (e.g. the video How to 10x your notes with AI agents demonstrates lots of Claude Code’s abilities), and decided to take it for a proper spin. I’ve previously tried using Amp to vibe code something – Geoff Huntley demonstrated that at DataEngBytes – and while I don’t necessarily agree with everything he said, the take-away that “developers need to be aware of (and familiar with) these tools” really stuck with me.
The Agentic workflow works quite nicely for iterating by itself if it can run tools (e.g. via bash), so I can trust that it’ll run the tests, build the docs, and commit all of the code only when everything passes.
I loaded up Claude Code, gave it some funds, and tried to see how productive this is going to be. The entire prompt, not even using the /init setup, was just “Build an R package that uses this API https://docs.api.wanikani.com/. Support all the endpoints. Add documentation and tests and make sure it passes check”.
and off it went and started building. It made a plan for what it needed to do to
build the package:
It needed to query the API
It needed to figure out which functions to use
It needed to write documentation and tests
☒ Examine WaniKani API documentation to understand endpoints
☒ Create R package structure and configuration files
☒ Implement core API client functions
☒ Create functions for each API endpoint
☒ Add comprehensive documentation with roxygen2
☒ Write tests with mocking using testthat and httptest
☐ Create package vignette with usage examples
And then it just went step-by-step and ticked off things on its to-do list.
I’ll point out at this point that the recent videos I’d seen — as well as
Geoff’s presentation — used voice dictation. So I installed Wispr Flow, and it
gave me two weeks of pro-level usage, which means I can just talk to my computer
about what I wanted to do. I’m pretty happy with how this is working, and in
fact I dictated most of this article as a draft. I don’t let any AI write my blog posts for me; this is definitely me writing all of this — even if I do use an em-dash or two. But Wispr Flow seems pretty cool. If you want to give it a go, here’s a referral code that I think gives you two weeks of full pro-level usage, and in theory, if I end up keeping on using this, gives me some free weeks too.
Once it was done, Claude Code had queried the API documentation page, found all of the endpoints, and then built an entire R package with modern approaches that queries the endpoints using {httr2}. This is exactly what I would have done myself if I’d spent a lot more time with it. It added documentation for all of the functions and a set of mocked tests using {httptest}. It confirmed that it passed devtools::check() by actually running it from the command line and fixing any errors that came up.
I wanted to make sure that it was clear when I was “doing an AI” and when I was
doing my own work, so as part of my initial instructions to Claude Code, I told
it that any time it commits, it needs to have “(claude)” in the commit message.
This might have been redundant because it adds itself as a co-author, so all of
the commits show “co-authored by Claude” in GitHub if it was involved, but I
think it’s worthwhile calling it out in the commit messages.
I had everything that I needed to get started actually pulling some data, so I
gave it a subset of the JSON output that I’d manually extracted the other day
and said that I wanted to be able to reproduce this. It built a function that
queried the data according to all of the things that it had found/built so far
and built the exact same table that I’d spent my entire evening building, but
now the package could do everything else as well. It also wrote a vignette of
basic usage that keeps track of all the things it can do. Here I think it’s a
little less competent in terms of writing documentation that shows how to use
something, but these models aren’t necessarily great at knowing the intention of
what you’re trying to do.
One of the problems that I found from manually inspecting the data was that the
‘visually similar’ component (the bit that I actually want) returned a list of
IDs, but I wanted to get the actual kanji that were related so that I could see
them on the network graph. So again, I just asked Claude Code to resolve those
so that we have the actual names and meanings, and that’s where it did something
a little bit wrong; it started querying the API again for each character that it
needed to look up. But Claude Code lets you interrupt and change course, so I
interrupted it and told it to use the data it already had, and it cheerfully
agreed (I’m sure we’re all sick of hearing “You’re absolutely right!”). Now, for
each kanji I get any similar kanji and their meaning.
The last piece is to actually show this on a network graph. Again, I’m not going
to be writing any of the code here, so I just asked it to build me a function
that queries for any one kanji and shows all of the related ones in either
direction up to some depth. It went off and built that, and again, it works. It
added tests. It made sure that it passed check and added documentation and
examples to the README and vignette. When I started testing it out, I realised
that I had to actually paste in kanji, which isn’t the interface that I
necessarily wanted, so I asked it to modify the function to optionally use the
English word and just search for that in the meanings.
At the end of this round of development, it’s a fully working package with 133 passing tests and no failures.
Passing 133 tests
These tests cover functions which call out to the API, and they’re all mocked so that an API key (or network access) isn’t actually needed - this is probably the approach I should take for other API packages, so this also serves as a useful example for me to learn from. devtools::check() succeeds with no errors or warnings and only two NOTES, related to the data being a bit too large and to a figures directory for the vignette, and if I find any new features I’ll continue to add to this.
Functionality
Now I can query for ‘water’ either by the kanji 水 or by the English word “water”.
Searching for similar kanji to ‘water’ 水
Looking through the vignette of what it was able to do, it had some extra
wrapper functions around the raw data that it fetched, so I asked it to
integrate them into the actual functions so that they didn’t just return raw
data. One of the things it can do now is fetch the worst mistakes that I’ve made
in WaniKani so that I can review those
# ---8<--- from vignette
analyze_progress()
=== Review Statistics Summary ===
Total subjects with reviews: 500
Average meaning accuracy: 91.9%
Average reading accuracy: 89.4%
Meaning Accuracy Distribution:
0-50%: 0 subjects
51-70%: 0 subjects
71-80%: 46 subjects
81-90%: 161 subjects
91-100%: 293 subjects
Subjects That Need Most Practice:
1. 出 (Exit) - 72% [Level 2 Kanji]
2. 少女 (Girl, Young Girl) - 72% [Level 3 Vocabulary]
3. 年中 (Year Round, All Year, Whole Year, Throughout The Year) - 72% [Level 4 Vocabulary]
4. 去年 (Last Year) - 72% [Level 4 Vocabulary]
5. 村人 (Villager) - 75% [Level 4 Vocabulary]
(most of my errors are “spelling”-related; is it “しゅう” or “しょう”?)
=== Level Progression Summary ===
Total levels tracked: 8
Levels completed: 0
Levels passed (not completed): 7
Levels in progress: 1
Passing Time Statistics:
Average days to pass: 25.5
Median days to pass: 21.2
Fastest level passed: 10.1 days
Slowest level passed: 48.6 days
Individual Level Progress:
Level 1 - passed in 13 days
Level 2 - passed in 10.1 days
Level 3 - passed in 24 days
Level 4 - passed in 21.2 days
Level 5 - passed in 13.7 days
Level 6 - passed in 47.8 days
Level 7 - passed in 48.6 days
Level 8 - in progress
(I’m slowing down, but it’s definitely getting harder to keep up)
Since it can query all of this from WaniKani, I thought maybe I’d do something a
bit similar to what is in there and on-demand build a card for each character
(either radical, kanji, or vocab) to show the readings and the name and the
similar kanji independent of the network graph, especially for kanji containing
radicals.
At this point, I realised that every time it went and fetched all of this data
it was fetching directly from the API, so I asked it to add a caching layer that
fetched all of the kanji once and stored it with the package, reloading whenever
necessary and referring to that. Again, it did it. So now it used the cache of
that data and could look up all of it, and when I checked what was actually in
that cache it was the full raw JSON processed data, so now it could fetch
everything, not just the pieces that I was interested in. It built the cards and
they look amazing - it would’ve taken me a while to figure out the CSS styling
of this; definitely not impossible but so much easier to just ask and have it
done.
Character cards for ‘water’ radical 水 and ‘ice’ kanji 氷
Oh, and is everyone using the new ‘Minimalist Async Evaluation Framework for R’, mirai? みらい 未来
wk_create_character_card("未来")
Character card for ‘mirai’ (future) 未来
The package has everything that I wanted it to do:
queries all of the endpoints on WaniKani
shows the meaning and definition for kanji
shows how they’re related in a network graph
I’m extremely happy. I let it do all the committing and pushed it to GitHub. I added the {pkgdown} docs to actually render things, and dealt with the issue of having to pre-render the vignette because it needs an API key. Apart from that, I had all of this done in a matter of hours. Now that’s not to say that this is just free and easy for everyone; I did spend about US$20 on building this and that included about an hour of API time. But these things are
constantly getting better, and they’re getting cheaper. Honestly I would’ve
spent a week building these components myself, so I think that’s a pretty good
deal. A thing that didn’t exist before now exists, and all it took was one
person spending $20 and connecting up their thoughts of what they wanted to a
capable system.
So does it “work”? Well, I built a shiny app that uses just this package and a
stored copy of all the cached data (so that it doesn’t need my API key) to show
the network graph for any kanji at any depth, and you can play with it here
I’ve hosted it on shinyapps.io, and of course I got Claude (not Code this time)
to help me build it. It takes either kanji or an English word and builds the
network graph to some maximum depth, with a limitation on the level (in case you
don’t care about words you haven’t learned yet) and some settings for layout.
Kanji Explorer shiny app
FYI, it’s … not great on a phone screen. I haven’t spent a lot of time
refining this, and don’t plan to. Nonetheless, I’m absolutely going to use this
when I get confused between two kanji!
In terms of “does it work?” I think it definitely does all of the things that I
wanted it to do. The old saying of “Make it work, make it right, make it fast” probably applies here. I think “work” gets a tick. Is it right? I don’t know. As far as I can
see, it uses the data from the WaniKani API and gets the same results that I did
when I got it manually, but I’m relying on that data. Sure, there’s possibly
(probably) bugs in there. But this is pretty low-risk work I’m doing here. The
very worst that could happen with any of this code is it deletes my WaniKani
account, and in that case I’ll live to tell the tale. In terms of “fast”, this
is also low-risk. It doesn’t need to do anything computationally heavy;
everything here seems to work okay. I got it to add caching, so I wasn’t
being wasteful. But apart from that, what more could I want to do?
Reflections
Should you build one of these packages to connect to your banking app?
Probably not. I think it would probably build something that might be capable
and “look right enough” if your bank had an API, but the risk there is just too
high to actually let a system vibe code it.
People used to say “ideas are free, implementing costs.” I think it was Geoff
Huntley who said in his DataEngBytes presentation (something along the lines of)
“implementing prototypes for ideas is now free” and I believe I’ve demonstrated
that here (provided you consider $20 close enough to free). If somebody wanted
to take this and create a business that made a “production ready” version of
this shiny app, go for it! You’re welcome to; I’ll maybe even subscribe to it.
But I built this entire prototype from an empty file in a day and twenty
dollars, so I have what I need to continue with that.
Does this mean that R programmers can stop learning and just rely on AI now?
Absolutely not. I think the only way that I managed to get this to work the way
that I wanted it to was by querying the data manually myself first and having a
look at it. That helped me understand what was available, where things were in
the response, and when the model had trouble building around that, I was able to point out that, oh, there’s an x$data$something field.
In terms of business requirements, that’s always been something that you need to
have before you can start building a package. Can it speed up your iteration
process? I think absolutely. I got nowhere near actually building the first
function for a package in the night that I spent working on this, and within
about an hour with an AI agent I had something that was not only documented with tests, but fully passing devtools::check(). That’s so much faster than I could have built it, even if it is just a prototype.
The old advice of “throw away your first prototype” is very likely still
applicable here, and if I went through and rewrote all of these things in my own
style, I don’t think much would change in terms of the functionality. The big
difference from doing it manually is that I now have that first version, and can see if it does what I actually want it to do (it does!).
I don’t mind if no one beyond me uses this. I built it exclusively for me. It
does what I want it to do. The code and the shiny app are available if that’s
helpful to you, and if there’s something else that you’d find useful, maybe I’ll
find that useful too, so feel free to send in any pull requests or issues about
what you think it could or should do. In the meantime, I’ll be using this to
enhance my own Japanese learning.
I’d love to hear what people think about this… Was this all a terrible idea?
Is there already a better tool which does this? Have I wasted $20 enjoying seeing
something come to life? As always, I can be found on Mastodon and in the comment section below.
Just got word that the court dismissed several of WP Engine and Silver Lake’s most serious claims — antitrust, monopolization, and extortion have been knocked out! These were by far the most significant and far-reaching allegations in the case and with today’s decision the case is narrowed significantly. This is a win not just for us but for all open source maintainers and contributors. Huge thanks to the folks at Gibson and Automattic who have been working on this.
With respect to any remaining claims, we’re confident the facts will demonstrate that our actions were lawful and in the best interests of the WordPress community.
This ruling is a significant milestone, but our focus remains the same: building a free, open, and thriving WordPress ecosystem and supporting the millions of people who rely on it every day.
California lawmakers pass SB 79, housing bill that brings dense housing near transit
California lawmakers just paved the way for a whole lot more housing in the Golden State.
In the waning hours of the 2025 legislative session, the state Senate voted 21 to 8 to approve Senate Bill 79, a landmark housing bill that overrides local zoning laws to expand high-density housing near transit hubs. The controversial bill received a final concurrence vote from the Senate on Friday, a day after passing in the California Assembly with a vote of 41 to 17.
The bill had already squeaked through the state Senate by a narrow margin earlier this year, but since it was amended in the following months, it required a second approval. It will head to Gov. Gavin Newsom’s desk in October.
One of the more ambitious state-imposed efforts to increase housing density in recent years, the bill was introduced in March by Sen. Scott Wiener (D-San Francisco), who stresses that the state needs to take immediate action to address California’s housing shortage. It opens the door for taller, denser housing near transit corridors such as bus stops and train stations: up to nine stories for buildings adjacent to certain transit stops, seven stories for buildings within a quarter-mile and six stories for buildings within a half-mile.
Single-family neighborhoods within a half-mile of transit stops would be subject to the new zoning rules.
Height limits are based on tiers. Tier 1 zoning, which includes heavy rail lines such as the L.A. Metro B and D lines, allows for six- to nine-story buildings, depending on proximity to the transit hub. Tier 2 zoning — which includes light rail lines such as the A, C, E and K lines, as well as bus routes with dedicated lanes — allows for five- to eight-story buildings.
An amateur map released by a cartographer and fact-checked by YIMBY Action, a housing nonprofit that helped push the bill through, gives an idea of the areas around L.A. that would be eligible for development under SB 79. Tier 1 zones include hubs along Wilshire Boulevard, Vermont Avenue and Hollywood Boulevard, as well as a handful of spots in downtown L.A. and the San Fernando Valley.
Tier 2 zones are more spread out, dotting Exposition Boulevard along the E line, stretching toward Inglewood along the K line, and running from Long Beach into the San Gabriel Valley along the A line.
Assembly members debated the bill for around 40 minutes on Thursday evening and cheered after it was passed.
“Over the last five years, housing affordability and homelessness have consistently been among the top priorities in California. The smartest place to build new housing is within existing communities, near the state’s major transit investments that connect people to jobs, schools and essential services,” said Assemblymember Sharon Quirk-Silva (D-Orange County) in support of the bill.
Other Assembly members, including Buffy Wicks (D-Oakland), Juan Carrillo (D-Palmdale) and Josh Hoover (R-Folsom) voiced their support.
Proponents say drastic measures are necessary given the state’s affordability crisis.
“SB 79 is what we’ve been working towards for a decade — new housing next to our most frequently used train stations. This bill has the potential to unlock hundreds of thousands of new multifamily homes,” said YIMBY Action California director Leora Tanjuatco Ross.
Critics claim the blanket mandate is an overreach, stripping local authorities of their ability to promote responsible growth.
Assemblymember Rick Zbur (D-West Hollywood) argued against the bill, claiming it will affect lower-priced neighborhoods more than wealthy ones since land prices are cheaper for housing developers.
Councilmember Traci Park, who co-authored the resolution with Councilmember John Lee, called SB 79 a “one-size-fits-all mandate from Sacramento.” Lee called it “chaos.”
The resolution called for L.A. to be exempt from the upzoning since it already has a state-approved housing plan.
The bill has spurred multiple protests in Southern California communities, including Pacific Palisades and San Diego. Residents fear the zoning changes would alter single-family communities and force residents into competition with developers, who would be incentivized under the new rules to purchase properties near transit corridors.
However, support for SB 79 surged in recent days after the State Building and Construction Trades Council, a powerful labor group that represents union construction workers, agreed to reverse its opposition in exchange for amendments that add union hiring to certain projects.
In a statement after the deal was struck, the trades council President Chris Hannan said the amendments would provide good jobs and training to California’s skilled construction workforce.
Wiener, who has unsuccessfully tried to pass similar legislation twice before, said the deal boosted the bill’s chances.
Life, work, death and the peasant: Rent and extraction
This is the third piece of the fourth part of our series (I, II, IIIa, IIIb, IVa, IVb) looking at the lives of pre-modern peasant farmers – a majority of all of the humans who have ever lived. Last time, we started looking at the subsistence of peasant agriculture by considering the productivity of our model farming families under basically ideal conditions: relatively good yields and effectively infinite land.
This week we’re going to start peeling back those assumptions in light of the very small farm-sizes and capital availability our pre-modern peasants had. Last week we found that, assuming effectively infinite land and reasonably high yields, our farmers produced enough to maintain their households fairly securely in relative comfort, with enough surplus over even their respectability needs to potentially support a small population of non-farmers. But of course land isn’t infinite and also isn’t free, and on top of that, the societies in which our peasant farmers live are often built to extract as much surplus from the peasantry as possible.
But first, if you like what you are reading, please share it and if you really like it, you can support this project on Patreon! While I do teach as the academic equivalent of a tenant farmer, tilling the Big Man’s classes, this project is my little plot of freeheld land which enables me to keep working as a writer and scholar. And if you want updates whenever a new post appears, you can click below for email updates or follow me on Twitter and Bluesky and (less frequently) Mastodon (@bretdevereaux@historians.social) for updates when posts go live and my general musings; I have largely shifted over to Bluesky (I maintain some de minimis presence on Twitter), given that it has become a much better place for historical discussion than Twitter.
From the British Museum (2010,7081.4256), “The Rapacious Steward or Unfortunate Tenant,” a print by Haveill Gillbank (1803), showing a tenant farmer, with his family, being taken away by the estate’s steward (on horseback). A little late for our chronology, but so on point for today’s topic it was hard to let it pass.
It is also a useful reminder that tenancy wasn’t just an economic system, but a social one: it gave the Big Man and his agents tremendous power over the lives and livelihoods of the people who live near the Big Man’s estates. For very Big Men, they might have several such estates and so be absentee landlords, in which case not only the Big Man, but his steward, might be figures of substantial power locally.
Land Holdings
Returning to where we left off last week, we found that our model families could comfortably exceed their subsistence and ‘respectability’ needs with the labor they had, assuming they had enough land (and other capital) to employ all of their available farming labor. However, attentive readers will have noticed that the labor of these families could work a lot of land: 30.5 acres for The Smalls, 33.6 acres for The Middles and 56 acres for The Biggs. That may not seem large by the standards of modern commercial farms, but few peasants had anything like such large landholdings; even rich peasants rarely owned so much.
We might compare, for instance, the land allotments of Macedonian and Greek military settlers in the Hellenistic kingdoms (particularly Egypt, where our evidence is good). These settlers were remarkably well compensated, because part of what the Hellenistic kings are trying to do is create a new class of Greco-Macedonian rentier-elites(1) as a new ethnically defined military ruling-class which would support their new monarchies. In Egypt, where we can see most clearly, infantrymen generally received 25 or 30 arourai (17 or 20.4 acres), while cavalrymen, socially higher up still, generally received 100 arourai (68 acres).(2) That infantry allotment is still anywhere from two thirds to less than half of what our model families can farm and yet was still large enough, as far as we can tell, to enable Ptolemaic Greco-Macedonian soldiers to live as rentier-elites, subsisting primarily if not entirely off of rents and the labor of others.(3)
Alternately, considering late medieval Europe through the study of Saint-Thibery,(4) out of 189 households in the village in 1460, just fifteen households are in the same neighborhood of landholdings as the Middles’ 33.6 acres above (so roughly 55 setérée and up),(5) and only six hold as much as The Biggs (about 90 setérée and up). In short, our assessment so far has assumed our families are extremely rich peasants. But of course they almost certainly are not!
Instead, as we noted in our first part, the average size of peasant landholdings was extremely small. Typical Roman landholdings were around 5-10 iugera (3.12-6.23 acres), in wheat-farming pre-Han northern China roughly 100 mu (4.764 acres), in Ptolemaic Egypt (for the indigenous, non-elite population) probably 5-10 aroura (3.4-6.8 acres) and so on.(6)
In Saint-Thibery in Languedoc, the average (mean) farm size was about 24 setérée (~14.5 acres) but the more useful median farm size was just five setérée (~3 acres); the average is obviously quite distorted by the handful of households with hundreds of setérée of land.
So we might test three different farm sizes; once again, I am going to use Roman units because that’s how I am doing my background math. We might posit a relatively poor household farm of roughly three iugera (1.85 acres). In Saint-Thibery, 68 of the 189 households (36%) had land holdings this small or smaller, so this is not an unreasonable ‘poor household’ – indeed, we could posit much poorer, but then we’re really just talking about tenant farmers, rather than freeholding peasants. Next, we can posit a moderate household farm of roughly six iugera (3.8 acres), reasonably close to the median holding in Saint-Thibery and roughly what we think of as the lower-bound for ancient citizen-soldier-peasants. Finally, we can posit a large household farm of nine iugera (5.6 acres), reflective of what seems to be the upper end of typical for those same citizen-soldier-peasants; at Saint-Thibery in 1460 there were a couple dozen families seemingly in this range.(7)
For the sake of a relatively easier calculation, we can assume the same balance of wheat, barley and beans as last time, which lets us just specify an average yield after seed per iugerum of 81.2-189.5 kg of wheat equivalent (achieved by averaging the per-acre wheat-equivalent production across all three crops, with seed removed),[8] with each iugerum demanding between 11 and 15 working days (averaging the labor requirements across all three crops). Finally, we need to remember the fallow: in this case we’re assuming about a third of each farm is not in production in any given year, meaning it is neither consuming any labor nor producing any crops. That lets us then quickly chart out our peasant families based on the land they might actually have (keeping in mind that household size and household land holdings aren’t going to match; the larger household in people won’t always be the one with more land). First, a reminder of the basic labor availability and grain requirements of our households.
| | The Smalls | The Middles | The Biggs |
|---|---|---|---|
| Labor Available | 435 work-days | 507.5 work-days | 797.5 work-days |
| Bare Subsistence Requirement | ~1,189.5kg wheat-equivalent | ~1,569kg wheat-equivalent | ~2,686kg wheat-equivalent |
| Respectability Requirement | ~2,379kg wheat-equivalent | ~3,138kg wheat-equivalent | ~5,376kg wheat-equivalent |
Then for the smallest, 3 iugera farm, the numbers work like this:
| Small Farm (3 iugera): 2 iugera cropped, 1 fallow | The Smalls | The Middles | The Biggs |
|---|---|---|---|
| Labor requirement | 22-30 work days | 22-30 work days | 22-30 work days |
| Labor surplus | 405-413 work days | 477.5-485.5 work days | 767.5-775.5 work days |
| Production after Seed | 162.4-378.8kg wheat equivalent | 162.4-378.8kg wheat equivalent | 162.4-378.8kg wheat equivalent |
| Percentage of Subsistence | 14-32% | 10-24% | 6-14% |
And then for the medium-sized farm:
| Medium Farm (6 iugera): 4 iugera cropped, 2 fallow | The Smalls | The Middles | The Biggs |
|---|---|---|---|
| Labor requirement | 44-60 work days | 44-60 work days | 44-60 work days |
| Labor surplus | 375-391 work days | 447.5-463.5 work days | 737.5-753.5 work days |
| Production after Seed | 324.8-757.6kg wheat equivalent | 324.8-757.6kg wheat equivalent | 324.8-757.6kg wheat equivalent |
| Percentage of Subsistence | 27-64% | 21-48% | 12-28% |
And the larger (but not rich peasant) farm:
| Large Farm (9 iugera): 6 iugera cropped, 3 fallow | The Smalls | The Middles | The Biggs |
|---|---|---|---|
| Labor requirement | 66-90 work days | 66-90 work days | 66-90 work days |
| Labor surplus | 345-369 work days | 417.5-441.5 work days | 707.5-731.5 work days |
| Production after Seed | 487.6-1,136.5kg wheat equivalent | 487.6-1,136.5kg wheat equivalent | 487.6-1,136.5kg wheat equivalent |
| Percentage of Subsistence | 41-96% | 31-72% | 18-42% |
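For anyone who wants to poke at these figures, here is a minimal Python sketch (my reconstruction, not the post’s own spreadsheet) of the arithmetic behind the three tables above, using only the assumptions already stated: 81.2-189.5kg of wheat equivalent net of seed per cropped iugerum, 11-15 work-days per cropped iugerum, and a third of each holding fallow. Rounding differs very slightly from the tables.
# Reconstruction of the farm-size tables from the stated assumptions.
HOUSEHOLDS = {  # labor available (work-days), bare subsistence (kg wheat-equivalent)
    "The Smalls":  (435,   1189.5),
    "The Middles": (507.5, 1569),
    "The Biggs":   (797.5, 2686),
}
NET_YIELD = (81.2, 189.5)   # kg per cropped iugerum after seed, 4:1 to 8:1 yields
DAYS = (11, 15)             # work-days per cropped iugerum

for size in (3, 6, 9):
    cropped = size * 2 / 3                       # a third of the farm lies fallow
    labor = [cropped * d for d in DAYS]
    output = [cropped * y for y in NET_YIELD]
    print(f"\n{size} iugera farm ({cropped:.0f} cropped): "
          f"{labor[0]:.0f}-{labor[1]:.0f} work-days, "
          f"{output[0]:.0f}-{output[1]:.0f}kg after seed")
    for name, (avail, subsist) in HOUSEHOLDS.items():
        print(f"  {name}: {100*output[0]/subsist:.0f}-{100*output[1]/subsist:.0f}% "
              f"of subsistence, {avail - labor[1]:.1f} to {avail - labor[0]:.1f} "
              f"work-days to spare")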
And we immediately see the problem: only The Smalls manage to get anywhere close to subsistence, and only under very favorable (8:1) fertility assumptions on the largest of these (still small) farms. Now it is possible for the peasants to push a little bit on these numbers. The most obvious way would be focusing as much as possible on wheat cultivation, which has higher labor demands but also the highest yield-per-acre (or iugerum), producing around 50% more calories than beans and 35% more calories than barley per acre (see last week’s post for specifics).
But there’s a limit to going ‘all in’ on wheat to meet food shortfalls: the land might not be suitable for it and wheat exhausts the soil, so our farmers would need some sort of rotation. That said, peasant diets were overwhelmingly grains (wheat and barley) for this reason: they provide the most calories for a favorable balance of land and labor.
Our farmers might also try to supplement production with high-labor, high-density horticulture; a kitchen garden can take a lot of work but produce a lot of nutrition in a small space. But hitting household nutrition demands entirely with a kitchen garden isn’t going to work, both because of the labor demands and because the products of a kitchen garden tend not to keep well.
Instead the core problem is that our peasant households are much too large as units of labor for the farmland they own. When we say that, what we mean is that given these households are both units of consumption (they have to provide for their members) and units of production (they are essentially agricultural small businesses), an efficient allocation of them would basically have each household on something like 30 acres of farmland, farming all of it (and thus using most of their labor) and selling the excess. But the lack of economically sustainable social niches – that is, jobs that provide a reliable steady income to enable someone to obtain subsistence – means that these families are very reluctant to leave members without any land at all, so the holdings ‘fractionalize’ down to these tiny units, essentially the smallest units that could conceivably support one family (and sometimes not even that).
I’ve already seen folks in the comments realizing almost immediately why these conditions might make conquest, or resettlement into areas of land easily brought under cultivation, so attractive: if you could give each household 30-40 acres instead of 3-6, you could realize substantial improvements in quality of life (and the social standing of the farmers in question). And of course that kind of ‘land scarcity’ problem seems to have motivated both ancient and early modern settler-colonialism: if you put farmers next to flat, open ground owned by another community, it won’t be too long before they try to make it farmland (violently expelling the previous owners in the process). This is also, I might add, part of the continual friction in areas where nomads and farmers meet: to a farmer, those grazing fields look like more land and more land is really valuable (though the response to getting new land is often not to create a bunch of freeholding large-farm homesteaders, but rather to replicate the patterns of tenancy and non-free agricultural labor these societies already have, to the point of – as in the Americas – forcibly trafficking enormous numbers of enslaved laborers at great cost, suffering and horror, to create a non-free dependent class whose exploitation can enable those patterns. Most conquering armies dream of becoming landlords, not peasants).[9]
Alternately, as farms, these holdings could be a lot more efficient if they had fewer people on them, and indeed when we read, for instance, ancient agricultural writers, they recommend estates with significantly fewer laborers per unit of land area than what we’d see in the peasant countryside. But that’s because the Big Man is farming for profit with a large estate that lets him tailor his labor force fairly precisely to his labor needs; the peasants are farming to survive, and few people are going to let their brother, mother, or children starve and die in a ditch because it makes their farm modestly more productive per unit labor. Instead, they’re going to try to do anything in their power to get enough income to have enough food for their entire family to survive.
There is no real way around it: our peasants need access to more land. And that land is going to come with conditions.
From the British Museum (1850,0713.91), “La Conversation,” an etching by David Teniers and Andrew Lawrence (1742) showing three peasants having a conversation outside of a farmhouse, with a peasant woman in the doorway.
The Big Man’s Land
Now before we march into talking about farming someone else’s land, it is worth exploring why our farmers don’t get more land by just bringing more land under cultivation. And the answer here is pretty simple: in most of the world, preparing truly ‘wild’ land for cultivation takes a lot of labor. In dry areas, that labor often comes in the form of irrigation demands: canals have to be dug out from water sources (mainly rivers) to provide enough moisture for the fields, as the most productive crops (like wheat) demand a lot of moisture to grow well. In climates suitable for rainfall agriculture, the problem is instead generally forests: if there’s enough rain to grow grain, there’s enough rain to grow trees and those trees have had quite the head start on you. Clearing large sections of forest by hand is a slow, labor-intensive thing and remember, you don’t just need the trees cut down, you need the stumps pulled or burned. Fields also need to be relatively flat – which might demand terracing on hilly terrain – and for the sake of the plow they need to be free of large stones to the depth of the plow (at least a foot or so).
In short, clearing farmland was both slow and expensive, and all of this assumes the land can be made suitable and that no one has title to it. Of course if the forest is the hunting preserve of the local elite, they’re going to object quite loudly to your efforts to cut it down. And a lot of land is simply going to be too dry or too hilly or too marshy to be made usable for farming on a practical time-scale for our peasants. Such land simply cannot be brought usefully into cultivation; you can’t farm wheat in a swamp.[10]
10
So it is quite hard and often impractical to bring new land into cultivation. That doesn’t mean new land wasn’t brought into cultivation – it absolutely was. We can sometimes track population pressures archaeologically by watching this process: forests retreat, new villages pop up, swamps are drained and so on as previously marginal or unfarmable land is brought into cultivation. Note, of course, if you bring a bunch of marginal fields into cultivation – say, a drier hillside not worth farming before – your average yield is going to go down, because that land simply isn’t as productive (but demands the same amount of labor!). But that process is generally slow, taking place over generations in response to population pressures. It isn’t a solution available on the time-scale that most of our households are operating on. In the moment, the supply of land is mostly fixed for our peasants.
Which means our peasants need access to more land (or another way of generating income). There are a range of places that land could come from:
Peasant households without enough labor to farm their own land. In order to make our households relevant at every part of the process, I haven’t modeled the substantial number of very small households we talked about in the first section, households with just 1 or 2 members. If none of those householders were working-age males (e.g. a household with an elderly widow, or a young widow and minor children, etc.) they might seek to have other villagers help farm their land and split the production. For very small households, that might be enough to provide them subsistence (or at least help). Consequently, those small, often ‘dying’ households provide a (fairly small) source of land for other households.
Rich peasants likewise might have more land than their household could farm or cared to farm. Consider the position The Smalls would be in if they were a rich peasant household with, say, 25 acres of land (in Saint-Thibery, 26 households of 189 had this much or more land). That’s enough land that, under good harvest conditions, it would be easy enough to shoot past the household’s respectability requirements. At which point, why work so hard? Why not sharecrop out a large chunk of your land to small farmers and split the production, so you still make your respectability basket in decent years, but don’t have to work so darn hard?
The Big Man. Another part of this ecosystem is invariably large landowners, who might have estates of hundreds of acres. Columella, for instance, thinks of farm planning (he is thinking about large estates) in units of 100 iugera (62.3 acres) and 200 iugera (124.6 acres; Col. Rust. 12.7-9). An estate of several hundred acres would hardly be unusual. Likewise in the Middle Ages, the Big Man might be a local noble whose manor estate might likewise control a lot of land. The Big Man might also be a religious establishment: temples (in antiquity) and monasteries and churches (in the Middle Ages) often controlled large amounts of productive farmland worked by serfs or tenants to provide their income.
Naturally, the Big Man isn’t doing his own farming; he may have some ‘built in’ labor force (workers in his household, enslaved workers, permanent wage laborers, etc.) but often the Big Man is going to rely substantially on the local peasantry for tenant labor.
In practice, the Big Man is likely to represent the bulk of opportunities here, but by no means all of them. As I noted before, while local conditions vary a lot, you won’t be too far wrong in thinking about landholdings as a basic ‘rule of thirds,’ with one third of the land controlled by small peasants, one third by rich peasants and one third by the Big Man (who, again, might be a lord or a big landowner or a church, monastery or temple – in the latter case, the land is owned by the god in most polytheistic faiths – or even the king). But of course only a little bit of the small peasant land is going to be in search of workers, since most peasant households have too many hands for too little land; some of the rich peasant land will be looking for workers (either tenants or hired hands), but rich peasants are still peasants – they do some of their farming on their own. By contrast, the Big Man is marked out by the fact that he doesn’t do his own farming: he needs some kind of labor supply – wage laborers, enslaved/non-free laborers or tenants – for all of it.
But that also means that something like half (or more!) of the land around our peasant village might be owned by a household that needs outside labor to farm it.
So we have peasant households with surplus labor that need more land to farm and richer households with surplus land that needs labor. The solution here generally was some form of tenancy, which in the pre-modern world generally came, in effect, in the form of sharecropping: the landowner agreed to let the poorer household farm some of his land in exchange for a percentage of the crop that resulted. That ‘rent-in-kind’ structure is useful for the peasants, who after all are not generally keeping money with which to pay rent. At the same time, it limits their liability: if the harvest on tenant land fails, they may suffer a shortfall, but they aren’t in debt some monetary quantity of rent (though they may end up in debt in some other way).
Now the question is: on what terms?
Tenancy
And the answer here won’t surprise: bad terms. The terms are bad.
There’s a useful discussion of this in L. Foxhall, “The Dependent Tenant,” JRS 80 (1990), which in turn leans on K. Finkler, “Agrarian Reform and Economic Development” in Agricultural Decision Making, ed. P.F. Barlett (1980) to get a sense of what the terms for tenant farmers might normally look like. Foxhall notes in this and a few other studies of modern but largely non-industrial farming arrangements that almost no households in these studies were entirely uninvolved in sharecropping or tenancy arrangements, but that the terms of tenancy arrangements varied a lot based on the inputs supplied.
The key inputs were labor, traction (for our pre-industrial peasants, this is “who supplies the plow-team animals”), water and seed. The most common arrangement, representing almost a third of all arrangements, was where the tenant supplied labor only, while traction, water and seed were supplied by the landlord; the tenant’s share in these arrangements was a measly 18.75%. A number of arrangements had the tenant supplying not only labor but also some mix of traction, water or seed (but not all) and often the tenant’s share of the production hovered between 40 and 60%, with exact 50/50 splits occurring in about a quarter of the sample. In just one case did the tenant supply everything but the land itself; in that case the tenant’s share was 81.25%.
One thing that is obvious from just this example is that arrangements varied a lot and are going to depend on need and bargaining power. A ‘landlord’ who has land they want under cultivation but can supply basically nothing else may be relatively easy to negotiate into a fairly generous deal; a peasant who is absolutely destitute save for the labor of their hands is easy to exploit. An even 50/50 landholder/tenant split seems to have been the norm in much of Europe though, reflected in terms for sharecropper (métayer in French, mezzadro in Italian, mitateri in Sicilian, mediero in Spanish) which all mean ‘halver,’ though again the terms (and the share split) varied, typically based on demand but also on what exactly the landlord was providing (seed, plow teams, tools, physical infrastructure like a farmhouse, etc.).
For the sake of simplicity in our model, we can assume something like a 50/50 split, with our tenants supplying half of the seed, so that our net yield is exactly half of what it would have been. We can then take those assumptions back to our model. To establish a baseline, let’s run the numbers assuming first a ‘medium’ sized (6 iugera, 3.8 acres, with 4 iugera cropped and 2 fallowed) farm, with our fertility estimate set modestly to 6:1, a ‘good but not great’ yield. We’re going to ’round up’ to the nearest even iugerum and assume an average of 13 days per iugerum of labor, just to make our calculations a bit simpler.
How hard is it for our peasants to meet their needs if they have to sharecrop the added land they need?
| Tenancy with a medium farm | The Smalls | The Middles | The Biggs |
|---|---|---|---|
| Total Labor | 435 work-days | 507.5 work-days | 797.5 work-days |
| Freehold Labor Demand | 52 work-days | 52 work-days | 52 work-days |
| Freehold Production | 541kg wheat equivalent | 541kg wheat equivalent | 541kg wheat equivalent |
| Shortfall to Subsistence | 648.5kg wheat equivalent | 1,028kg wheat equivalent | 2,145kg wheat equivalent |
| Net Production per iugerum farmed as tenant | 67.65kg wheat equivalent | 67.65kg wheat equivalent | 67.65kg wheat equivalent |
| Tenant Land Required for Subsistence | 10 iugera (6.23 acres), plus another ~5 iugera fallowed | 16 iugera (9.97 acres), plus another ~8 iugera fallowed | 32 iugera (19.94 acres), plus another ~16 iugera fallowed |
| Labor Demand for Subsistence | 130(+52) work days; total: 182 | 208(+52) work days; total: 260 | 416(+52) work days; total: 468 |
| Subsequent Shortfall to Respectability (over subsistence) | 1,189.5kg wheat equivalent | 1,569kg wheat equivalent | 2,686kg wheat equivalent |
| Tenant Land Required for Respectability | 18 iugera (11.2 acres), plus another ~9 iugera fallowed | 24 iugera (14.95 acres), plus another ~12 iugera fallowed | 40 iugera (24.9 acres), plus another ~20 iugera fallowed |
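As a check on the tenancy table, here is a similar sketch under the stated assumptions (135.3kg net per owned iugerum at 6:1, half of that on rented land, 13 work-days per cropped iugerum, rented land rounded up to the nearest even iugerum); again, this is my reconstruction rather than the post’s own calculation.
# Reconstruction of the tenancy table: 4 owned iugera cropped at 6:1, rented land
# at a 50/50 split with the tenant supplying half the seed.
import math

NET_OWN, NET_TENANT, DAYS = 135.3, 135.3 / 2, 13
OWN_CROPPED = 4
SUBSISTENCE = {"The Smalls": 1189.5, "The Middles": 1569, "The Biggs": 2686}

freehold_out = OWN_CROPPED * NET_OWN        # ~541kg wheat equivalent
freehold_days = OWN_CROPPED * DAYS          # 52 work-days

def rented_iugera(shortfall_kg):
    """Rented (cropped) iugera needed to cover a shortfall, rounded up to an even number."""
    return math.ceil(shortfall_kg / NET_TENANT / 2) * 2

for name, subsist in SUBSISTENCE.items():
    for_subsist = rented_iugera(subsist - freehold_out)
    # respectability is roughly twice subsistence, so this is the *extra* land beyond that
    for_respect = rented_iugera(2 * subsist - freehold_out) - for_subsist
    print(f"{name}: {for_subsist} rented iugera for subsistence "
          f"({freehold_days + for_subsist * DAYS} total work-days), "
          f"{for_respect} more for respectability")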
As we can see, tenancy dramatically changes the picture for our peasants. Under these relatively typical assumptions, all three of our families can make subsistence in a normal year, but only The Smalls have the right combination of a lot of labor and a relatively small family to have a shot at getting all of their respectability needs (in practice, they’d probably fall short once you consider necessary farm labor not in the fields – fence repair, tool maintenance, home repair and the like). It also isn’t hard to see how we might alter this picture by changing our assumptions. Changing the size of the owned farmland has a significant impact (even though it is already so small) because our peasants realize twice the production per unit of land for land they own over land they rent (again, terms might vary). Put another way, under these assumptions, somewhat marginal owned farmland that gives an OK-but-not-great yield of 4:1 is of the same use to our peasants as really good tenant-farmed land giving a 7:1 yield (both offer 81.2kg of wheat equivalent per iugerum after rent is paid).
That said, the fact that our peasants end up with enough labor to comfortably exceed their subsistence requirements, but not their comfort requirements, is favorable for extraction, which we’ll discuss below. These are households with spare labor who can’t fulfill all of their wants entirely on their own, giving the state or local Big Men both a lot of levers to squeeze more labor out of them and also the available above-subsistence labor to squeeze. By contrast, if these peasants had enough land to meet all of their needs themselves, there would be fewer opportunities to compel them to do additional labor beyond that.
But even before we get to extraction, tenancy is also changing our peasants’ incentives. Economics has the concept of diminishing marginal returns, the frequent phenomenon where adding one more unit of a given input produces less and less output per additional input-unit. You will find more errors in the first hour of proofreading than the fiftieth hour, for instance. There’s also the concept of diminishing marginal utility: beyond a certain point, getting more of something is less valuable per unit added. Getting one bar of chocolate when you have none? Fantastic. Getting one bar of chocolate when you have ten thousand? Solidly meh.
Both are working on our farmers to press their natural production inclination not towards maximum labor, or even towards hitting that respectability basket, but towards just subsistence and a little bit more. On the diminishing marginal returns front, naturally when it comes to both owned land and rented land, our peasants are going to farm the most productive land first. This is why when we talk about expanding population and expanding agriculture, we often talk about marginal land (less productive land) coming under cultivation: because all of the really great land was already being farmed. But poor farmland doesn’t demand less labor time (indeed, it may demand more), it just produces less. So while we’ve been working here with averages, you should imagine that the first few acres of farmland will be more productive and the latter few less productive.
Tenancy puts this into even sharper contrast because it creates a really significant discontinuity in the value of farming additional land: the rents are so high that sharecropped or tenant land is much less useful (per unit labor) to the peasant than their own land. So you have a slow downward slope of ‘land quality’ and somewhere on that slope there is the point at which the peasants have farmed all of their own land, and so suddenly the effective yield-per-labor-after-rent drops by half (or more!). So the first few hundred kilograms of wheat equivalent are probably fairly easy to get: you have a few good fields you own and your net out of them might be 130-190kg of wheat equivalent per iugerum. Put in a couple dozen days on those two good iugera and The Smalls have just over a quarter of their subsistence needs. But then they have their more marginal fields, which might only yield 80-100kg. Still not terrible, but the next couple dozen days of labor don’t get them as far: not to half but just to 44% or so. But now you are out of your own land, so you go to your rich neighbor or the Big Man to get access to some more and suddenly even on their best fields your yield-per-iugerum is 80-95kg, so another couple dozen working days gets you from 44% to just 57% of what you need. So you need to line up a lot more land, but now you might be starting to look at the worse fields the Big Man has. He still wants them farmed, after all – his choice is between doing nothing and earning money or doing nothing and not earning money; he’d rather earn money. But suddenly you’re looking at maybe as little as 50-60kg of wheat equivalent per iugerum and the labor demands have not gone down.
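To make that declining ladder concrete, here is a purely illustrative sketch for The Smalls; the per-iugerum figures (165, 95, 88 and 55kg) are values I have picked from within the ranges quoted above, not figures from the post.
# Illustrative only: cumulative progress toward The Smalls' subsistence target as
# they work down from their own good fields to the Big Man's worst ones.
SUBSISTENCE = 1189.5          # kg wheat-equivalent
LADDER = [                    # (fields, iugera, assumed net kg per iugerum)
    ("own good fields",      2, 165),
    ("own marginal fields",  2,  95),
    ("rented good fields",   2,  88),
    ("rented poor fields",   2,  55),
]

got, days = 0.0, 0
for fields, iugera, net in LADDER:
    got += iugera * net
    days += iugera * 13           # ~13 work-days per cropped iugerum
    print(f"after {fields:18s}: {days:3d} days, "
          f"{100 * got / SUBSISTENCE:4.0f}% of subsistence, "
          f"{net / 13:.1f}kg per extra work-day")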
Meanwhile, the comfort you get from each kilogram of wheat equivalent is also going down. The first 80% or so of your subsistence needs is necessary simply to not starve to death; a bit more makes the household sustainable in the long term. But then – and remember, these choices are coming as you are facing diminishing marginal returns on each day of labor you put in – is it really worth your time to cultivate a couple more fields in order to just get a bit more meat in your diet and have slightly nicer household goods? Wouldn’t you rather rest?
And so what you see is most peasant households aiming not for the full respectability basket, but that “subsistence – and a little bit more” because as each day of labor produces less product and each product produces less joy, at some point you’d rather not work.
And as we’ve seen, in theory our households might hit that crossover point – subsistence and a little bit more – fairly quickly in their labor supply. We haven’t yet, but should now, account for labor spent on things like maintaining tools, fixing fences and other capital investments. If we allocate, say, 45 days for that and assume that our farmers also want to have some cushion on subsistence (say, another 10%), we might expect The Smalls to be more or less satisfied (on that medium landholding, with average 6:1 yields) with something like 245 working days (56% of total), the Middles with 331 working days (65%) and the Biggs with 560 (70%). Working like that, they won’t be rich and won’t ever become rich (but they were never going to become rich regardless), but they’ll mostly survive – some years will be hard – and they’ll have a little bit more time to rest. Some families, a bit more industrious, might push towards achieving most or all of the respectability basket, at least in good years; others might be willing to stick closer to subsistence (or be unable to do otherwise).
Of course in areas where the farmland is meaningfully more marginal – average yields around 4:1 rather than 6:1 – our peasants are going to need to work quite a lot more, about 60% more. That pushes the Smalls to about 84% of their available labor, the Middles to 99% and the Biggs actually slightly into deficit, demanding roughly 110% of their available labor.
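Here is a rough reconstruction of those ‘how much do they actually work’ totals, using the tenancy assumptions above plus the 45 maintenance days and the 10% cushion; because of rounding it lands close to, but not exactly on, the 245/331/560 figures.
# Field labor to hit subsistence plus a 10% cushion on a medium freehold
# (4 iugera cropped at 6:1), renting the rest at a 50/50 split, plus ~45
# maintenance days. Totals differ slightly from the post's own.
NET_OWN, NET_TENANT, DAYS, MAINTENANCE = 135.3, 67.65, 13, 45
HOUSEHOLDS = {"The Smalls": (435, 1189.5), "The Middles": (507.5, 1569),
              "The Biggs": (797.5, 2686)}

for name, (labor_avail, subsist) in HOUSEHOLDS.items():
    target = 1.10 * subsist                                  # subsistence plus cushion
    rented = max(0.0, target - 4 * NET_OWN) / NET_TENANT     # cropped iugera rented
    total_days = 4 * DAYS + rented * DAYS + MAINTENANCE
    print(f"{name}: ~{total_days:.0f} work-days, "
          f"{100 * total_days / labor_avail:.0f}% of the {labor_avail} available")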
We should keep in mind that each peasant household is going to exist somewhere along this spectrum: some with larger amounts of property or access to better land, some with less. We’ll come back to this in a moment, but this is part of why the poorest of the peasantry were often exempt from things like military service: positioned on marginal land in poor communities, they had little excess labor available. Most peasant households would have been somewhere in between these two extremes, with a labor utilization rate ranging from 50 to 100% and a lot of households in that 60-80% range.
And now you might think, “doesn’t this take us back to peasants actually not working all that much compared to modern workers?” First, I would want to point out that these peasants are also experiencing a quality of living way below workers in modern industrial countries; but also no, because we haven’t yet talked about extraction.
Because of course the problem here, from the perspective of everyone who isn’t our peasants, is that if the peasantry only does the amount of agricultural labor necessary to subsist themselves and just a little more, the society doesn’t have economic room for much else in the way of productive (or unproductive) economic activity.
Remember: our peasants are the only significant population actually doing farming. Sure, the Big Men and the gentry and temples and monasteries may own land, but they are mostly renting that land out to peasants (or hiring peasants to work it, or enslaving peasants and forcing them to work it).
And those landholding elites, in turn, want to do things. They want to build temples, wage wars, throw fancy parties, employ literate scribes to write works of literature and of course they also want to live in leisure (not farming) while doing this. And the activities they want to do – the temples, wars, fancy parties, scribes and so on – require a lot of food and other agricultural goods to sustain the people doing those things. They also require a bunch of surplus labor – some of that surplus labor is specialist labor, but a lot of it is effectively ‘unspecialized’ labor.
To do those things, those elites need to draw both agricultural surplus and surplus labor out of the countryside. And we should note that, of course, this is an exploitative relationship, but it is also worth noting that for pre-modern agrarian economies, the societies where elites can centralize and control the largest pile of labor and surplus tend to use it to conquer the societies that don’t, so ‘demilitarized peasant utopia’ is not a society that is going to last very long (but ‘highly militarized landowner republic’ might).
It is thus necessary to note that when we see the emergence of complex agrarian societies – cities, writing, architectural wonders, artistic achievements and so on – these achievements are mostly elite projects, ‘funded’ (in food and labor, if not in money) out of extraction from the peasantry.
Exactly how this extraction worked varied a lot from society to society and even between regions and ethnic and social classes within a society. As noted above, in areas where agriculture was not very productive, extraction was limited. By contrast, highly productive regions didn’t so much produce richer peasants as far higher rates of extraction. In some societies, where the freeholding farming peasantry (or part of that peasantry) formed an important political constituency (like some Greek poleis or the Roman Republic), the small farmers might manage to preserve relatively more of their surplus for themselves, but often in exchange for significant demands in terms of military and civic participation.
To take perhaps the simplest direct example of removing labor from the countryside: from 218 to 168, the Romans averaged around 10-12 legions deployed in a given year, or 45,000-54,000 citizen soldiers.[11] Set against an adult-male citizen population of perhaps ~250,000, that implies the Roman army was consuming something like a quarter of all of the available citizen manpower in the countryside, though enslaved laborers and males under 17 wouldn’t be captured by this figure. Accounting for those groups, we might imagine the Roman dilectus was siphoning off something like 15% of the labor capacity of the countryside on average (sometimes spiking far higher, to as much as half of it). On top of that, the demand that these soldiers supply their own arms and armor would have pushed farmers to farm a little bit more than subsistence-and-a-little-more to afford the cost of the arms (traded for or purchased with that surplus; at least initially these transactions are not happening in coined money).
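The back-of-the-envelope version of that manpower figure, for anyone who wants to check it (the further discount for enslaved laborers and under-17s is the post’s estimate, not computed here):
# Quick check of the Roman deployment arithmetic (218-168 BC).
men_per_legion = 4500                 # implied by 10-12 legions = 45,000-54,000 men
adult_male_citizens = 250_000         # the post's rough figure

for legions in (10, 12):
    under_arms = legions * men_per_legion
    share = 100 * under_arms / adult_male_citizens
    print(f"{legions} legions = {under_arms:,} men = {share:.0f}% of adult male citizens")
# -> 18-22%, i.e. "something like a quarter" of citizen manpower; counting enslaved
#    laborers and males under 17 as well, the post puts the drain at roughly 15%.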
We see similar systems in the Carolingian levy system or the Anglo-Saxon fyrd, where households might be brigaded together – in the Carolingian system, households were grouped into mansi – based on agricultural production (you can see how that works above as a proxy for ‘available surplus labor’!), with a certain number – three or four mansi in the Carolingian system – required to furnish one armed man for either a regional levy or the main field army. The goal of such systems is to take the surplus labor above and make it available for military service.
Alternately, the elites might not want their peasants as soldiers but as workers. Thus the very frequent appearance of corvée labor: a requirement of a certain amount of intermittent, unpaid forced labor. This might be labor on the local lord’s estate (a sort of unpaid tenancy arrangement) or labor on public works (walls, castles, roads) or a rotating labor force working in state-owned (or elite-owned) productive enterprises (mines, for instance). As with military service, this sort of labor demand could be shaped to what the local populace would bear and enforced by a military aristocracy against a largely disarmed peasantry. Once again looking at the statistics above, even a few weeks a year per man (rather than per household) would drain most of the surplus labor out of our households. Adding, for instance, a month of corvée labor per work-capable male (an age often pegged around seven for these societies) under our favorable (6:1) assumptions above brings our work totals to 305 days (70% of total) for the Smalls, 373 (77%) for the Middles and 650 (81.5%) for the Biggs. Corvée labor demands could be less than this, but also often quite a bit more (expectations varied a lot by local laws and customs).
Alternately, elites might just crank up the taxes. In the Hellenistic states (the Ptolemaic and Seleucid kingdoms especially), the army wasn’t a peasant levy, but rather a core of Greco-Macedonian rentier elites (your ‘rich peasants’ or ‘gentlemen farmers’), regional levies and mercenaries. To pay for that (and fund the lavish courts and public works that royal legitimacy required), the indigenous Levantine, Egyptian, Syrian, Mesopotamian (etc., etc.) underclasses were both made to be the tenants on the estates of those rentier elites (land seized from those same peasants in the initial Macedonian conquest or shortly thereafter) and also made to pay very high taxes on their own land.[12]
So while tax rates on military-settler (that is, Greco-Macedonian rentier elite) land might have been around 10% – 1/12th (8.3%) seems to have been at least somewhat common – taxes on the land of the indigenous laoi could run as high as 50%, even before one got to taxes on markets, customs duties, sales taxes, a head tax and state monopolies on certain natural resources including timber and, importantly, salt.[13] So the poor laoi might be paying extortionate taxes on their own lands, lighter taxes on settler (or temple) lands, but then also paying extortionate rents on those tenant-farmed lands.
Another micro-scale option was debt. We’ve been assuming our farmers are operating at steady-state subsistence, but as we keep noting, yields in any given year were highly variable. What peasants were forced to do in bad years, almost invariably, was go into debt to the Big Man. But as noted, they’re simply not generating a lot in the way of surplus to ever pay off that debt. That in turn makes the debt itself a tool of control, what we often call debt peonage. Since the Big Man sets the terms of the debt (at a time when the peasant is absolutely desperate), it was trivially easy to construct a debt structure that the peasant could never pay off, giving the Big Man leverage to demand services – labor, tenancy on poor terms, broad social deference, etc. – in perpetuity. And of course, if the Big Man ever wants to expand his land holdings, all he would need to do is call in the un-payable debt and – depending on the laws around debt in the society – either seize the peasant’s land in payment or reduce the peasant into debt-slavery.[14]
In short, elites had a lot of mechanisms to sop up the excess labor in the countryside, and they generally used them.
Consequently, while peasants unencumbered by taxes, rents, elites, debt, conscription and so on might have been able to survive working only a relatively small fraction of their time (probably around 100 days per year per working-age male, again age 7 or so and up, would suffice), they did not live in that world. Instead, they lived in a world where their own landholdings were extremely small – too small to fully support their households, although their small holdings might still provide a foundation of income for survival. They had to work on land owned or at least controlled by Big Men: local rentier-elites, the king, temples, monasteries, and so on. Those big institutions, which could wield both legal and military force, in turn extracted high rents and often demanded additional labor from our peasants, which soaked up much of their available labor, leading to that range of 250-300 working days a year, at 10-12 hours each, for something on the order of 2,500-3,600 working hours for a farm-laboring peasant annually.
Which is quite a lot more than the c. 250 typical work days (261 weekdays minus holidays/vacation) in the United States – just by way of example of a modern industrial economy – at typically eight hours a day, or roughly 2,000 working hours a year.
Of course it is also the case that those roughly 2,000 modern hours buy a much better standard of living than what our medieval peasants had access to – consider that a single unimpressive car represents more value just in worked metal (steel) than even many ancient or medieval elites could muster.
No, you do not work more than a medieval or ancient peasant: you work somewhat less, in order to obtain far more material comfort. Isn’t industrialization grand?
That said, our picture of labor in peasant households is not complete! Indeed, we have only seen to half of our subsistence basket – you will recall we broke out textiles separately – because we haven’t yet even really introduced the workload of probably the most fully employed people in these households: the women. And that’s where we’ll go in the next post in this series.
[1] That is, landholders with enough land to subsist off of the rents without needing to do much or any actual agricultural labor themselves.
[2] For scale, with the cavalrymen we are talking about just a few thousand households lording over a country of perhaps five million; these fellows are honestly closer to something like a medieval knightly elite than the peasantry.
[3] On these allotments, see P. Johstono, The Army of Ptolemaic Egypt, 323-204 BC (2020), 158-160 and C. Fischer-Bovet, Army and Society in Ptolemaic Egypt (2014), 212-217. On the rentier self-sufficiency of these parcels, at a 5:1 yield, 30 aroura should yield something like 3,500kg wheat equivalent (almost 12 million calories), more than enough to support the settler’s household at a 50% rent (see below) using labor from the much smaller adjacent farms of indigenous Egyptians. Indeed, to me it seems very likely the land allotments were calculated precisely on this basis, with infantrymen receiving the smallest allotment that could reliably support a household in leisure.
[4] Le Roy Ladurie, Les Paysans de Languedoc (1966).
[5] A reminder that the setérée is an inexact unit, about 1/5th to 1/4th of a hectare, so about 0.49-0.62 acres.
[6] Rosenstein (2004), 75, n.68; Erdkamp (2005), 47-8; Cho-yun Hsu, Han Agriculture: The Formation of Early Chinese Agrarian Economy (1980); Johstono, The Army of Ptolemaic Egypt (2020), 101; Fischer-Bovet, Army and Society in Ptolemaic Egypt (2014), 121.
[7] It’s hard to tell precisely from what I have because Le Roy Ladurie groups households in brackets.
[8] The average yields per iugerum at each fertility level, in wheat equivalent, are: 4:1, 81.2kg; 5:1, 108.2kg; 6:1, 135.3kg; 7:1, 162.4kg; 8:1, 189.5kg.
[9] I would argue that the Roman approach to Italy from 509 to 218 BC appears to be an exception to this rule: the Romans do tend to use conquered land to set up large numbers of small landholding farms. Not rich peasants, but the Roman military class – the assidui farmer-citizen-soldiers – were also clearly not utterly impoverished either. It’s striking that the Romans could have set up a system of rents and tribute extraction in Italy but didn’t, instead effectively terraforming the Italian countryside into a machine for the production of heavy infantry. That heavy infantry in turn bought the Romans stunning military superiority, which they then used in the second and first centuries BC to create an enormous system of tribute and extraction (rather than extending the approach they had used in Italy).
[10] Of course you can drain a swamp, but such drainage efforts are the kinds of things large, well-administered states do, not the sort of thing your local peasants can summon the labor for.
[11] On Roman deployments, see Taylor, Soldiers and Silver (2020).
[12] The way this worked structurally and legally was that the king, directly or indirectly, owned all the land (‘spear-won’) and so many taxes were instead technically ‘rents’ paid to the king.
[13] On the Seleucid taxation system, see Aperghis, The Seleucid Royal Economy (2004). For an overview of the relatively similar Ptolemaic system, see von Reden, Money in Ptolemaic Egypt (2007) and Préaux, L’économie royale des Lagides (1979).
[14] The abolition of this specific form of slavery (but not others) is a key political moment in the development of both Rome and Athens (and, we may assume, many other Greek poleis) that signals the political importance of the smallholding farmer-citizens and their ability to compel major reforms. But the Big Man can still seize your farm!
gpt-5 and gpt-5-mini rate limit updates
Simon Willison
simonwillison.net
2025-09-13 00:14:46
gpt-5 and gpt-5-mini rate limit updates
OpenAI have increased the rate limits for their two main GPT-5 models. These look significant:
gpt-5
Tier 1: 30K → 500K TPM (1.5M batch)
Tier 2: 450K → 1M (3M batch)
Tier 3: 800K → 2M
Tier 4: 2M → 4M
gpt-5-mini
Tier 1: 200K → 500K (5M batch)
As a reminder, those tiers are assigned based on how much money you have spent on the OpenAI API – from $5 for tier 1 up through $50, $100, $250 and then $1,000 for tier 5.
For comparison, Anthropic's current top tier is Tier 4 ($400 spent) which provides 2M maximum input tokens per minute and 400,000 maximum output tokens, though you can contact their sales team for higher limits than that.
Gemini's top tier is Tier 3, for $1,000 spent, and currently gives you 8M TPM for Gemini 2.5 Pro and Flash and 30M TPM for the Flash-Lite and 2.0 Flash models.
So OpenAI's new rate limit increases for their top performing model pull them ahead of Anthropic but still leave them significantly behind Gemini.
GPT-5 mini remains the champion for smaller models with that enormous 180M TPM limit for its top tier.
Windows 98 runs surprisingly well in QEMU via UTM SE, but it requires some care in setting it up. It’s a great way to run old 90s Windows and DOS software on your iPad (and Mac too, though you have other options available to you, or an iPhone if you don’t mind the HID difficulties).
This post provides some suggestions and tips for installing Windows and selecting the best emulated devices. The guidance is intended for UTM users on Apple platforms, but should apply to anything QEMU based (or QEMU itself). The advice might also be useful for other operating systems in UTM/QEMU as well.
Plug and play BIOS issues (or: how to install with ACPI)
When you install Windows 9x, PCI devices might be broken, and you’ll see a Plug and Play BIOS device with problems in the device manager:
This seems to be a bug in SeaBIOS or QEMU; I haven’t yet seen an issue tracking it. Many guides (e.g. this one or this one) suggest changing the device and hoping devices re-enumerate correctly. However, there’s a simpler method available when using Windows 98 SE. (If you’re using Windows 95, you won’t be able to do this.)
Windows 98 can use ACPI to enumerate devices instead of the legacy PnP BIOS. Unfortunately, it doesn’t use ACPI by default. (There seems to be an allowlist of known good ACPI BIOSes, as it was early days for ACPI.) To make it use ACPI anyways, boot with CD-ROM support from the Windows 98 CD instead of running the installer, then run Windows setup with the /p j flag, like so:
C:\> D:
D:\> cd WIN98
D:\WIN98> setup /p j
It’s possible to convert an existing system to ACPI, but it’s much easier to do this from the start. When Windows is installed this way, it should correctly enumerate all devices.
Device selection
System
QEMU can emulate devices Windows 98 supports out of the box, which is good as there are no VirtIO drivers. Make sure you’re using the i440FX-based “pc” machine rather than the Q35-based one, as it’ll be better supported for legacy systems. You don’t need to worry about selecting i386 vs. x86_64, as Windows 98 will obviously never touch 64-bit mode, so they’ll be the same.
(As a tip, if you’re running NT 4, you’ll need to select a different CPU to make sure it’s happy with the CPU flags as the default one is too new. A Pentium II should be sufficiently old.)
Input
You may need to disable USB (or at least, USB input devices) to avoid hanging on startup, at least with UTM. (It’s possible the ‘Force PS/2 Controller’ option might work, but I haven’t had much luck with it.) Unfortunately, this means you won’t have absolute mouse input (through the USB tablet) and must capture your cursor. With UTM SE on an iPad, this doesn’t hurt as much, as it can automatically capture the trackpad or external mouse, while leaving the touchscreen for interacting with iOS.
Video
The most sensible video option for Windows 98 is the Cirrus VGA (-vga cirrus). There are unfortunately some bugs (flashing in 16-bit colour modes, blitting issues in 8-bit colour modes), but it’s the only option with accelerated drivers out of the box. (Of course, there is no 3D acceleration with such a card.)
Apparently, Rage 128 emulation is being worked on (ati-vga), but it currently only works for Power Mac emulation, and is in rough shape so far.
Networking and getting files in
For getting files into the VM easily, you’ll want a network. SLiRP NAT works fine for using a browser or SMB shares, for example. (Note this works better on Windows 98 than 95; 95 has issues with mounting SMB shares by IP and doesn’t come with a browser.) QEMU can emulate a variety of network cards. The tulip (DC2114x), NE2000 (PCI and ISA), and PCNet should all work out of the box with older Windows. I’d recommend using a PCI card if possible, since it saves you the ISA setup headache unless you need it for something old. If you do need to set up an ISA NE2000, it’s at address 300h, IRQ 9, which might require manual configuration in some cases.
Sound
For sound hardware, there are a few options available, with different tradeoffs.
If you want to run DOS software, the SoundBlaster 16 (sb16) emulation works out of the box, but there is no OPL3 or MPU-401, so MIDI won’t work correctly, just PCM. Games will have a hard time with this unless they’re entirely PCM. For setting up your SB16 for DOS games, use SET BLASTER=A220 I5 D1 H5 P330 T5 (that’s address 220h, interrupt 5, 8-bit DMA 1, our non-existent MPU-401 at 330h, 16-bit DMA 5).
Note that QEMU supports adding an AdLib (OPL2 based) separately, which might help with some software.
The CS4231A I haven’t tested, but it might work with WSS or Crystal-specific drivers. As with the SoundBlaster 16, there is no OPL3. QEMU sets this up at 534h, IRQ 9, DMA 3.
The Gravis UltraSound (gus) emulation works surprisingly well, but the Windows 95 drivers are crusty for the version of the card it emulates (GF1/GUS Classic), so use it only if you want to run old trackers or demoscene stuff. Note you may need to turn off the LPT port (-parallel none) to free up an interrupt used for the UltraSound.
Because of this, the ES1370 might be the best card to emulate for plain Windows usage, as it has relatively few quirks and I believe has drivers on the Windows 98 CD. However, it’s not ideal for DOS software as it requires TSRs to make it work right.
The AC97 emulation will require Realtek drivers. I haven’t tested this.
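Putting the device suggestions together, here is a sketch of one way to launch such a VM with plain QEMU, written as a Python argument list for easy tweaking; the disk and ISO paths, memory size and the coreaudio audio backend are placeholders for your own setup, and UTM users would set the equivalent options in the UI rather than running this directly.
# Hypothetical example assembling the device choices discussed above.
import subprocess

cmd = [
    "qemu-system-i386",
    "-M", "pc",                        # i440FX "pc" machine, not Q35
    "-cpu", "pentium2",                # an old-enough CPU model (mostly matters for NT 4)
    "-m", "256",
    "-vga", "cirrus",                  # accelerated drivers ship with Windows 98
    "-nic", "user,model=ne2k_pci",     # PCI NE2000 over SLiRP NAT
    "-audiodev", "coreaudio,id=snd0",  # swap for your host's audio backend
    "-device", "ES1370,audiodev=snd0", # least-quirky sound option for Windows use
    "-parallel", "none",               # frees an IRQ (needed if you use the GUS instead)
    "-hda", "win98.img",               # placeholder disk image
    "-cdrom", "win98se.iso",           # placeholder install CD
    "-boot", "d",                      # boot the CD for the initial `setup /p j` install
]                                      # note: no -usb, per the input advice above
subprocess.run(cmd, check=True)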
Potpourri
In UTM, you may want to turn off the entropy device, to reduce unknown device clutter in Device Manager, though it’s harmless. The VirtIO console device will still be present in Device Manager with UTM’s default flags.
Other quirks
In UTM SE, sometimes rebooting might hang when switching video modes. If this happens, it seems safe to shut down the machine and start it again. Avoiding reboots in favour of shutting down seems wise.
Performance characteristics
While TCG in QEMU doesn’t have the best reputation for performance, it might be good enough for your needs. On my MacBook Pro with an M1 Pro, benchmarks show performance somewhere around a 750 MHz Pentium III, albeit with worse floating point performance. This is pretty usable, although most 3D games won’t be, as even software rendering will be a bit sketchy.
If you’re using UTM SE on iOS, the interpreter is slower, but not unusable for 90s software. On my M1 iPad Pro, I get Pentium 100 performance, with similar penalties for FP. This is good for games up to about 1995 or 1996; titles like MechWarrior 2, Widget Workshop, many edutainment titles, and SimCity 2000 are playable this way, though MIDI or CD music will be missing. Non-game software like Office 97 or Visual C++ will run fine, of course. For OSes, this also puts things like Windows 2000 and beyond just out of reach performance wise – stick with Windows 98 for the best compatibility.
Beyond the Vibe: Five Conversations For Building Coalitions
The trick with Claude Code is to give it large, but not too large, extremely well defined problems.
(If the problems are too large then you are now vibe coding… which (a) frequently goes wrong, and (b) is a one-way street: once vibes enter your app, you end up with tangled, write-only code which functions perfectly but can no longer be edited by humans. Great for prototyping, bad for foundations.)
— Matt Webb, What I think about when I think about Claude Code
Afro-Chilean artist Nekki has been spreading her anti-racist message through her reggae-rap lyrics for years, but recently, she feels like she is being blocked from reaching a wider audience.
She blames artificial intelligence.
Music streaming platforms have become so crowded with bot-built beats that it’s becoming harder for humans to stand out, she said, leaving artists fewer listeners and less money.
Nekki, a musician from Chile, has been struggling to compete with AI-generated music on streaming platforms. Andie Borie
“It’s a new form of danger disguised as technological innovation,” she told Rest of World.
Breaking through the noise to find an audience has always been tough in music, but AI is now making it nearly impossible, musicians in Latin America say.
AI-generated tunes are crowding streaming platforms, and they don’t discriminate — there is AI-generated music for all genres of Latin music, including bachata, merengue, and dembow. The music isn’t great, but it still sucks away limited streaming income from real artists.
The speed and volume of new AI music is now exhausting human artists and distracting listeners, said people in the Latin music industry.
Even reggaeton superstar Bad Bunny has had to fight back against AI-generated music. Someone cloned his voice to create a song that temporarily reached a top 100 ranking on Spotify in Chile before it was removed from the platform.
Excerpt of an original track by Tito Molina 👁️ featuring AI-generated vocals in the style of Bad Bunny. (License: CC BY-NC-SA 3.0)
Musicians are struggling to be heard above the growing cacophony of AI-generated songs on Spotify, Deezer, and YouTube Music. The platforms are weeding out certain AI content, but at the same time, they are promoting other types of AI.
Spotify, for example, offers a Spanish-speaking AI DJ that provides users with a personalized stream featuring commentary.
For the individuals programming pure AI music, it is a quantity rather than a quality game. They’re trying anything to see if it will get enough listens to generate a few pesos. The AI tunes don’t have to go viral to be profitable, as they cost next to nothing to produce.
AI music producers don’t care about music or culture, said Mark Meyer, founder of Paraguayan aggregator Random Sounds — a company that has distributed human-made music to streaming platforms for 11 years.
“Most people generating AI music aren’t musicians,” he told Rest of World. “They’re people who want to do business on the internet.”
From Mexico to Argentina, independent musicians no longer just compete against mainstream artists with big platforms, marketing teams, and millions of streams, like Karol G or Shakira. Now they must also compete with AI-generated music that is produced in minutes — not months, like theirs.
18% — the share of uploads to Deezer in April that were AI-generated.
A study by the International Confederation of Societies of Authors and Composers projects this will worsen, with AI-powered generative music accounting for approximately 20% of music streaming platform revenue by 2028.
Approximately 100,000 tracks are uploaded to Deezer daily. In January, the company detected that around 10% of the content was generated by AI, and by April, the percentage had increased to 18%.
“We are seeing this trend continue to increase,” Aurélien Hérault, chief innovation officer at Deezer, told Rest of World.
Paraguayan musician Sari Carri believes AI could jeopardize her chances of becoming a pop star. Her laid-back vocals over tracks that blend indie and electronica with Latin rhythms have a niche following in her hometown of Asunción but have yet to gain traction on streaming platforms.
She just finished a new single and loved wrapping up the creative process — writing, rewriting, recording, listening. However, the artist dreads the exhausting task of getting her music heard.
Sari Carri, a musician from Paraguay, believes AI will limit her chances of becoming a pop star. Leonor Blas for Rest of World
Creating new music is the fun part; now she has to keep it alive.
“What wears me out most is trying to make sure the life of those singles doesn’t end in a month and doesn’t get forgotten in two weeks,” she said. “Instead of using my creativity in producing songs, I’m using it to make reels launch campaigns before, during, and after, so they won’t be forgotten.”
Like many independent artists, Carri invests hundreds of dollars and weeks of effort into each single. Her returns from streaming platforms are just pennies on the dollar. In five years on Spotify, she has earned around $100.
Her AI competitors can make music with simple subscriptions to tools like Suno or Udio.
Meanwhile, the AI has often been trained on music others have produced. The U.S. music industry
sued
both AI music companies for training their models on copyrighted works without permission.
Carri said the amount of time her new songs get attention has been shrinking in recent months as the number of AI songs has surged.
Even as she works harder and spends more to promote her work, listeners are moving on sooner, she said. Listeners used to enjoy her songs for months. Today, they stop clicking on her latest songs within weeks.
“Songs’ lifespans are shorter, and information is retained less and less,” she said. “Competing against that is costing me a lot. I have to constantly remind people that I’m doing something so my listener numbers don’t drop.”
Artists across the region face the same struggle.
Claudia Lizardo, a Venezuelan pop and folk artist based in Mexico, describes the pressure: “You have to have presence and sustain it so you don’t expire,” she told Rest of World. “That’s work.”
Sara Curruchich, a Maya artist from Guatemala who sings in Kaqchikel, thinks AI makes it more difficult to discover unique songs amid the surge in the volume of new songs.
“Wonderful musical projects become invisible,” she said.
Heartgaze, an Argentine artist and producer of urban music, said it’s getting tougher to stand out amid all the AI noise. “It took me 10 years to be able to make a living from music,” he told Rest of World. “Now we also have to compete with music created like this.”
Paraguayan aggregator founder Meyer said streaming artists just aren’t making as much money as they used to. “If there are royalties from bots or AI music, they’re stealing from everyone else.”
Streaming platforms, artists, and their managers are adjusting.
Deezer has developed AI content detection tools and systems, excluding some AI-generated music from editorial and algorithmic recommendations. Spotify is part of the Music Fights Fraud Alliance, an industry initiative aimed at combating streaming fraud. Meyer said his aggregator Random Sounds also hires staff to listen to all the songs he uploads to streaming platforms.
Still, experts say human tools and hearing aren’t sophisticated enough to catch all the AI.
Artists are fighting back.
Carri has earned around $100 from Spotify over the past five years. Leonor Blas for Rest of World
Artists were already adapting to streaming platform algorithms before AI became a major concern. A 2024 study by sociologist Arturo Arriagada of independent musicians in Chile revealed that around 67% of artists have changed their publishing habits for Spotify, adjusting song durations, collaborating strategically, and creating content for social media. They’re also taking on curatorial roles themselves, creating regional and collaborative playlists to gain visibility.
Musicians are putting out single songs every four to six weeks instead of full albums. A growing number are giving up on ever making money from their music online. Some are focusing on live performances as well as old-school vinyl albums and cassettes.
Several collective management associations in Latin America are advocating for the regulation of AI in the creative industry.
“Artificial intelligence is trained with our work, with existing musical and vocal arrangements, without our authorization and without paying us,” said
Alberto Laínez
, a Honduran singer-songwriter who creates music about environmental conservation and Indigenous communities.
But leaving streaming platforms doesn’t feel like a real option for most artists. Streaming numbers have become a metric for accessing markets, venues, and even public funding in several Latin American countries.
Many musicians are utilizing AI tools for administrative tasks and to enhance their online presence.
Carri said she uses artificial intelligence for planning posts or for developing communication ideas. She also has a side hustle: teaching. She won’t let the bots touch her beats, though.
“It’s a tool that shouldn’t be allowed to influence one’s art,” she said.
Nekki is embracing small performances and interacting directly with her fans.
“AI can make music very similar to mine, but it will never capture the experience, the tenderness, and the context behind it,” she said.
Assessing the Quality of Dried Squid
Schneier
www.schneier.com
2025-09-12 22:05:12
Research:
Nondestructive detection of multiple dried squid qualities by hyperspectral imaging combined with 1D-KAN-CNN
Abstract: Given that dried squid is a highly regarded marine product in Oriental countries, the global food industry requires a swift and noninvasive quality assessment of this prod...
Nondestructive detection of multiple dried squid qualities by hyperspectral imaging combined with 1D-KAN-CNN
Abstract:
Given that dried squid is a highly regarded marine product in Oriental countries, the global food industry requires a swift and noninvasive quality assessment of this product. The current study therefore uses visible–near-infrared (VIS-NIR) hyperspectral imaging and deep learning (DL) methodologies. We acquired and preprocessed VIS-NIR (400–1000 nm) hyperspectral reflectance images of 93 dried squid samples. Important wavelengths were selected using competitive adaptive reweighted sampling, principal component analysis, and the successive projections algorithm. Based on a Kolmogorov-Arnold network (KAN), we introduce a one-dimensional KAN convolutional neural network (1D-KAN-CNN) for nondestructive measurements of fat, protein, and total volatile basic nitrogen….
Interesting analysis:
When cyber incidents occur, victims should be notified in a timely manner so they have the opportunity to assess and remediate any harm. However, providing notifications has proven a challenge across industry.
When making notifications, companies often do not know the true iden...
When cyber incidents occur, victims should be notified in a timely manner so they have the opportunity to assess and remediate any harm. However, providing notifications has proven a challenge across industry.
When making notifications, companies often do not know the true identity of victims and may only have a single email address through which to provide the notification. Victims often do not trust these notifications, as cyber criminals often use the pretext of an account compromise as a phishing lure.
[…]
This report explores the challenges associated with developing the native-notification concept and lays out a roadmap for overcoming them. It also examines other opportunities for more narrow changes that could increase the likelihood that victims will both receive and trust notifications and be able to access support resources.
The report concludes with three main recommendations for cloud service providers (CSPs) and other stakeholders:
Improve existing notification processes and develop best practices for industry.
Support the development of “middleware” necessary to share notifications with victims privately, securely, and across multiple platforms including through native notifications.
Improve support for victims following notification.
While further work remains to be done to develop and evaluate the CSRB’s proposed native notification capability, much progress can be made in the near term if cloud service providers and other stakeholders implement better notification and support practices.
Meow aims to blend modal editing into Emacs with minimal interference
with its original key-bindings, avoiding most of the hassle introduced
by key-binding conflicts. This leads to lower necessary configuration and
better integration. More is achieved with fewer commands to remember.
Key features compared to existing solutions:
Minimal configuration – build your own modal editing system
Proton Mail Suspended Journalist Accounts at Request of Cybersecurity Agency
Intercept
theintercept.com
2025-09-12 21:56:29
The journalists were reporting on suspected North Korean hackers. Proton only reinstated their accounts after a public outcry.
The post Proton Mail Suspended Journalist Accounts at Request of Cybersecurity Agency appeared first on The Intercept....
The company behind
the Proton Mail email service, Proton,
describes itself
as a “neutral and safe haven for your personal data, committed to defending your freedom.”
But last month, Proton disabled email accounts belonging to journalists reporting on security breaches of various South Korean government computer systems, following a complaint by an unspecified cybersecurity agency. After multiple weeks and a public outcry, the journalists’ accounts were eventually reinstated — but the reporters and editors involved still want answers about how and why Proton decided to shut down the accounts in the first place.
Martin Shelton, deputy director of digital security at the Freedom of the Press Foundation, highlighted that numerous newsrooms use Proton’s services as alternatives to something like Gmail “specifically to avoid situations like this,” pointing out that “While it’s good to see that Proton is reconsidering account suspensions, journalists are among the users who need these and similar tools most.” Newsrooms like The Intercept, the Boston Globe, and the Tampa Bay Times all rely on Proton Mail for
emailed tip submissions
.
Shelton noted that perhaps Proton should “prioritize responding to journalists about account suspensions privately, rather than when they go viral.”
On Reddit, Proton’s official account
stated
that “Proton did not knowingly block journalists’ email accounts” and that the “situation has unfortunately been blown out of proportion.” Proton did not respond to The Intercept’s request for comment.
The two journalists
whose accounts were disabled were working on an
article
published in the August issue of the long-running hacker zine Phrack. The story described how a sophisticated hacking operation — what’s known in cybersecurity parlance as an APT, or advanced persistent threat — had wormed its way into a number of South Korean computer networks, including those of the Ministry of Foreign Affairs and the military Defense Counterintelligence Command, or DCC.
The journalists, who published their story under the names Saber and cyb0rg, describe the hack as being consistent with the work of Kimsuky, a notorious North Korean state-backed APT
sanctioned
by the U.S. Treasury Department in 2023.
As they pieced the story together, emails viewed by The Intercept show that the authors followed cybersecurity best practices and conducted what’s known as responsible disclosure: notifying affected parties that a vulnerability has been discovered in their systems prior to publicizing the incident.
Saber and cyb0rg created a dedicated Proton Mail account to coordinate the responsible disclosures, then proceeded to notify the impacted parties, including the Ministry of Foreign Affairs and the DCC, and also notified South Korean cybersecurity organizations like the Korea Internet and Security Agency, and
KrCERT/CC
, the state-sponsored Computer Emergency Response Team. According to emails viewed by The Intercept, KrCERT wrote back to the authors, thanking them for their disclosure.
A note on cybersecurity jargon: CERTs are agencies consisting of cybersecurity experts specializing in dealing with and responding to security incidents. CERTs exist in over 70 countries — with some countries having multiple CERTs each specializing in a particular field such as the financial sector — and may be government-sponsored or private organizations. They adhere to a set of formal technical
standards
, such as being expected to react to reported cybersecurity threats and security incidents. A high-profile example of a CERT agency in the U.S. is the Cybersecurity and Infrastructure Security Agency (CISA), which has recently been
gutted
by the Trump administration.
A week after the print issue of Phrack came out, and a few days before the digital version was released, Saber and cyb0rg found that the Proton account they had set up for the responsible disclosure notifications had been suspended. A day later, Saber discovered that his personal Proton Mail account had also been suspended. Phrack posted a timeline of the account suspensions at the top of the published article, and later highlighted the timeline in a viral social media
post
. Both accounts were suspended owing to an unspecified “potential policy violation,” according to screenshots of account login attempts reviewed by The Intercept.
The suspension notice instructed the authors to fill out
Proton’s abuse appeals form
if they believed the suspension was in error. Saber did so, and received a reply from a member of Proton Mail’s Abuse Team who went by the name Dante.
In an email viewed by The Intercept, Dante told Saber that their account “has been disabled as a result of a direct connection to an account that was taken down due to violations of our terms and conditions while being used in a malicious manner.” Dante also provided a link to
Proton’s terms of service
, going on to state, “We have clearly indicated that any account used for unauthorized activities, will be sanctioned accordingly.” The response concluded by stating, “We consider that allowing access to your account will cause further damage to our service, therefore we will keep the account suspended.”
On August 22, a Phrack editor reached out to Proton, writing that no hacked data had passed through the suspended email accounts, and asked whether the account suspensions could be de-escalated. After receiving no response from Proton, the editor sent a follow-up email on September 6. Proton once again did not reply.
On September 9, the official Phrack X account made a
post
asking Proton’s official account why Proton was “cancelling journalists and ghosting us,” adding: “need help calibrating your moral compass?” The post quickly went viral, garnering over 150,000 views.
Proton’s official account replied the following day,
stating
that Proton had been “alerted by a CERT that certain accounts were being misused by hackers in violation of Proton’s Terms of Service. This led to a cluster of accounts being disabled. Our team is now reviewing these cases individually to determine if any can be restored.” Proton then
stated
that they “stand with journalists” but “cannot see the content of accounts and therefore cannot always know when anti-abuse measures may inadvertently affect legitimate activism.”
Proton did not publicly specify which CERT had alerted them, and didn’t answer The Intercept’s request for the name of the specific CERT which had sent the alert. KrCERT also did not reply to The Intercept’s question about whether they were the CERT that had sent the alert to Proton.
Later in the day, Proton’s founder and CEO Andy Yen
posted
on X that the two accounts had been reinstated. Neither Yen nor Proton explained why the accounts had been reinstated, whether they had been found not to violate the terms of service after all, why they had been suspended in the first place, or why a member of the Proton Abuse Team reiterated during Saber’s appeal that the accounts had violated the terms of service.
Phrack noted that the account suspensions created a “real impact to the author. The author was unable to answer media requests about the article.” The co-authors, Phrack pointed out, were also in the midst of the responsible disclosure process and working together with the various affected South Korean organizations to help fix their systems. “All this was denied and ruined by Proton,” Phrack stated.
Phrack editors said that the incident leaves them “concerned what this means to other whistleblowers or journalists. The community needs assurance that Proton does not disable accounts unless Proton has a court order or the crime (or ToS violation) is apparent.”
I used standard Emacs extension-points to extend org-mode
A battery plant co-owned by Hyundai Motor is facing a minimum startup delay of two to three months following an immigration raid last week, Hyundai CEO Jose Munoz said on Thursday.
The Georgia plant, which is operated through a joint venture between Hyundai and South Korea's LG Energy Solution, was at the center of the largest single-site enforcement operation in the U.S. Department of Homeland Security's history last week.
Munoz, in his first public comments since the raid, said he was surprised when he heard the news and immediately inquired if Hyundai workers were involved. He said the company discovered that the workers at the center of the raid were mainly employed by suppliers of LG.
Grimoire CSS
is a comprehensive CSS engine crafted in Rust,
focusing on unmatched flexibility, reusable dynamic styling, and optimized performance for every environment. Whether you need filesystem-based CSS generation or pure in-memory processing, Grimoire CSS adapts to your needs without compromising on performance or features.
Everything in Its Place
True CSS engine.
Exceptionally powerful and endlessly flexible. Independent and self-sufficient. No bundlers, optimizers, deduplicators, preprocessors or postprocessors required — it outputs final CSS on its own.
Performance.
Every part is optimized for maximum efficiency, outperforming any specialized tools. It processes almost
200k classes per second
. It is
5× faster
and
28× more efficient
than TailwindCSS v4.x. All while truly generating CSS.
Universality.
The native parser handles any source file without plugins or configuration, running in both filesystem-based and in-memory modes. Available as a standalone binary, a Rust crate, and an npm library.
Intelligent CSS Generation.
Respects the CSS cascade and applies necessary vendor prefixes. Uses your
.browserslistrc
to guarantee target browser support.
Spell and Scroll Systems.
Turn CSS into your personal styling language — no arbitrary class names, no hidden abstractions. Write clear
property=value
Spells, complete with
area__
,
{focus}
and
effect:
modifiers for breakpoints, selectors and pseudo-classes. Bundle them into Scrolls — named, parameterized, inheritable style modules for consistent systems at any scale.
Color Toolkit.
A powerful module compliant with CSS Color Module Level 4, enabling precise and high-performance color manipulations. Grimoire CSS also serves as a standalone color toolkit, with its color features available via a public API.
Configuration.
Grimoire CSS uses a single JSON configuration file per repository. Its format is straightforward yet robust, supporting monorepos with hundreds of projects or individual configurations — variables, scrolls, generation modes, shared and critical CSS, external files — all out of the box.
Effortless Migration.
The
Transmutator
, available as a CLI tool or Web UI, simplifies converting any CSS to the Spell format. Migrate entire projects to Grimoire CSS without changing your component class names, benefiting from the engine’s power immediately, even with a gradual transition.
A Spell System
At the heart of Grimoire CSS lies the
Spell
, the foundational entity of the system.
Spell
takes a different approach from traditional utility classes, like those you’d find in Tailwind. While utilities in Tailwind feel like slightly enhanced Bootstrap classes, Grimoire CSS takes things to a new level. In Tailwind, you’re expected to memorize arbitrary names like
rounded-md
for
border-radius: 0.375rem
- which doesn’t even make things look rounded. And then there’s
tracking-tight
for
letter-spacing: -0.025em
. How are you supposed to know that’s related to letter spacing?
Grimoire CSS cuts through that confusion by introducing
Spell
- an approach that is both simple and infinitely flexible. At its core, a
Spell
is just a CSS declaration, written in a format everyone understands:
property=value
. For example,
border-radius: 0.375rem
in Grimoire CSS becomes
border-radius=0.375rem
. If you prefer something shorter,
bd-rad=0.375rem
works too, or even
bd-rad=.375rem
(yes, Grimoire CSS respects CSS’s own shorthand capabilities). Unlike pre-baked utility classes,
Spells
follow the natural structure of CSS:
property: value
becomes
component=target
.
If you don’t know any shorthands yet, you can always write out full components (each full component directly maps to its corresponding CSS property) and then run the
shorten
command to convert all full components in your files (those defined in your config) into their shorthand forms. Easy as it should be!
This isn’t just another syntax. It’s the whole system reimagined. You’re free to write any value in the target, whether it’s custom units, functions, or even complex animations. Everything CSS supports is fair game, and all you need to do is escape spaces with underscores (
_
). That’s it. Of course, we didn’t stop at the basics. Spells also introduce
optional enhancements
:
area
,
focus
, and
effects
, which give you deeper control over media queries, pseudo-classes, attributes, and more.
area
: The
area
defines conditions like screen size and sits at the start of your spell, separated from the rest by double underscores (
__
). For example,
(width>=768px)__bd-rad=0.375rem
will activate the rule only for screens at least 768px wide. Prefer a shorthand? You can use built-in names like
md__bd-rad=0.375rem
. It’s still valid CSS, but with all the magic of
Spell
.
focus
: Sometimes, you need more than a class or a media query.
focus
lets you wrap anything - attributes, pseudo-classes, or nested selectors - inside your spell. Placed as the second part of the spell (or first if there’s no
area
), it’s enclosed in curly brackets. For example:
{[hidden]_>_p:hover:active}color=red
becomes this CSS:
... [hidden] > p:hover:active { color: red;}
It’s not just readable - it’s intuitive. What you see is exactly what you get.
effects
: Sometimes, you need quick pseudo-classes without the full complexity of
focus
. That’s where
effects
come in. Just add pseudo-classes directly in the spell like this:
hover,active:color=blue
. With
effect
, you keep it compact without losing any power. Simply separate it from the
component
and
target
with a colon (
:
).
The entire
Spell
system is built on clarity and explicitness. There are no magical, arbitrary strings for targets like you find in other systems. And we don’t compromise on clarity for the sake of brevity. Targets are full, valid CSS values - because that’s how it should be. Components mirror actual CSS properties, but they can be shortened if you want. In this way, Grimoire CSS is both a
CSS declaration
and a
methodology
. It’s so powerful because every
Spell
is valid CSS - there’s no abstraction that gets in the way of what you need to achieve.
So, why call it a
Spell
? Because, like magic, it’s composed of multiple elements:
area
,
focus
,
effect
,
component
, and
target
. And each of these pieces works together to create something far greater than the sum of its parts. With Grimoire CSS, you’re not just writing styles - you’re casting spells. The name
Grimoire
comes from ancient magical texts. Just as those books hold the knowledge to perform spells, Grimoire CSS provides you the knowledge and tools to perform CSS magic - without relying on pre-baked solutions. You’re in full control.
Recap
The structure of a spell follows this format:
area__{focus}component=target
or
area__effect:component=target
.
Use dashes (
-
) to separate words.
Use underscores (
_
) to escape spaces.
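To make the recap concrete, here is a small hypothetical markup snippet; the spells in it are made up for illustration and are not part of any preset:

```html
<!-- Each class is a spell: area__{focus}component=target or area__effect:component=target. -->
<button class="bd-rad=0.375rem
               md__{[disabled]}opacity=0.5
               hover,active:color=blue">
  Cast a spell
</button>
```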
A
Scroll
is like a
Spell
, but with one crucial difference - it’s something you build from scratch. Think of it as a customized collection of styles, bundled into one reusable class. Sometimes, you need to combine multiple styles into a single class for consistency, reusability, or just to make your life easier. With
Scroll
, you can do just that. Combine spells, give your new creation a name, and you’ve got a
Scroll
ready to use across your projects.
And here’s the best part: everything you love about
Spells
works seamlessly with
Scrolls
too -
area
,
focus
,
effect
, and even
target
. But there’s even more: when you define a
Scroll
, you can introduce
variables
to make your styles dynamic. Just use the
$
symbol, and the
target
becomes a placeholder, waiting for the actual value to be filled in. Want to create a button class that accepts variable values? No problem. Here’s an example:
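The example itself isn’t reproduced here, so the following is a minimal sketch of what such a scroll could look like in grimoire.config.json. The top-level scrolls key, the component choices, and the values are assumptions made for illustration; only the spells list and the $ placeholder mechanism come from the text:

```json
{
  "scrolls": {
    "btn": {
      "spells": [
        "display=inline-flex",
        "padding=$",
        "background-color=$",
        "color=$",
        "border-radius=$"
      ]
    }
  }
}
```

The four $ placeholders are filled, in order, by the four target values supplied wherever the scroll is used.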
This
btn
scroll expects four target values, and if you pass fewer or more, Grimoire CSS will kindly let you know. The targets are applied in order, giving you incredible flexibility. But we’re not done yet.
One of the most exciting aspects of
Scrolls
is
inheritance
. Yes, you can extend a
Scroll
with another
Scroll
. Combine and compose them endlessly to create complex, reusable styles. Let’s take a look:
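The configuration being discussed isn’t shown here, so this is a hypothetical sketch; the extends key name and the spell bodies are assumptions for illustration:

```json
{
  "scrolls": {
    "btn": {
      "spells": ["display=inline-flex", "padding=8px_16px", "cursor=pointer"]
    },
    "round": {
      "spells": ["border-radius=9999px"]
    },
    "danger-btn": {
      "extends": ["btn"],
      "spells": ["background-color=crimson", "color=white"]
    },
    "danger-btn-rnd": {
      "extends": ["danger-btn", "round"],
      "spells": []
    }
  }
}
```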
In this example,
danger-btn
extends
btn
, meaning it inherits all of
btn
’s spells plus its own. So,
danger-btn.spells
will look like
btn.spells
+
danger-btn.spells
, with the parent scroll’s styles taking priority at the top.
But the fun doesn’t stop there -
danger-btn-rnd
extends both
danger-btn
and
round
. This means that
danger-btn-rnd.spells
equals
btn.spells
+
danger-btn.spells
+
round.spells
, combined in the correct order. And yes, the order matters. This layered inheritance allows you to build complex style structures effortlessly.
The real magic of
Scrolls
lies in their
unlimited possibilities
. You can chain styles together, extend them endlessly, and define variables as placeholders to create flexible, reusable patterns across your entire project. With
Scrolls
, Grimoire CSS goes far beyond being Yet Another CSS Framework. In fact, you could even recreate the entire structure of Tailwind or Bootstrap using nothing but the flexibility of Spells and Scrolls.
It’s pure, beautiful madness - without limits.
Variables and Built-in Functions: Total Control Over Styles and Sizes
Grimoire CSS allows you to define your own variables within its settings, making your styling even more dynamic and customizable. Unlike CSS custom properties, these variables are not emitted into your stylesheet; they live in your settings and are compiled only where they are used - keeping your CSS clean and efficient.
How to Use Variables
You can define
any value
as a variable - font sizes, colors, dimensions, anything. To reference them in your styles, just add the
$
symbol before the variable name (you’ll remember this from the
Scroll
section). Here’s how you define and use a variable:
Defining a Variable
{ "variables": { "hero-fs": "42px" }}
Using the Variable
<h1 class="font-size=$hero-fs">Hero text</h1>
In this example, the
hero-fs
variable holds the value
42px
, which is then applied to the
font-size
of the
<h1>
element. Variables in Grimoire CSS offer a simple and effective way to maintain consistency across your styles, while keeping your code flexible and DRY.
Built-in Areas: Responsive Design, Simplified
Grimoire CSS follows a mobile-first approach and comes with
built-in responsive areas
, including
sm
,
md
,
lg
,
xl
, and
2xl
. When you define a spell with one of these areas, like
md__width=100px
, the spell will apply only when the screen width is equal to or greater than the specified area.
For example,
md__width=100px
is equivalent to this media query:
(width>=768px)__width=100px
.
Of course, you’re not limited to the built-in areas. You can define your own media queries just as easily, like this:
(width>666px)__width=100px
With these areas, you have full control over your responsive design, but without the hassle of constantly writing and rewriting media queries.
Adaptive Size Functions:
mrs
and
mfs
Grimoire CSS takes responsive design even further with built-in functions like
mrs
(
M
ake
R
esponsive
S
ize) and
mfs
(
M
ake
F
luid
S
ize). These functions allow you to adapt font sizes, widths, and more based on the viewport size.
mrs
: Make Responsive Size
This function dynamically adjusts the size of an element between a minimum and maximum value, depending on the viewport width. Here are the arguments:
min_size
: The minimum size for the element.
max_size
: The maximum size for the element.
min_vw
: (Optional) The minimum viewport width.
max_vw
: (Optional) The maximum viewport width.
Example Usage of
mrs
<p class="font-size=mrs(12px_36px_480px_1280px)"> Font size of this text will dynamically change based on the screen size</p>
In this example, the font size will automatically adjust between 12px and 36px, depending on the screen size, with fluid adjustments in between. This makes responsive design not only easier but more precise, without the need for complex calculations or multiple breakpoints.
mfs
: Make Fluid Size – Creates fully fluid sizes without media queries for seamless scaling
Here are the arguments:
min_size
: The minimum size for the element.
max_size
: The maximum size for the element.
Example Usage of
mfs
<p class="font-size=mfs(12px_36px)"> Font size smoothly scales between 12px and 36px based on the viewport size.</p>
The Power of Grimoire’s Variables and Functions
With Grimoire CSS, you don’t just write styles - you take control of them. By leveraging variables, responsive areas, and adaptive size functions, you can make your CSS dynamic, scalable, and ready for any device or screen size. It’s flexibility without the fuss, and it’s all built right in.
Grimoire CSS doesn’t just give you the tools to build powerful styles from scratch - it also comes with a set of
predefined scrolls
to help you get started right away. All predefined scrolls follow the same convention: they begin with the prefix
g-
. This makes it easy to distinguish built-in scrolls from the ones you define yourself.
Built-In Animations: Ready When You Are
Grimoire CSS comes loaded with
hundreds of built-in animations
(700+ at the moment). These animations are lightweight and efficient - they are only compiled if you actually use them. To trigger one, simply use its name in either the
animation-name
or
animation
CSS rule. But Grimoire CSS doesn’t stop at just applying animations; it also simplifies the process of adding associated rules.
For example, the predefined scroll
g-anim
allows you to apply an animation and its associated rules at the same time. Here,
g-
is the prefix, and
anim
is a short version of the spell
animation
. With this scroll, you can quickly inject an animation along with the necessary rules - saving time and keeping your styles clean and organized.
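As a rough usage sketch (the animation name and the g-anim targets below are illustrative assumptions, since the exact argument list isn’t spelled out here):

```html
<!-- Hypothetical usage: triggering a built-in animation by name,
     and via the predefined g-anim scroll with escaped spaces. -->
<div class="animation-name=bounce">Plain animation-name spell</div>
<div class="g-anim=bounce_1.5s_ease-in-out_infinite">g-anim scroll</div>
```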
Create Your Own Animations
Even though Grimoire CSS comes packed with animations, it also gives you the power to add your own, seamlessly integrating them into your projects. It’s as simple as creating a new subfolder called
animation
inside the
grimoire
folder, then adding your custom CSS file using the format
<name-of-animation>.css
.
Within that file, you define your animation using
@keyframes
, along with any custom styles. You can also use the class placeholder
GRIMOIRE_CSS_ANIMATION
to add specific styles tied to the animation itself. Let’s take a look at an example with a custom pulse animation:
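The file contents aren’t reproduced here, so this is a minimal sketch of what grimoire/animation/pulse.css might contain; the way the GRIMOIRE_CSS_ANIMATION placeholder is written as a selector, and the keyframe values, are assumptions:

```css
/* grimoire/animation/pulse.css (hypothetical content) */
.GRIMOIRE_CSS_ANIMATION {
  /* styles tied to the animation itself, attached via the class placeholder */
  animation-duration: 1.5s;
  animation-iteration-count: infinite;
}

@keyframes pulse {
  0%, 100% { transform: scale(1); }
  50%      { transform: scale(1.08); }
}
```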
In this example, you’ve defined the pulse animation and set it up with ease using the
GRIMOIRE_CSS_ANIMATION
placeholder. Once this file is in your project, you can invoke the pulse animation as easily as any built-in animation, giving you complete control over custom animations.
In addition to defining scrolls and variables within
grimoire.config.json
, Grimoire CSS allows you to extend your configuration with external JSON files for scrolls and variables. These external JSON files follow the same structure as their corresponding properties in the config file. These files should be stored alongside your main configuration file.
Before generating CSS, Grimoire CSS checks for any external scrolls or variables and merges them into the main config (with the main config taking priority, so external scrolls/variables won’t override your primary configuration settings). This adds flexibility, scalability, and convenience to your workflow.
This feature enables sharing your scrolls/spells independently from your main configuration, as well as using those created by others. For example, you can use the Tailwind CSS implementation via external scrolls. More information about where to share and find scrolls, variables, or complete configurations will be detailed below.
Grimoire CSS isn’t just tied to traditional CSS, JavaScript, or HTML files. The beauty of its
language-agnostic parser
is that it can parse spells from any file or extension. Whether you’re working with
.html
,
.tsx
,
.mdx
, or something else entirely, it can handle it.
This means you’re not limited by file types or formats - you define the
inputPaths
, and Grimoire takes care of the rest. Whether your project is built with React, Vue, or something entirely different, it seamlessly integrates and extracts the styles you need.
Spells in Plain Text with Template Syntax
If you want to use spells outside the traditional
class
or
className
attributes, Grimoire CSS provides a clever solution with its
template syntax
:
g!<spell>;
. This syntax lets you wrap your spell in a template, enabling the parser to collect spells from any text-based content.
Let’s say you have both a classic spell and a templated spell that are essentially the same. Don’t worry - Grimoire CSS is smart enough to combine them into one, as long as it doesn’t affect the CSS cascade. The result? Clean, efficient CSS output like this:
.classic,.templated { /* CSS declaration */}
Template syntax also supports multiple spells in a single template using the
&
symbol as a spells separator:
g!color=violet&display=flex;
. This enables CSS-in-JS-like scenarios in absolutely any file.
This flexibility means you can integrate Grimoire CSS in non-traditional environments, using it across various file types and even in plain text. It’s not just tied to the web - it’s ready for any project, anywhere.
CSS Optimization: Minification, Vendor Prefixes, and Deduplication - All with CSS Cascade in Mind
Grimoire CSS doesn’t just help you manage your styles - it ensures that only the CSS you actually need is generated. No duplicates, no wasted space. Whether it’s shared across multiple projects or inlined for critical loading, Grimoire makes sure your CSS is lean, efficient, and optimized for performance.
Grimoire CSS takes optimization seriously. It generates only the CSS that’s actually used, and it monitors for duplicates right from the start, ensuring no unnecessary styles sneak through. This happens at the very
early stages
of generation, so by the time the process finishes, you’ve got a lean, clean stylesheet.
But it doesn’t stop there. Just take a look:
Minification
: It shrinks your CSS without sacrificing readability or maintainability.
Vendor Prefixes
: Automatically adds necessary prefixes for cross-browser compatibility based on your browserslist configuration:
Uses
.browserslistrc
if it exists in your project
Falls back to ‘defaults’ if no configuration is found
Supports custom browserslist configuration in in-memory mode
Deduplication
: Duplicate CSS? Not here. Grimoire keeps a close watch and ensures that only the needed CSS is generated.
Modern CSS Features
: Automatically transforms modern CSS features for better browser compatibility
All of this happens while preserving the
CSS cascade
- no unintentional overwrites, no broken styles. Just clean, optimized CSS that’s ready for any environment.
Grimoire CSS introduces a comprehensive suite of built-in color manipulation functions, compliant with the CSS Color Module Level 4 specification. These functions enable precise and dynamic color transformations:
g-grayscale(color)
: Converts a color to grayscale by setting its saturation to 0%.
g-complement(color)
: Generates the complementary color by adding 180° to the hue.
g-invert(color_weight?)
: Inverts a color. Optionally, the
weight
parameter controls the intensity of the inversion (default: 100%).
g-mix(color1_color2_weight)
: Blends two colors based on a specified weight (0% - 100%).
g-adjust-hue(color_degrees)
: Rotates the hue of a color by a specified number of degrees (positive or negative).
g-adjust-color(color_red?_green?_blue?_hue-val?_sat-val?_light-val?_alpha-val?)
: Adjusts individual components of a color using delta values for RGB or HSL channels.
g-change-color(color_red?_green?_blue?_hue-val?_sat-val?_light-val?_alpha-val?)
: Sets absolute values for RGB or HSL components.
g-scale-color(color_red?_green?_blue?_sat-val?_light-val?_alpha-val?)
: Scales RGB or HSL components by percentage values (positive to increase, negative to decrease).
g-rgba(color_alpha)
: Updates the alpha (opacity) of a color.
g-lighten(color_amount)
: Increases the lightness of a color by a specified percentage.
g-darken(color_amount)
: Decreases the lightness of a color by a specified percentage.
g-saturate(color_amount)
: Increases the saturation of a color by a specified percentage.
g-desaturate(color_amount)
: Decreases the saturation of a color by a specified percentage.
g-opacify(color_amount)
(Alias:
g-fade-in
): Increases the opacity of a color by a specified amount.
g-transparentize(color_amount)
(Alias:
g-fade-out
): Decreases the opacity of a color by a specified amount.
Example Usage
Usage Rules:
All arguments are positional, and any optional arguments can be omitted if they are not being changed.
Do not include
%
,
deg
, or other units in the values - Grimoire handles these internally.
<div class="bg=g-grayscale(#ff0000)">Grayscale Red Background</div><div class="bg=g-complement(#00ff00)">Complementary Green Background</div><div class="bg=g-invert(#123456)">Fully Inverted Background</div><div class="bg=g-invert(#123456_50)">Partially Inverted Background</div><div class="bg=g-mix(#ff0000_#0000ff_50)">Purple Background</div><div class="bg=g-adjust-hue(#ffcc00_45)">Hue Adjusted Background</div><div class="bg=g-adjust-color(#123456_0_0_12)">Adjust Blue Component</div><div class="bg=g-adjust-color(#123456_0_0_12_5)"> Adjust Blue and Saturation</div><div class="bg=g-change-color(#123456_255_0)">Set Red and Green Components</div><div class="bg=g-change-color(#123456_0_0_0_180)">Set Hue Only</div><div class="bg=g-scale-color(#123456_10_-10)">Scale Red Up, Green Down</div><div class="bg=g-scale-color(#123456_0_0_0_20)">Scale Saturation Up</div><div class="bg=g-rgba(#123456_0.5)">Half Transparent Background</div><div class="bg=g-lighten(#123456_10)">Lightened Background</div><div class="bg=g-darken(#123456_10)">Darkened Background</div><div class="bg=g-saturate(#123456_20)">More Saturated Background</div><div class="bg=g-desaturate(#123456_20)">Less Saturated Background</div><div class="bg=g-opacify(#123456_0.2)">More Opaque Background</div><div class="bg=g-transparentize(#123456_0.2)">More Transparent Background</div>
These functions provide developers with an extensive toolkit for creating vibrant, dynamic, and flexible styles with ease.
Projects
In Grimoire CSS, managing your projects is as flexible as the spells themselves. You define exactly which files need to be parsed (
inputPaths
, supporting glob patterns) and specify where the built CSS should go (
outputDirPath
).
You also have two powerful options for compiling your CSS:
Single Output File
: Where all parsed spells from various files are compiled into a single CSS file.
Individual Output Files
: Where each input file has its own corresponding CSS file.
For single output mode, you’ll just need to define the name of the final CSS file with
singleOutputFileName
. The flexibility here allows you to control the output method depending on your project’s needs. Every project configuration contains a
name
property, and the config can include as many projects as you want. Whether you’re building a single-page application (SPA) or managing multiple projects, Grimoire CSS has you covered.
In essence, the
projects
section of your config is a list of projects, each with its own unique input and output settings. Here’s how that might look:
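The snippet itself isn’t included here, so this is a sketch built from the documented keys (projects, name, inputPaths, outputDirPath, singleOutputFileName); the project names and paths are made up:

```json
{
  "projects": [
    {
      "name": "spa",
      "inputPaths": ["spa/src/**/*.tsx"],
      "outputDirPath": "spa/dist/css",
      "singleOutputFileName": "main.css"
    },
    {
      "name": "static-site",
      "inputPaths": ["site/pages/**/*.html"],
      "outputDirPath": "site/dist/css"
    },
    {
      "name": "docs",
      "inputPaths": ["docs/**/*.mdx"],
      "outputDirPath": "docs/dist/css",
      "singleOutputFileName": "docs.css"
    }
  ]
}
```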
In the
first
and
third
projects, we use the single output mode, where all the spells are compiled into one file. This is ideal for
SPAs
or projects that need consolidated CSS for optimization.
In the
second
project, a static site, each page will have its own CSS file. This approach is perfect for projects where you want isolated styles for different parts of the website, ensuring that each page only loads what it needs.
Projects on Your Terms
Grimoire CSS gives you full control over how you manage and compile your styles. You can configure projects for different output strategies depending on whether you’re building large, single-page applications or static sites with multiple pages. The flexibility to switch between single or multiple output files means you’re never locked into one approach. Grimoire adapts to your needs, not the other way around.
Locking
Grimoire CSS supports a
locking
mechanism for efficient builds. By enabling the
lock
option in
grimoire.config.json
, you can automatically track and clean up outdated built files.
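A minimal sketch, assuming lock is a simple boolean flag at the top level of grimoire.config.json:

```json
{
  "lock": true
}
```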
Grimoire CSS makes it easy to define
shared
and
critical
CSS alongside your project-specific styles, allowing you to optimize how styles are applied across your entire application.
Shared CSS is exactly what it sounds like - a set of styles that you can build into a separate file and reuse across multiple projects or pages in your application. By defining shared styles, you ensure consistency and reduce repetition, improving performance and maintainability.
Critical CSS: Inline for Faster Rendering
Critical CSS goes a step further. It automatically inlines essential styles directly into your HTML files, ensuring that key styles are loaded instantly. And here’s the clever part: if some spells are already used in your components or files, Grimoire won’t regenerate them - because they’re now part of your critical CSS. No duplicates, no unnecessary bloat - just efficient, fast-loading styles.
How It Works
Both the
shared
and
critical
sections of the config are similar in structure. Each has:
styles
: An optional list of styles that are used in the shared or critical configuration. You can include any spells, scrolls, or even paths to existing CSS files. Grimoire will extract and optimize the content during compilation.
cssCustomProperties
: An optional list of custom CSS properties, which gives you the flexibility to define your own properties and pair them with specific elements or themes.
For
shared CSS
, you’ll define an
outputPath
- the file where your shared styles will be stored. For
critical CSS
, you’ll define
fileToInlinePaths
- a list of HTML files (or glob patterns) where these essential styles should be inlined.
Let’s take a look at some examples:
Defining Custom Properties
In the
cssCustomProperties
section, you can define custom properties and their key-value pairs for any DOM elements in your app. Here are the key parts of this configuration:
element
: The optional DOM element associated with the CSS variable (e.g.,
tag
,
class
,
id
, or even
:root
).
dataParam
: The parameter name used in your CSS configuration.
dataValue
: The corresponding value for that parameter.
cssVariables
: A set of CSS variables and their values that will be applied to the element.
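Pulling these pieces together, here is a hypothetical configuration matching the description that follows; the key names (shared, critical, styles, outputPath, fileToInlinePaths, cssCustomProperties, element, dataParam, dataValue, cssVariables) come from the text, while the nesting, paths, and values are assumptions:

```json
{
  "shared": {
    "styles": ["font-size=20px"],
    "outputPath": "dist/shared.css"
  },
  "critical": {
    "styles": [
      "./styles/reset.css",
      "padding=16px",
      "color=#222222",
      "animation-name=fade-in"
    ],
    "fileToInlinePaths": ["about/**/*.html", "blog/**/*.html"],
    "cssCustomProperties": [
      {
        "element": ":root",
        "dataParam": "theme",
        "dataValue": "dark",
        "cssVariables": {
          "--bg": "#111111",
          "--fg": "#eeeeee"
        }
      }
    ]
  }
}
```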
Shared CSS
includes a simple style (
font-size=20px
) and outputs to
shared.css
.
Critical CSS
will be inlined into all HTML files under the
about
and
blog
directories, ensuring essential styles like
reset.css
, padding, colors, and animations load immediately.
Performance By Design: Built for Speed and Efficiency
Grimoire CSS achieves exceptional performance through architectural decisions, algorithmic optimizations, and efficient implementation in Rust. The system is built from the ground up with performance as a core principle:
Single-Pass Processing
: Processing styles in a single efficient pass
Smart Memory Management
: Careful memory handling and efficient data structures minimize resource usage
Optimized File I/O
: Reduced system calls and efficient file handling
Rust Implementation
: Taking advantage of zero-cost abstractions and predictable performance
Grimoire CSS isn’t just fast—it’s blazingly efficient:
Class Processing Speed
: Processes an incredible
~200,000 classes per second
Memory Efficiency
: Handles
~4,000 classes per MB
of memory
Output Optimization
: Generates optimized CSS with minimum overhead
Remember that Grimoire CSS is a complete CSS engine that goes far beyond simple class collection and CSS generation. It handles parsing, optimization, vendor prefixing, project management, and provides powerful features like variables, functions, animations, and component composition—all while maintaining this exceptional performance profile.
Benchmark
Grimoire CSS is lightning-fast and highly efficient. While its absolute performance is unquestionable, side-by-side comparisons often offer better perspective. This benchmark is designed to compare Grimoire CSS and Tailwind CSS by accurately measuring build time, memory usage, CPU load, and output file size.
Overview
The benchmark creates a series of standardized test projects, each containing a large number of HTML files with utility classes for both Grimoire CSS and Tailwind CSS. Then each framework is run to process these projects, and various performance metrics are recorded and analyzed.
Measured Metrics
Build Time
— total time required to process all projects
Class Processing Speed
— number of processed classes per second
Memory Usage
— peak and average memory consumption
Memory Efficiency
— number of processed classes per MB of used memory
I/O Operations
— volume of data read and written
Output File Size
— total size of generated CSS files
Results
When compared to the latest version of Tailwind CSS (v4.x) processing the same workload of 400,000+ classes across 100,000 files, Grimoire CSS demonstrates significant advantages:
| Metric | Grimoire CSS | Tailwind CSS | Difference |
| --- | --- | --- | --- |
| Build Time | 2.10s | 10.58s | 5.0x faster |
| Peak Memory Usage | 111.2 MB | 344.97 MB | 3.1x less memory |
| Average Memory Usage | 45.76 MB | 182.31 MB | 4.0x less memory |
| CPU User Time | 755.11ms | 7.77s | 10.3x less |
| CPU System Time | 1.33s | 60.89s | 45.7x less |
| Class Processing Speed | 190,684 cls/s | 37,824 cls/s | 5.0x faster |
| Memory Efficiency | 3,597 cls/MB | 1,160 cls/MB | 3.1x more efficient |
| Output Size | 5.05 MB | 5.66 MB | 1.1x smaller |
These performance advantages translate into:
Dramatically improved development experience, even on resource-limited machines.
Faster CI/CD pipelines and reduced cloud infrastructure costs.
Efficient scaling for projects of any size.
Reduced energy consumption for more sustainable development.
A Streamlined CLI with a Strict and Straightforward API
Grimoire CSS comes with a minimal but powerful
CLI
(Command Line Interface) that’s designed for simplicity and efficiency. Whether you’re integrating it into your build process or running it manually, the CLI gets the job done without unnecessary complexity.
There are only 3 commands you need to know:
init
: Initializes your Grimoire CSS configuration, either by loading an existing config or generating a new one if none is found. This is your starting point.
build
: Kicks off the build process, parsing all your input files and generating the compiled CSS. If you haven’t already run
init
, the
build
command will handle that for you automatically.
shorten
: Automatically converts all full-length component names in your spells (as defined in your config) to their corresponding shorthand forms. This helps keep your code concise and consistent. Run this command to refactor your files, making your spell syntax as brief as possible without losing clarity or functionality.
Grimoire CSS’s CLI is built for developers who want power without bloat. It’s direct, no-nonsense, and integrates smoothly into any project or bundler.
Easy Migration with Transmutator
Migrating to Grimoire CSS is simple thanks to the Grimoire CSS Transmutator. You can use it as a CLI tool or as a Web UI.
With the CLI, provide paths to your compiled CSS files (or pass raw CSS via a command-line flag).
In the Web UI, either write CSS in the editor and view the JSON output in a separate tab or upload your CSS files and download the transmuted JSON.
In both modes, the Transmutator returns JSON that conforms to the external Scrolls convention by default, so you can immediately leverage your existing CSS classes as Grimoire CSS Scrolls.
You can also run the compiled CSS from Tailwind or any other framework through the Transmutator, include the produced JSON as external scrolls alongside your config, and keep using your existing class names powered by Grimoire CSS.
Grimoire CSS is built to integrate seamlessly into a wide range of ecosystems. It supports both filesystem-based and in-memory operations, making it perfect for traditional web development and dynamic runtime environments. It’s distributed in three ways to give you maximum flexibility:
Single Executable Application
: A standalone binary for those who prefer a direct, no-nonsense approach.
NPM Library
: A Node.js-compatible interface, perfect for JavaScript and web developers.
Rust Crate
: For developers building in Rust or those who want to integrate Grimoire CSS at the system level.
Working Modes
Grimoire CSS offers two primary modes of operation:
Filesystem Mode
(Traditional):
Works with files on disk
Reads input files and writes CSS output to specified locations
Perfect for build-time CSS generation
Uses the standard configuration file approach
In-Memory Mode
:
Processes CSS entirely in memory
No filesystem operations required
Ideal for runtime CSS generation or serverless environments
Accepts configuration and content directly through API
Returns compiled CSS without writing to disk
Example of using In-Memory mode in Rust:
use grimoire_css_lib::{core::ConfigInMemory, start_in_memory};

let config = ConfigInMemory {
    content: Some("your HTML/JS/any content here".to_string()),
    // Optional: provide custom browserslist configuration
    browserslist_content: Some("last 2 versions".to_string()),
    // ... other configuration options
};

let result = start_in_memory(&config)?;
// result contains Vec<CompiledCssInMemory> with your generated CSS
The core of Grimoire CSS is architected entirely in Rust, ensuring top-notch performance and scalability. The main repository compiles both into a standalone executable (SEA) and a Rust crate, meaning you can use it in different environments with ease.
The
grimoire-css-js
takes the core crate and wraps it into a Node.js-compatible interface, which is then compiled into an npm package. Whether you’re working with Rust, Node.js, or need a direct CLI, Grimoire CSS is ready to integrate into your workflow and bring powerful CSS management wherever you need it.
Desk
Here is the Desk. Experiment with spells right here in the web UI version of the Grimoire CSS Playground. Or try the web UI Grimoire CSS Transmutator and convert existing CSS to Grimoire CSS format. Upload CSS files or paste code directly. Download JSON configurations that work immediately with your existing class names.
NOTE: Desk uses an in-memory version of the Grimoire CSS engine, so it does not yet have all the features of the full engine.
Installation
Rust crate:
If you’re using Rust, simply add Grimoire CSS to your Cargo.toml; the crate documentation is available on
docs.rs
.
cargo install grimoire_css
or
cargo add grimoire_css
Single Executable Application (SEA):
Download the binary for your operating system from the
releases page
.
Add the binary to your system’s $PATH (optional for easier usage).
NPM Library:
npm i @persevie/grimoire-css-js
Once installed, you can run the following commands:
Initialize a Grimoire CSS config in your project:
grimoire_css init
or if you are using NPM library:
grimoire-css-js init
Build your CSS using the Grimoire CSS config:
grimoire_css build
or if you are using NPM library:
grimoire-css-js build
The Arcane Circle
Grimoire CSS gives you the freedom to create styles that work exactly the way you want them to - no rigid rules or constraints. Whether you’re crafting dynamic interactions or fine-tuning layouts, Grimoire adapts to your needs, making each step straightforward and rewarding.
So, come join us. Share your work, exchange your thoughts, and help us keep pushing CSS to be more flexible and enjoyable.
The Arcane Circle, or simply the Circle, is a place where you can share your configs, scrolls, variables, components, or UI kits. It’s where you can catch the latest news, follow development, influence the project, and interact with other members of the Circle.
The Circle is currently under development.
The First Member
Hello! My name is
Dmitrii Shatokhin
, and I am the creator of Grimoire CSS. I invented the Spell concept and all the other ideas behind the project. Grimoire CSS is the result of countless hours of work and dedication, and I am proud to have made it open source.
But this is just the beginning. I am committed to the ongoing development of Grimoire CSS and its entire ecosystem - there are many plans and tasks ahead, which I strive to manage transparently on GitHub. My vision is to grow the Arcane Circle community and bring all these ideas to life.
I would be truly grateful for any support you can offer - whether it’s starring the project on GitHub, leaving feedback, recommending it to others, contributing to its development, helping to promote Grimoire CSS, or even sponsoring the project or my work.
Thank you!
Groundbreaking Brazilian Drug, Capable of Reversing Spinal Cord Injury
From the placenta, that tender wrapper of human life, comes a protein that points to a solution for something science had, until now, no clear path to, and certainly none so celebrated: restoring the spinal cord in people who suffered injuries and lost the ability to move.
Brazilian researcher Tatiana Coelho de Sampaio, PhD professor at the Federal University of Rio de Janeiro, has quietly worked with a team of biologists for the past 25 years on the repairing and multiplying power of the protein laminin, which acts on the nervous system.
The studies ultimately produced the current drug, polylaminin, a world first, presented on Tuesday (9) by the Cristália laboratory as capable of regenerating the spinal cord in people whose cords were ruptured in accidents of various kinds, leading to paraplegia—paralysis of the lower limbs—or quadriplegia—paralysis of both lower and upper limbs.
During the drug’s experimental phase, in which it is applied directly to the spine, patients recovered fully, with no aftereffects, and resumed their routines without restrictions.
The first time I learned about UTF-8 encoding, I was fascinated by how well thought out and brilliantly designed it is, able to represent over a million characters from different languages and scripts, and
still be backward compatible with ASCII.
Basically, UTF-8 uses up to 32 bits (one to four bytes) per character while the old ASCII uses 7 bits, but UTF-8 is designed in such a way that:
Every ASCII encoded file is a valid UTF-8 file.
Every UTF-8 encoded file that has only ASCII characters is a valid ASCII file.
Designing a system that scales to over a million characters while remaining compatible with old systems that use just 128 characters is a feat of brilliant design.
Note: If you are already aware of the UTF-8 encoding, you can explore the
UTF-8 Playground
utility that I built to visualize UTF-8 encoding.
How Does UTF-8 Do It?
UTF-8 is a variable-width character encoding designed to represent every character in the Unicode character set, encompassing characters from most of the world's writing systems.
It encodes characters using
one to four bytes
.
The first 128 characters (
U+0000
to
U+007F
) are encoded with a
single byte,
ensuring backward compatibility with ASCII, and this is the reason why a file with only ASCII characters is a valid UTF-8 file.
Other characters require two, three, or four bytes. The
leading bits of the first byte
determine the total number of bytes that represent the current character. These bits follow one of four specific patterns, which indicate how many continuation bytes follow.
| 1st byte pattern | # of bytes used | Full byte sequence pattern |
| --- | --- | --- |
| 0xxxxxxx | 1 | 0xxxxxxx (this is basically a regular ASCII encoded byte) |
| 110xxxxx | 2 | 110xxxxx 10xxxxxx |
| 1110xxxx | 3 | 1110xxxx 10xxxxxx 10xxxxxx |
| 11110xxx | 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
Notice that the second, third, and fourth bytes in a multi-byte sequence
always start with 10
. This indicates that these bytes are continuation bytes, following the main byte.
The remaining bits in the main byte, along with the bits in the continuation bytes, are combined to form the character's code point. A code point serves as a unique identifier for a character in the Unicode character set. A code point is typically represented in hexadecimal format, prefixed with "U+". For example, the code point for the character "A" is
U+0041
.
So here is how software determines characters from UTF-8 encoded bytes (a code sketch follows the steps below):
Read a byte. If it starts with
0
, it's a single-byte character (ASCII). Show the character represented by the remaining 7 bits on the screen. Continue with the next byte.
If the byte didn't start with a
0
, then:
If it starts with
110
, it's a two-byte character, so read the next byte as well.
If it starts with
1110
, it's a three-byte character, so read the next two bytes.
If it starts with
11110
, it's a four-byte character, so read the next three bytes.
Once the number of bytes is determined, read all the remaining bits except the leading bits and combine them into the binary value (a.k.a. the code point) of the character.
Look up the code point in the Unicode character set to find the corresponding character and display it on the screen.
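As a rough sketch of those steps in code (Rust here, purely for illustration; a real decoder must also reject overlong encodings, stray continuation bytes, surrogates, and truncated sequences):

```rust
// A minimal sketch of the decoding steps above. Illustrative only:
// no validation of malformed input is performed.
fn decode_utf8(bytes: &[u8]) -> Vec<u32> {
    let mut code_points = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        // The leading bits of the first byte tell us how many bytes the character uses.
        let (len, first_bits): (usize, u32) = if b & 0b1000_0000 == 0 {
            (1, (b & 0b0111_1111) as u32) // 0xxxxxxx: plain ASCII
        } else if b & 0b1110_0000 == 0b1100_0000 {
            (2, (b & 0b0001_1111) as u32) // 110xxxxx
        } else if b & 0b1111_0000 == 0b1110_0000 {
            (3, (b & 0b0000_1111) as u32) // 1110xxxx
        } else {
            (4, (b & 0b0000_0111) as u32) // 11110xxx
        };
        // Each continuation byte (10xxxxxx) contributes its low 6 bits.
        let mut cp = first_bits;
        for j in 1..len {
            cp = (cp << 6) | (bytes[i + j] & 0b0011_1111) as u32;
        }
        code_points.push(cp);
        i += len;
    }
    code_points
}

fn main() {
    // "अ" is stored as the three bytes 11100000 10100100 10000101.
    let cps = decode_utf8(&[0b1110_0000, 0b1010_0100, 0b1000_0101]);
    assert_eq!(cps, vec![0x0905]);
    println!("{:#X}", cps[0]); // prints 0x905
}
```

Feeding it the three bytes of "अ" yields the code point 0x905, matching the worked example below.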
The Hindi letter "अ" (officially "Devanagari Letter A") is represented in UTF-8 as:
11100000
10100100
10000101
Here:
The first byte
11100000
indicates that the character is encoded using 3 bytes.
The remaining bits of the three bytes:
xxxx0000
xx100100
xx000101
are combined to form the binary sequence
00001001
00000101
(
0x0905
in hexadecimal). This is the code point of the character, represented as
U+0905
.
The code point
U+0905
(
see official chart
) represents the Hindi letter "अ" in the Unicode character set.
Example Text Files
Now that we understood the design of UTF-8, let's look at a file that contains the following text:
1. Text file contains:
Hey👋 Buddy
The text
Hey👋 Buddy
has both English characters and an emoji character on it. The
text file
with this text saved on the disk will have the following
13 bytes
in it:
Let's evaluate this file byte-by-byte following the UTF-8 decoding rules:
Byte
Explanation
01001000
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1001000
represent the letter 'H'. (
open in playground
)
01100101
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1100101
represent the letter 'e'. (
open in playground
)
01111001
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1111001
represent the letter 'y'. (
open in playground
)
11110000
Starts with
11110
, indicating it's the
first byte of a four-byte character.
10011111
Starts with
10
, indicating it's a continuation byte.
10010001
Starts with
10
, indicating it's a continuation byte.
10001011
Starts with
10
, indicating it's a continuation byte.
The bits from these four bytes (excluding the leading bits) combine to form the binary sequence
00001 11110100 01001011
, which is
1F44B
in hexadecimal and corresponds to the code point
U+1F44B
. This code point represents the waving hand emoji "👋" in the Unicode character set.
00100000
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
0100000
represent a whitespace character. (
open in playground
)
01000010
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1000010
represent the letter 'B'. (
open in playground
)
01110101
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1110101
represent the letter 'u'. (
open in playground
)
01100100
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1100100
represent the letter 'd'. (
open in playground
)
01100100
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1100100
represent the letter 'd'. (
open in playground
)
01111001
Starts with
0
, so it's a single-byte ASCII character. The remaining bits
1111001
represent the letter 'y'. (
open in playground
)
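As a quick sanity check, the same byte sequence can be reproduced with TextEncoder; this small TypeScript snippet assumes a runtime with the Encoding API (any modern browser or Node.js).

// Confirm that "Hey👋 Buddy" encodes to exactly 13 UTF-8 bytes.
const heyBytes = new TextEncoder().encode("Hey👋 Buddy");
console.log(heyBytes.length); // -> 13
console.log([...heyBytes].map((b) => b.toString(2).padStart(8, "0")).join(" "));
// -> 01001000 01100101 01111001 11110000 10011111 10010001 10001011
//    00100000 01000010 01110101 01100100 01100100 01111001
console.log("👋".codePointAt(0)!.toString(16).toUpperCase()); // -> "1F44B"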
Now this is a valid UTF-8 file, but it is not a valid ASCII file, because it contains a non-ASCII character (the emoji). Next, let's create a file that contains only ASCII characters.
2. Text file contains:
Hey Buddy
The
text file
doesn't have any non-ASCII characters. The file saved on the disk has the following
9 bytes
in it:
Let's evaluate this file byte-by-byte following the UTF-8 decoding rules:
Byte: Explanation
01001000: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1001000 represent the letter 'H'. (open in playground)
01100101: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1100101 represent the letter 'e'. (open in playground)
01111001: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1111001 represent the letter 'y'. (open in playground)
00100000: Starts with 0, so it's a single-byte ASCII character. The remaining bits 0100000 represent a whitespace character. (open in playground)
01000010: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1000010 represent the letter 'B'. (open in playground)
01110101: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1110101 represent the letter 'u'. (open in playground)
01100100: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1100100 represent the letter 'd'. (open in playground)
01100100: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1100100 represent the letter 'd'. (open in playground)
01111001: Starts with 0, so it's a single-byte ASCII character. The remaining bits 1111001 represent the letter 'y'. (open in playground)
So this is a valid UTF-8 file,
and it is also a valid ASCII file
. The bytes in this file follow both the UTF-8 and ASCII encoding rules. This is how UTF-8 is designed to be backward compatible with ASCII.
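One way to see this in code: the same nine bytes produce the same string whether you interpret them as ASCII character codes or decode them as UTF-8. A small TypeScript illustration (assuming the Encoding API is available):

// The bytes of "Hey Buddy" are simultaneously valid ASCII and valid UTF-8.
const asciiBytes = new Uint8Array([0x48, 0x65, 0x79, 0x20, 0x42, 0x75, 0x64, 0x64, 0x79]);
console.log(new TextDecoder("utf-8").decode(asciiBytes)); // -> "Hey Buddy"
// Every byte is below 0x80, so a plain ASCII interpretation gives the same result:
console.log(String.fromCharCode(...asciiBytes)); // -> "Hey Buddy"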
Other Encodings
I did some quick research on other encodings that are backward compatible with ASCII. There are a few, but none are as popular as UTF-8; one example is GB 18030 (a Chinese government standard). Another is the ISO/IEC 8859 family: single-byte encodings that extend ASCII with additional characters, but each is limited to 256 characters.
The siblings of UTF-8, like UTF-16 and UTF-32, are not backward compatible with ASCII. For example, the letter 'A' in UTF-16 is represented as:
00 41
(two bytes), while in UTF-32 it is represented as:
00 00 00 41
(four bytes).
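A brief illustration of why those zero bytes matter, again assuming the Encoding API: fed to a UTF-8 (or ASCII) decoder, the UTF-16 bytes for 'A' do not simply collapse to 'A'.

// 'A' in UTF-16BE is 00 41. A UTF-8/ASCII decoder keeps the 0x00 as a NUL
// character instead of ignoring it, so the encodings are not interchangeable.
const utf16beA = new Uint8Array([0x00, 0x41]);
console.log(JSON.stringify(new TextDecoder("utf-8").decode(utf16beA))); // -> "\u0000A"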
Bonus: UTF-8 Playground
When I was exploring the UTF-8 encoding, I couldn't find any good tool to interactively visualize how UTF-8 encoding works. So I built
UTF-8 Playground
to visualize and play around with UTF-8 encoding.
Give it a try!
Measles and Polio Down in the Schoolyard | Marsh Family Parody
Paul Simon's 'Me and Julio Down in the Schoolyard' provides a fun vibe for the start of a new school year, as misinformation spreads, vaccination rates plummet, and new outbreaks of measles kill again
The Real Problems with America’s Health
MAHA? How unhealthy is the United States? Some key figures show how the US compares to other countries in life expectancy, early deaths and chronic disease.
European Security Depends Upon Ukraine
It is a mistake to consider Ukrainian security as an issue separate from the construction of a new European security order, where Europe can defend itself without the United States
We Got 18,000 of Jeffrey Epstein’s Emails
Bloomberg’s Jason Leopold and Ava Benny-Morrison with host David Gura on the massive trove of emails — what they tell us about Epstein, his powerful network, and Ghislaine Maxwell
When I launched
Dear Greenpeace
with my fellow youth climate activists alongside WePlanet two years ago, I had no idea just how quickly the anti-nuclear dominoes would fall across Europe.
In 2023, and what seems like a lifetime ago, Austria launched their legal action against the European Commission for the inclusion of nuclear energy in the EU Sustainable Finance Taxonomy. At the time they were supported by a bulwark of EU countries and environmental NGOs that opposed nuclear energy. Honestly, it looked like they might win.
But today, that whole landscape has changed.
Germany, long a symbol of anti-nuclear politics, is beginning to shift. The nuclear phase-outs or bans in the Netherlands, Belgium, Switzerland, Denmark, and Italy are now history. Even Fridays for Future has quietened its opposition, and in some places, embraced nuclear power.
This moment matters.
It shows what’s possible when we stick to the science. The evidence only gets clearer by the day that nuclear energy has an extremely low environmental impact across its lifecycle, and strong regulations and safety culture ensure that it remains one of the safest forms of energy available to humanity.
The European Court of Justice has now fully dismissed Austria’s lawsuit. That ruling doesn’t just uphold nuclear energy’s place in EU green finance rules. It also signals a near-certain defeat for the ongoing Greenpeace case – the very lawsuit that inspired me to launch Dear Greenpeace in the first place.
But instead of learning from this, Greenpeace is doubling down. Martin Kaiser, Executive Director of Greenpeace Germany, called the court decision “a dark day for the climate”.
Let that sink in. The highest court in the EU just reaffirmed that nuclear energy meets the scientific and environmental standards to be included in sustainable finance, and Greenpeace still refuses to budge.
Meanwhile, the climate crisis gets worse. Global emissions are not falling fast enough. Billions of people still lack access to clean, reliable electricity. And we are forced to spend time defending proven solutions instead of scaling them.
Announcing our inclusion in the case between Greenpeace and the EU Commission
It’s now up to the court whether we will get our time in court to outline the evidence in support of nuclear energy and the important role it can play in the global clean energy transition. Whether in court, on the streets, or in the halls of parliaments across the globe, we will be there to defend the science and ensure that nuclear power can spread the advantages of the modern world across the planet in a sustainable, reliable and dignified way.
Austria stands increasingly isolated among a handful of countries that still cling to their opposition to nuclear energy. Their defeat in this vital high stakes topic is a success not just for the nuclear movement, but for the global transition as a whole.
We have made real progress. Together, we’ve helped defend nuclear power in the EU, overturned outdated policies at the World Bank, and secured more technology-neutral language at the UN. These wins are not abstract. They open the door to real investment, real projects, and real emissions cuts.
But the work is not done.
We still need to overturn national nuclear bans, unlock more funding, and push democratic countries to support clean energy development abroad: especially where it is most needed to compete with Russia’s growing influence.
The fight will not be done until every single country in the world can boast a clean, reliable energy grid, ready to maintain a modern dignified standard of living, for everyone, everywhere.
This is a great success for the movement and it would not have been possible without the financial support, time and energy given by people like you.
In Solidarity,
Ia Aanstoot
An embarrassing failure of the US patent system: Nintendo's latest patents
The last 10 days have brought a string of patent wins for Nintendo. Yesterday, the company was granted
US patent 12,409,387
, a patent covering riding and flying systems similar to those Nintendo has been criticized for claiming in its Palworld lawsuit (via
Gamesfray
). Last week, however, Nintendo received a more troubling weapon in its legal arsenal:
US patent 12,403,397
, a patent on summoning and battling characters that the United States Patent and Trademark Office granted with alarmingly little resistance.
According to videogame patent lawyer Kirk Sigmon, the USPTO granting Nintendo these latest patents isn't just a moment of questionable legal theory. It's an indictment of American patent law.
"Broadly, I don't disagree with the many online complaints about these Nintendo patents," said Sigmon, whose opinions do not represent those of his firm and clients. "They have been an embarrassing failure of the US patent system."
Sigmon, who we spoke with last year about the
claims and potential consequences of Nintendo's Palworld lawsuit
, said both this week's '387 patent and last week's '397 represent procedural irregularities in the decisionmaking of US patent officials. And thanks to those irregularities, Nintendo has yet more tools to bully its competitors.
The '387 patent granted this week, Sigmon told PC Gamer, "got a bit of push-back, but barely." After its initial application was deemed invalid due to similarities to existing Tencent and Xbox-related patents, Nintendo amended its claims based on interviews with the USPTO, which then determined that the claims were allowable "for substantially the same reasons as parent application(s)."
"That parent case," Sigmon said, "had an even weirder and much less useful prosecution history."
Most of the claims made in the '387 patent's single parent case,
US Pat. No. 12,246,255
, were immediately allowed by the USPTO, which Sigmon said is "a
very
unusual result: most claims are rejected
at least
once." When the claims were ultimately allowed, the only reasoning the USPTO offered was a block quote of text
from the claims themselves.
"This seems like a situation where the USPTO essentially gave up and just allowed the case, assuming that the claims were narrow or specific enough to be new without evaluating them too closely," Sigmon said. "I strongly disagree with this result: In my view, these claims were in no way allowable."
To Sigmon, an IP attorney with extensive experience in prosecuting and teaching patent law, the '387 patent and its parent case rely on concepts and decisions that would have been obvious to a "Person of Ordinary Skill in the Art"—a legal construct that holds if a patent's claims would reasonably occur to a practitioner in the relevant field based on prior art, those claims aren't patentable.
The '397 patent granted last week is even more striking. It's a patent on summoning and battling with "sub-characters," using specific language suggesting it's based on the
Let's Go! mechanics
in the Pokémon Scarlet and Violet games. Despite its relevance to a conceit in countless games—calling characters to battle enemies for you—it was allowed without any pushback whatsoever from the USPTO, which Sigmon said is essentially unheard of.
"Like the above case, the reasons for allowance don't give us even a hint of why it was allowed: the Examiner just paraphrases the claims (after block quoting them) without explaining
why
the claims are allowed over the prior art," Sigmon said. "This is extremely unusual and raises a large number of red flags."
According to Sigmon, USPTO records show that the allowance of the '397 patent was based on a review of a relatively miniscule number of documents: 16 US patents, seven Japanese patents, and—apparently—one article from
Pokemon.com
.
"I have no earthly idea how the Examiner could, in good faith, allow this application so quickly," Sigmon said.
Admittedly, the '397 case was originally filed as a Japanese patent application, which
would
allow the Examiner to use the existing progress in the Japanese case as a starting point for their review. But, Sigmon said, "even that doesn't excuse this quick allowance."
"This allowance should not have happened, full stop," he said.
On paper, the patent might not seem like a threat to Nintendo's competitors: The claims as constructed in the '397 outline a very specific sequence of events and inputs, and patent claims must be met word-for-word to be infringed.
"Pragmatically speaking, though, it's not impossible to be sued for patent infringement even when a claim infringement argument is weak, and bad patents like this cast a massive shadow on the industry," Sigmon said.
For a company at Nintendo's scale, the claims of the '397 patent don't need to make for a strong argument that would hold up in court. The threat of a lawsuit can stifle competition well enough on its own when it would cost millions of dollars to defend against.
"In my opinion, none of the three patents I've discussed here should have been allowed. It's shocking and offensive that they were," Sigmon said. "The USPTO dropped the ball big time, and it's going to externalize a lot of uncertainty (and, potentially, litigation cost) onto developers and companies that do not deserve it."
Sigmon, who says he's helped inventors protect their inventions from IP theft perpetrated by major companies, insists that the patent system still has merit. "That's the kind of thing that patents are meant to do," he said. "They were not made to allow a big player to game the system, get an overly broad patent that they should have never received in the first place, and then go around bullying would-be competition with the threat of a legally questionable lawsuit."
Unfortunately, Nintendo has gained these patents at a moment when the USPTO has made challenging bad patents more difficult. Currently, US patent officials under USPTO Acting Director Coke Morgan Stewart have been
refusing to hear a huge number of Inter Partes Review cases
—special proceedings in which parties can argue that a patent should never have been granted—for "discretionary" reasons.
"Realistically, this means that patent validity issues are being relegated to lawsuits: not a good situation, as that often entails millions of dollars in costs and a lot of risk," Sigmon said. "In practice, this means that bad patents get to fester on the market for longer and provide a bigger threat for the industry as a whole."
Lincoln has been writing about games for 11 years—unless you include the essays about procedural storytelling in Dwarf Fortress he convinced his college professors to accept. Leveraging the brainworms from a youth spent in World of Warcraft to write for sites like Waypoint, Polygon, and Fanbyte, Lincoln spent three years freelancing for PC Gamer before joining on as a full-time News Writer in 2024, bringing an expertise in Caves of Qud bird diplomacy, getting sons killed in Crusader Kings, and hitting dinosaurs with hammers in Monster Hunter.
What Happens After I'm Gone? The Future of the Online Me
Let me start off by saying that I am doing well and in good health! This post is focused more on the concept of “future-proofing” my open source projects and stupid little mini-sites once I am no longer around to keep the wheels turning, so to speak. It’s something that most of us probably don’t think about. That makes sense, since it isn’t the most joyful thing to focus on. But it does impact our individual “homes” on the internet, no matter how small.
So, I decided to write up my
current
online fail-safes, along with my plans for keeping most things running smoothly after I’m gone.
Let’s get this out of the way first. The internet doesn’t matter compared to
real-life
. Obviously if all my projects / sites disappeared from the web tomorrow it wouldn’t be a big deal
at all
. Family, friends, and those directly impacting your life should always take precedence over online communities (even if those communities are awesome!). Before you consider wasting any effort future-proofing your online “stuff”, take the time to
write up a will
. It’s worth the cost (heck, even online services exist for this now) and once complete it will allow you to focus on more stupid things, like your online stuff!
If you take away one thing from this post, it should be to get yourself a will.
With that out of the way, let’s move on to the less important things in life!
Services
Domains and Web Hosting
I believe most domain owners and web masters don’t think much about their custom domains or hosting that often. I mean, they
think
about them but not how to handle their management once they, as the sole owners, pass on. If you own any domains or use some form of web hosting, ask yourself the following:
Is auto-renew setup? If so, will the payment source ever fail / expire?
Are notifications setup? If so, will anyone else have access to these?
Is there an emergency contact?
Are there any advanced settings (DNS, forwarding, etc.) that should be documented somewhere?
Shared login details?
What happens if the current company shutters? Is there a migration plan?
Domains are critical for long-running projects or even personal sites that represent you as a person. You don’t want renewals to lapse and have your domains scooped up by online scalpers. They could hold the domain hostage or possibly use it for nefarious purposes (sending out malicious content or collecting user data).
As for hosting, the web moves fast and you can’t assume anything will last forever. What you
can
do is place your bets on providers that have predictable longevity. The two hosting providers I always recommend are
NearlyFreeSpeech.NET
(NFSN) and
RamHost
VPS. NFSN is more in line with your standard "shared hosting" setup, whereas RamHost is a VPS with full root access.
Reasons why I like NearlyFreeSpeech.NET:
Anyone can add funds ("contributions") to your account. This is extremely helpful to keep the balance "topped up" while waiting for an account transfer / takeover
Aligns well with free speech core values (vital for an open web)
Online since 2002
Great community, helpful members in the forums
Optional paid support (I’ve never needed, but nice to have)
No bullshit UI, gets out of your way
Bonus: runs on FreeBSD
Even
jdw
(founder/owner of NFSN) has previously talked about the
bus factor
in the
members forum
(2024), in which he responds to talk of retirement:
Not to worry, you’ve got a good 20 years before that’s a remotely realistic possibility.
That’s solid enough for me.
Reasons why I like
RamHost
:
Supports OpenBSD (and many others)
Online since 2009
No bullshit UI, gets out of your way
Costs just
$15/year
for 384 MB RAM / 10 GB Storage / 500 GB Traffic
The only downside is that there is no direct “contributions” funding for your account. Fortunately, you can add multiple points of contact that can be notified about upcoming renewals. Their support is quite good as well, so I would be hopeful they would provide assistance for non-techy individuals taking over an existing account.
I use RamHost for just a few of my projects and mini-sites that need to be running on top of OpenBSD. For most people, NFSN would be the easier option.
Email
This one is a little more difficult. From my research I have yet to find an email provider which allows for a similar “account contribution” setup similar to that of NFSN. You
could
fall back to a "free" service like Gmail and utilize NFSN's
email forwarding service
. But Gmail is pretty terrible (for many reasons).
It seems the best option is to simply provide your executor with direct login access to your email registrar. Some providers I recommend are:
Migadu
Tutanota
Proton
Mailbox.org
I myself don’t have many “social” accounts online, but most people do. There isn’t much of a
plan
for continuing to interact with social network communities, nor should there be since that would be creepy. The best practice would be to simply notify online friends and followers about what happened, along with plans to eventually shutdown these accounts.
Passwords
All of these above services
should
have their own secure logins. Expecting loved ones to memorize all of your individual passwords is a lot to ask, so it makes sense to utilize a password manager. Any will do (Bitwarden, 1Password,
pass
) just so long as the master password is handed over to the one in charge of the “digital you” after you’re gone. That will make things much easier when tackling items such as web hosting and domain configuration.
It’s important to note that some logins require, or were maybe were setup with, authentication services. Make sure access to these authentication codes is provided as well for those taking over.
Open Source Projects
Most open source projects that provide any form of public access or version control should be fairly future-proof by default. If you self-host a git forge, just be sure to have at least one mirror of your projects on a public git service. I personally recommend
Codeberg
. That way, anyone can fork and tweak and re-share your awesome work.
1
End on a Happy Note
I apologize if this post made you more aware of your own mortality - that was not my intention! At the end of the day, online
stuff
doesn’t really matter and you don’t need to preserve anything if you don’t feel like it. I just think it would be nice to keep your online lights on when all other lights go out.
2
Life is wonderful and you should enjoy every moment of it!
China Didn’t Want You to See This Video of Xi and Putin. So Reuters Deleted It.
Intercept
theintercept.com
2025-09-12 18:45:06
Following a copyright takedown request, Reuters removed a hot mic video of Putin and Xi discussing life extension and immortality.
The post China Didn’t Want You to See This Video of Xi and Putin. So Reuters Deleted It. appeared first on The Intercept....
When two world
leaders were caught on a hot mic having a bizarre conversation about living forever, the news agency Reuters realized it was a big story.
Reuters reported on and aired the footage of Russia’s Vladimir Putin and China’s Xi Jinping discussing organ transplantation as a means of life extension and perhaps immortality during a September 3 Victory Day Parade in China, a procession celebrating the end of the Second Sino-Japanese War.
But two days later, Reuters yanked the video off its website, retracted the footage from its wire service, and erased clips from its social media feeds.
The reason: a takedown letter from China Central Television, China's state-controlled television network, which had licensed footage of the event to Reuters.
Last Friday, CCTV lawyer HE Danning wrote to Reuters demanding the video be taken down. “The editorial treatment applied to this material has resulted in a clear misrepresentation of the facts and statements contained within the licensed feed.”
Reuters, whose parent company Thomson Reuters conducts a variety of business operations in China, complied.
“Reuters removed the video from its website and issued a ‘kill’ order to its clients on Friday,” the media company
wrote
in a statement published on its website, explaining its decision to withdraw the footage from a portal used by other news organizations that rely on Reuters as a wire service.
The initial Reuters
article
about the hot mic moment now contains a note that “story has been corrected to withdraw videos, with no changes to text.”
Reuters didn’t just remove the full four-minute event video from its systems, but also a 38-second annotated clip of the exchange that it had previously posted across its social media platforms, including
TikTok
,
Facebook
, and
LinkedIn
. A Reuters World News
podcast episode
, which features a short audio clip of the exchange, is still online.
Footage of the event remains online elsewhere — but not all clips capture the conversation between Xi and Putin as clearly as the Reuters recording. A
version
of the event footage on CCTV’s official YouTube channel includes audio of an announcer speaking and music playing, obscuring the conversation about life extension between Xi and Putin.
In the version of the video Reuters posted to TikTok (and later deleted), Xi and Putin stroll around like old chums as they discuss, through translators, “immortality in a conversation caught on a hot mic,” as Reuters summarized in the opening title card.
During the conversation, as seen in the Reuters clip, Xi says: “In the past people rarely lived longer than 70 years, but today they say that at 70 you are still a child.”
“Human organs can be continuously transplanted. The longer you live, the younger you become and even achieve immortality.”
The Russian state-funded outlet RT later
posted
Bloomberg’s
version of the video
, which remains online and features a similar translation of Xi’s remarks over the same 38-second sequence, which Bloomberg credits to CCTV’s “live transmission” of the parade; RT’s thread also featured an English-dubbed video of Putin confirming the exchange at a press conference.
Putin responds, “Human organs can be continuously transplanted. The longer you live, the younger you become and even achieve immortality.” Xi then says, “Some predict this century humans may live up to 150 years old.”
In a statement, Reuters expressed that they “stand by the accuracy of what we published” and that “we have carefully reviewed the published footage, and we have found no reason to believe Reuters longstanding commitment to accurate, unbiased journalism has been compromised.”
“Reuters withdrew these videos because it no longer held the legal permission to publish this copyrighted material, and as a global news agency, we are committed to respecting the intellectual property rights of others,” Reuters spokesperson Heather Carpenter told The Intercept.
Thomson Reuters, headquartered in Toronto, engages in an assortment of
business ventures in China
, such as an AI-based legal “
co-counsel
” bot, “
global trade solutions
,” and legal research on Chinese law through its Westlaw product. The company maintains several offices in China, including in Shanghai, Beijing, and a Reuters news bureau in
Shenzhen
. The news organization is currently hiring for a
researcher
position at its Beijing bureau.
Reuters did not respond specifically when asked if its business interests played any role in complying with the removal request.
This isn’t the first time Reuters has taken down content at the behest of international authorities. In 2023, Reuters published an
exposé
about the Indian cyber-espionage firm Appin. An Indian court deemed the article to be “indicative of defamation” and ordered that the article be removed. As the Freedom of the Press Foundation
highlighted
, even though Indian courts don’t have jurisdiction outside of India, Reuters removed the article not just in India but also worldwide. Once the injunction expired, Reuters reinstated the article.
Seth Stern, director of advocacy at Freedom of the Press Foundation, said Reuters’ decision to remove the video is a blow to press freedom at a critical juncture.
“International news outlets have a responsibility to uphold press rights internationally, especially in times like these where press freedom is backsliding almost everywhere. Otherwise, journalism’s independence sinks to the lowest common denominator whenever news of global importance breaks in a country governed by a repressive regime.”
He cautioned that compliance with takedown requests is a slippery slope.
“What makes [Reuters] think the next censorial regime that might not like what it prints isn’t taking notes?” he asked.
Even if you’ve been doing JavaScript for a while, you might be surprised to learn that
setTimeout(0)
is not really
setTimeout(0)
. Instead, it could run 4 milliseconds later.
Nearly a decade ago when I was on the Microsoft Edge team, it was explained to me that browsers did this to avoid “abuse.” I.e. there are a lot of websites out there that spam
setTimeout
, so to avoid draining the user’s battery or blocking interactivity, browsers set a special “clamped” minimum of 4ms.
This also explains why some browsers would bump the throttling for devices on battery power (
16ms
in the case of legacy Edge), or throttle even more aggressively for background tabs (
1 second
in Chrome!).
One question always vexed me, though: if
setTimeout
was so abused, then why did browsers keep introducing new timers like
setImmediate
(
RIP
), Promises, or even new fanciness like
scheduler.postTask()
? If
setTimeout
had to be nerfed, then wouldn’t these timers suffer the same fate eventually?
I wrote a long post about JavaScript timers
back in 2018
, but until recently I didn’t have a good reason to revisit this question. Then I was doing some work on
fake-indexeddb
, which is a pure-JavaScript implementation of the IndexedDB API, and this question reared its head. As it turns out, IndexedDB wants to auto-commit
transactions
when there’s no outstanding work in the event loop – in other words, after all microtasks have finished, but before any tasks (can I cheekily say “macro-tasks”?) have started.
To accomplish this,
fake-indexeddb
was using
setImmediate
in Node.js (which shares some similarities with the legacy browser version) and
setTimeout
in the browser. In Node,
setImmediate
is kind of perfect, because it runs
after microtasks but immediately before any other tasks
, and without clamping. In the browser, though,
setTimeout
is pretty sub-optimal:
in one benchmark
, I was seeing Chrome take 4.8 seconds for something that only took 300 milliseconds in Node (a 16x slowdown!).
Looking out at the timer landscape in 2025, though, it wasn’t obvious what to choose. Some options included:
setImmediate
– only supported in legacy Edge and IE, so that’s a no-go.
MessageChannel.postMessage
– this is the technique used by
afterframe
.
window.postMessage
– a nice idea, but kind of janky since it might interfere with other scripts on the page using the same API. This approach is used by
the
setImmediate
polyfill
though.
scheduler.postTask
– if you read no further, this was the winner. But let's explain why! (A rough fallback sketch follows below.)
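As a rough illustration of what that choice can look like in practice, here is a hedged TypeScript sketch (not the actual fake-indexeddb code); the helper name postUnclampedTask is just a placeholder, and the feature detection assumes a browser-like environment.

// Hypothetical helper: queue a macrotask without setTimeout's 4 ms clamp.
// Prefers scheduler.postTask where it exists, otherwise uses MessageChannel.
function postUnclampedTask(callback: () => void): void {
  const sched = (globalThis as any).scheduler;
  if (sched && typeof sched.postTask === "function") {
    sched.postTask(callback); // default 'user-visible' priority
    return;
  }
  const { port1, port2 } = new MessageChannel();
  port1.onmessage = () => callback();
  port2.postMessage(undefined);
}

// Usage: runs after microtasks, without the 4 ms penalty of setTimeout(0).
postUnclampedTask(() => console.log("ran as an unclamped task"));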
To compare these options, I wrote
a quick benchmark
. A few important things about this benchmark:
You have to run several iterations of
setTimeout
(and friends) to really suss out the clamping. Technically, per the
HTML specification
, the 4ms clamping is only supposed to kick in after a
setTimeout
has been nested (i.e. one
setTimeout
calls another) 5 times.
I didn’t test every possible combination of 1) battery vs plugged in, 2) monitor refresh rates, 3) background vs foreground tabs, etc., even though I know all of these things can affect the clamping. I have a life, and although it’s fun to don the lab coat and run some experiments, I don’t want to spend my entire Saturday doing that.
In any case, here are the numbers (in milliseconds, median of 101 iterations, on a 2021 16-inch MacBook Pro):
Browser: setTimeout / MessageChannel / window.postMessage / scheduler.postTask
Chrome 139: 4.2 / 0.05 / 0.03 / 0.00
Firefox 142: 4.72 / 0.02 / 0.01 / 0.01
Safari 18.4: 26.73 / 0.52 / 0.05 / Not implemented
Don’t worry about the precise numbers too much: the point is that Chrome and Firefox clamp
setTimeout
to 4ms, and the other three options are roughly equivalent. In Safari, interestingly,
setTimeout
is even more heavily throttled, and
MessageChannel.postMessage
is a tad slower than
window.postMessage
(although
window.postMessage
is still janky for the reasons listed above).
This experiment answered my immediate question:
fake-indexeddb
should use
scheduler.postTask
(which I prefer for its ergonomics) and fall back to either
MessageChannel.postMessage
or
window.postMessage
. (I did experiment with different
priorities
for
postTask
, but they all performed
almost identically
. For
fake-indexeddb
‘s use case, the default priority of
'user-visible'
seemed most appropriate, and that’s what the benchmark uses.)
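For reference, trying the different priorities might look something like the sketch below; this is an assumption-laden illustration rather than the benchmark code itself, and it presumes a browser where scheduler.postTask exists.

// Sketch: timing the three Scheduler API priorities. In the author's
// benchmark they performed almost identically for this workload.
async function tryPriorities(): Promise<void> {
  const sched = (globalThis as any).scheduler; // Chrome-only as of writing
  for (const priority of ["user-blocking", "user-visible", "background"]) {
    const start = performance.now();
    await sched.postTask(() => {}, { priority });
    console.log(`${priority}: ${(performance.now() - start).toFixed(2)} ms`);
  }
}

tryPriorities();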
None of this answered my original question, though: why exactly
do
browsers bother to throttle
setTimeout
if web developers can just use
scheduler.postTask
or
MessageChannel
instead? I asked my friend
Todd Reifsteck
, who was co-chair of the
Web Performance Working Group
back when a lot of these discussions about
“interventions”
were underway.
He said that there were effectively two camps: one camp felt that timers needed to be throttled to protect web devs from themselves, whereas the other camp felt that developers should “measure their own silliness,” and that any subtle throttling heuristics would just cause confusion. In short, it was the standard tradeoff in designing performance APIs: “some APIs are quick but come with footguns.”
This jibes with my own intuitions on the topic. Browser interventions are usually put in place because web developers have either used too much of a good thing (e.g.
setTimeout
), or were blithely unaware of better options (the
touch listener controversy
is a good example). In the end, the browser is a “user agent” acting on the user’s behalf, and the W3C’s
priority of constituencies
makes it clear that end-user needs always trump web developer needs.
That said, web developers often
do
want to do the right thing. (I consider this blog post an attempt in that direction.) We just don’t always have the tools to do it, so instead we grab whatever blunt instrument is nearby and start swinging. Giving us more control over tasks and scheduling could avoid the need to hammer away with
setTimeout
and cause a mess that calls for an intervention.
My prediction is that
postTask
/
postMessage
will remain unthrottled for the time being. Out of Todd’s two “camps,” the very existence of the
Scheduler API
, which offers a whole slew of fine-grained tools for task scheduling, seems to point toward the "pro-control" camp as the one currently steering the ship. Todd, though, sees the API more as a compromise between the two groups: yes, it offers a lot of control, but it also aligns with the browser's actual rendering pipeline rather than random timeouts.
The pessimist in me wonders, though, if the API could still be abused – e.g. by carelessly using the
user-blocking
priority
everywhere. Perhaps in the future, some enterprising browser vendor will put their foot more firmly on the throttle (so to speak) and discover that it causes websites to be snappier, more responsive, and less battery-draining. If that happens, then we may see another round of interventions. (Maybe we’ll need a
scheduler2
API to dig ourselves out of that mess!)
I’m not involved much in web standards anymore and can only speculate. For the time being, I’ll just do what most web devs do: choose whatever API accomplishes my goals today, and hope that browsers don’t change too much in the future. As long as we’re careful and don’t introduce too much “silliness,” I don’t think that’s a lot to ask.
Thanks to Todd Reifsteck for feedback on a draft of this post.
When a legal takedown request arrives, whether it’s about copyright,
censorship, privacy, or something more vague, how a Free and Open Source
Software (FOSS) project responds can make all the difference.
Handled well, a takedown request can be a manageable administrative step. Handled poorly, it can cause panic, disrupt infrastructure, or even put contributors at legal risk.
As part of our legal resilience research, we spoke with a range of legal experts, software freedom advocates, and maintainers of mature FOSS infrastructure to understand how others manage these moments. In this article, we share what we learned, and how F-Droid is incorporating these lessons into its own approach.
A Pattern Emerges
Despite differences in jurisdiction, size, and mission, a few common themes
from our research emerged when we asked how other projects handle takedown
requests:
1. Don’t Be a Soft Target
Legal threats often follow the path of least resistance. FOSS projects that
publish a formal takedown policy, require legal submissions through specific
channels, and insist on a valid legal basis are much less likely to receive,
or comply with, vague or harassing demands.
One FOSS organization, for example, requires all legal correspondence to be submitted by postal mail, written in the national language, and citing local law. Most complaints evaporate once complainants are asked to meet these requirements.
2. Creating a transparent and documented process
Several digital rights organizations advised setting up structured response
steps:
Require submissions to a dedicated legal@ or abuse@ email.
Insist on full documentation: legal basis, jurisdiction, evidence of the
infringement and identity of the complainant.
Review for sufficiency, proportionality, and standing before acting.
This creates proper documentation to process valid claims, while protecting projects from illegitimate or unfounded requests.
3. Use Jurisdiction Strategically
Projects based in civil law jurisdictions, particularly in Europe, are often
better positioned to deflect legal demands from foreign entities. Several
organizations emphasized that complying with vague or extrajudicial
requests, especially those originating outside your jurisdiction, can
increase risk unnecessarily. Instead, they recommended requiring a valid
legal basis grounded in the project’s home country. Formal legal processes,
such as court orders or official government channels, were seen as the
appropriate threshold, not informal emails or unverifiable demands.
Notification and Appeals: Fairness and Transparency
All of the projects we consulted emphasized the importance of notifying
developers whose apps are being targeted, informing them (if possible) of
the seriousness of the claim, and the proposed strategy F-Droid is taking to
handle the claim.
If a threat is deemed to be valid and a developer’s content is flagged for
takedown:
The developer or maintainer is informed, unless prohibited by law (gag
orders).
A window for response (commonly 14 days) is offered, unless this is unfeasible due to the seriousness and time constraints of the request itself.
If the developer disputes the claim and provides supporting information
(e.g. license, public domain status, fair use justification), the claim is
reviewed.
If the claim is upheld, the content is removed, but always with an
internal record and opportunity to appeal.
Transparency, Censorship, and What You Can (Legally) Publish
Takedown requests occupy a complex space between legal enforcement and
censorship. While some are legitimate claims, like copyright violations or
privacy breaches, others are vague, politically motivated, or intended to
silence dissent. For FOSS projects that have a global user base, it’s not
always obvious how to respond. Complying too quickly can reinforce
censorship practices; resisting without process can lead to full website
shut downs,
domain names being taken
away
(as in the US) or
large and costly legal
battles
.
One strategy that helps balance this tension is radical
transparency. Several projects we spoke with emphasized the importance of
documenting what actions were taken and why, not just for accountability,
but as a form of resistance. A well-known example is GitHub’s DMCA takedown
policy (as of July 2025), which mandates compliance with valid takedown
requests, but also posts each one publicly in their
github/dmca
repository
. The result: potential abusers
know their requests will face public scrutiny, which acts as a deterrent.
However, not all jurisdictions allow this kind of transparency. In India,
for example, we were informed that it is often illegal to disclose that you
have received a government request, even to the developer of the affected
app. In contrast, in Russia, takedown requests can often be
legally
posted
, though by doing so you may be
putting yourself at risk for retaliation, additional takedown requests and
legal troubles.
With that in mind, some best practices for FOSS projects include:
Publishing biannual transparency reports, even if redacted or aggregated.
Maintaining an internal log of all takedown activity, with public
disclosure where legally possible.
Explaining the general reasons for content removals, who made the request,
under what law, and what action was taken, unless legally prohibited.
Being explicit about what cannot be shared, and why.
Transparency won’t prevent all forms of censorship, but it can slow them
down, raise awareness, and provide a record that strengthens the broader
FOSS ecosystem.
What We’re Doing at F-Droid
F-Droid is revising its own takedown policy, informed by:
Dutch law and EU regulations
The structural support provided by The Commons Conservancy
Practical lessons from long-standing FOSS organizations
Our draft process includes:
A written takedown request submitted to legal@f-droid.org, including the required information:
Identify the specific material in question (e.g. app name)
Include a valid legal basis under the applicable jurisdiction (e.g. copyright law, court order, statutory basis)
Indicate jurisdiction in which the legal basis is claimed to apply
Include sufficient evidence of the alleged infringement (e.g. copyright
certificate, ownership declaration)
Clearly state that the complainant is authorized to act on behalf of the rights holder
Include full contact details and a verifiable identity (subject to
exceptions, such as gag orders or whistleblower protection)
Verification of jurisdiction and legal basis, including evidence
Developer notification and appeal procedures
Requests lacking documentation or legal authority may be rejected or ignored
Biannual transparency reports and public tracking of takedown requests
We’re also working to improve contributor education about potential exposure
when contributing to F-Droid, document internal escalation paths, and ensure
consistent handling of international claims.
Final Thoughts
Takedown requests are not going away; in fact, they're becoming more frequent
and more complex. But FOSS projects don’t have to face them unprepared.
By building processes, establishing clear jurisdiction, and protecting individuals through structure and policy, we can handle these challenges with the seriousness they deserve without letting them derail our mission.
Legal Disclaimer
The content provided in this article is for informational purposes only and
does not constitute legal advice. While we strive to provide accurate and
up-to-date information, F-Droid makes no representations or warranties of
any kind, express or implied, about the completeness, accuracy, or
suitability of the information contained herein.
F-Droid is not a law firm and does not offer legal services. Any reliance
you place on the information provided is strictly at your own risk. If you
have questions about legal obligations, rights, or compliance, we strongly
recommend consulting a qualified legal professional familiar with your
jurisdiction.
F-Droid and its contributors disclaim all liability for any loss or damage
arising from the use or misuse of this content.
Bolsonaro Joins a Rogues’ Gallery of Coup Plotters
Portside
portside.org
2025-09-12 18:20:13
Bolsonaro Joins a Rogues’ Gallery of Coup Plotters
barry
Fri, 09/12/2025 - 13:20
...
Former Brazilian President Jair Bolsonaro has been sentenced to 27 years in prison for his attempted coup, image: screen grab
Jair Bolsonaro’s
conviction on Sept. 11, 2025
, puts the former Brazilian president in a rogues’ gallery of failed coup plotters to be held to account for their attempted power grab.
Brazil’s Supreme Court
found Bolsonaro guilty
of being part of an armed criminal organization and other counts relating to a
coup plot to overturn
the ex-president’s 2022 election defeat to Luiz Inácio Lula da Silva. Prosecutors had earlier argued that Bolsonaro and others discussed a
scheme to assassinate Lula
and
incited a riot on Jan. 8, 2023
, in hopes that Brazil’s military would intervene and return Bolsonaro to power.
Four of the five justices on the panel voted to convict. Justice Cármen Lúcia, who was among the majority,
said that the right-winger
acted “with the purpose of eroding democracy and institutions.” Sentenced to 27 years and three months behind bars, Bolsonaro is expected to appeal the verdict.
Not all coup plotters are held accountable for their actions. And even for those, like Bolsonaro, who are – it doesn’t necessarily mark the end of their political ambitions.
Coup and punishments
Plotting a coup is risky business. Some of those who attempt to seize or usurp power unconstitutionally are killed during their takeover bid, particularly when security forces loyal to the incumbent leader foil the attack.
Christian Malanga
, an exiled former army captain who led a
violent attempt to seize power
in the Democratic Republic of Congo, is one such example. He was killed in the ensuing shootout in May 2024.
But most leaders of failed coups survive.
And although they typically face punishment, the severity of consequences varies greatly; it often depends on whether the attempt is
a self-coup
, which is a power grab by an incumbent leader, or an attempt to oust a sitting government.
Some coup plotters and their co-conspirators are charged in a court and, if convicted, sent to prison. Malanga’s American co-conspirators were ultimately sentenced to
life in prison
in April 2025.
A similar fate has now befallen Bolsonaro. His conviction means that unless successful on appeal, Bolsonaro could end his days in confinement.
Still, it could have been worse – failed coupmakers are often punished outside of independent courts, where the penalty is often more severe. Coup plotters have been summarily executed or sentenced to death by a military tribunal or a “people’s court.” The longtime Zairean dictator Mobutu Sese Seko
executed over a dozen junior officers
and civilians after his government uncovered an alleged coup plot in 1978.
One recent estimate suggests
40% of coup conspirators suffer relatively light punishment
. Many coup backers are
simply demoted
or purged
from the government without facing trial or execution. An especially popular move is to send coup plotters into exile to discourage their supporters from mobilizing against the regime. Former Haitian president
Dumarsais Estimé
was forced into exile after his self-coup attempt failed in May 1950; he died in the U.S. a few years later.
Punishment doesn’t always end threat
The problem facing governments is that failed putschists pose a lingering political threat. Ousted leaders often plot “counter-coups” to return to power. For example, former president of the Philippines Ferdinand Marcos, after being ousted in the 1986 People Power movement,
masterminded coup plots from exile
, though he never returned to power.
Some succeed, such as
David Dacko
, who returned from exile to grab power in the Central African Republic in 1979, but only with the help of French forces.
Even when convicted or exiled, coup plotters may be later freed. Some members of Brazil’s Congress had already, prior to the verdict, introduced a bill that could
grant Bolsonaro amnesty
.
A few former failed coup leaders manage to come to power later. Ghana’s
Jerry Rawlings
led a failed coup in May 1979 but went on to seize power in subsequent coups in June 1979 and 1981;
Hugo Chavez
was convicted and jailed for leading a failed coup in 1992 but ended up being elected president in Venezuela in 1998.
Others escape punishment entirely. In February 2020, El Salvador's President Nayib Bukele marched armed soldiers into the country's Legislative Assembly to pressure lawmakers into funding his security plan. Though Bukele temporarily backed down, he faced no legal or political backlash. His party won a legislative supermajority in 2021, and he won reelection in 2024. Bukele's ruling party recently
lifted presidential term limits
, allowing him to potentially
rule for life
.
The good news about punishing unsuccessful coup plotters is that because they’ve failed, they do not have to be coaxed out of power. Thus, holding them accountable for their actions should deter future plotters from attempting the same thing. In contrast, for a leader who has done unsavory things while still in office – such as
killing domestic dissidents
or
committing war crimes
– the
threat of punishment
once they leave power can
backfire
by giving them a reason to
fight to stay in power
.
In the long term, failed coup leaders who escape punishment are more likely to make a political comeback.
When defeated at the polls, both Donald Trump and Bolsonaro tried to overturn the official results. Both attempted to
alter
vote
totals
after they had lost
and
block
an election winner from being inaugurated.
In contrast, a conviction for Bolsonaro means it is now unlikely he will follow the same path to political resurrection. Even if he’s
eventually pardoned
, a guilty verdict makes him
ineligible
to compete again for Brazil’s presidency.
This article is republished from
The Conversation
under a Creative Commons license. Read the
original article
.
The Conversation is a nonprofit, independent news organization dedicated to unlocking the knowledge of experts for the public good. Get fact-based journalism written by experts in your inbox each morning with a
Conversation newsletter
.
New HybridPetya ransomware can bypass UEFI Secure Boot
Bleeping Computer
www.bleepingcomputer.com
2025-09-12 18:18:07
A recently discovered ransomware strain called HybridPetya can bypass the UEFI Secure Boot feature to install a malicious application on the EFI System Partition. [...]...
A recently discovered ransomware strain called HybridPetya can bypass the UEFI Secure Boot feature to install a malicious application on the EFI System Partition.
HybridPetya appears inspired by the destructive Petya/NotPetya malware that encrypted computers and prevented Windows from booting in attacks in
2016
and
2017
but did not provide a recovery option.
Researchers at cybersecurity company ESET found a sample of HybridPetya on VirusTotal. They note that this may be a research project, a proof-of-concept, or an early version of a cybercrime tool still under limited testing.
Still, ESET says that its presence is yet another example (along with
BlackLotus
,
BootKitty
, and Hyper-V Backdoor) that UEFI bootkits with Secure Bypass functionality are a real threat.
HybridPetya incorporates characteristics from both Petya and NotPetya, including the visual style and attack chain of these older malware strains.
However, the developer added new things like installation into the EFI System Partition and the ability to bypass Secure Boot by exploiting the
CVE-2024-7344
vulnerability.
ESET discovered the flaw in January this year. The issue lies in a Microsoft-signed application that could be exploited to deploy bootkits even with Secure Boot protection active on the target.
Execution logic
Source: ESET
Upon launch, HybridPetya determines if the host uses UEFI with GPT partitioning and drops a malicious bootkit into the EFI System partition consisting of several files.
These include configuration and validation files, a modified bootloader, a fallback UEFI bootloader, an exploit payload container, and a status file that tracks the encryption progress.
ESET lists the following files used across analyzed variants of HybridPetya:
\EFI\Microsoft\Boot\config (encryption flag + key + nonce + victim ID)
\EFI\Microsoft\Boot\verify (used to validate correct decryption key)
\EFI\Microsoft\Boot\counter (progress tracker for encrypted clusters)
\EFI\Microsoft\Boot\bootmgfw.efi.old (backup of original bootloader)
\EFI\Microsoft\Boot\cloak.dat (contains XORed bootkit in Secure Boot bypass variant)
Also, the malware replaces \EFI\Microsoft\Boot\bootmgfw.efi with the vulnerable ‘reloader.efi,’ and removes \EFI\Boot\bootx64.efi.
The original Windows bootloader is also saved so that it can be reactivated if restoration succeeds, that is, if the victim pays the ransom.
Once deployed, HybridPetya triggers a BSOD displaying a bogus error, as Petya did, and forces a system reboot, allowing the malicious bootkit to execute upon system boot.
At this step, the ransomware encrypts all MFT clusters using a Salsa20 key and nonce extracted from the config file while displaying a fake CHKDSK message, like NotPetya.
Fake CHKDSK message
Source: ESET
Once the encryption completes, another reboot is triggered and the victim is served a ransom note during system boot, demanding a Bitcoin payment of $1,000.
HybridPetya's ransom note
Source: ESET
In exchange, the victim is provided a 32-character key they can enter on the ransom note screen, which restores the original bootloader, decrypts the clusters, and prompts the user to reboot.
Though HybridPetya has not been observed in any real attacks in the wild, threat actors could weaponize this or similar proof-of-concept code and use it in broad campaigns targeting unpatched Windows systems at any time.
Indicators of compromise to help defend against this threat have been made available on this
GitHub repository
.
Microsoft fixed CVE-2024-7344 with the
January 2025 Patch Tuesday
, so Windows systems that have applied this or later security updates are protected from HybridPetya.
Another solid practice against ransomware is to keep offline backups of your most important data, allowing free and easy system restoration.
Frustrated federal appeals court judges publicly wrestled Thursday with how to follow vague “signals” from the Supreme Court contained in tersely worded — and often unexplained — orders handed down on the justices’ emergency docket.
Some judges on the 4th Circuit Court of Appeals even questioned whether they still had a role to play or were expected, at least in some cases, to simply reiterate the high court’s orders and leave it at that.
“They’re leaving the circuit courts, the district courts out in limbo,” said Judge James Wynn, an Obama appointee, during oral arguments in a case about the Department of Government Efficiency employees’ access to Social Security data. “We’re out here flailing. … I’m not criticizing the justices. They’re using a vehicle that’s there, but they are telling us nothing. They could easily just give us direction and we would follow it.”
“They cannot get amnesia in the future because they didn’t write an opinion on it. Write an opinion,” Wynn said. “We need to understand why you did it. We judges would just love to hear your reasoning as to why you rule that way. It makes our job easier. We will follow the law. We will follow the Supreme Court, but we’d like to know what it is we are following.”
During the
remarkable, 80-minute venting session
Thursday at the Richmond-based court, judges openly debated how to follow the justices’ sparse guidance while fulfilling their own constitutional duty to issue detailed rulings in complex cases.
“The Supreme Court’s action must mean something,” said Judge J. Harvie Wilkinson III, a Reagan appointee. “It doesn’t do these things just for the kicks of it.”
Much of the uncertainty was triggered by
a July order
from the justices that emphatically informed lower courts they are obliged to follow the Supreme Court’s rulings on its so-called emergency docket, even in cases that have not yet reached the high court.
The 4th Circuit judges are far from the first to suggest the Supreme Court is muddying the waters with some of its emergency docket decisions.
In
a ruling last week in a high-profile case
over the Trump administration’s targeting of Harvard University, U.S. District Judge Allison Burroughs said a series of high court rulings on the administration’s efforts to end various government grants “have not been models of clarity, and have left many issues unresolved.” She added that sometimes those rulings appear to shift the law “without much explanation or consensus.”
Despite recent umbrage expressed by Justices Neil Gorsuch and Brett Kavanaugh —
suggesting lower court judges had flippantly defied high court emergency-docket rulings
— the public debate among the 4th Circuit judges underscored continuing confusion on the lower courts about precisely how to follow those decisions in practice, especially when they lack any detailed reasoning.
Compounding that issue in the case argued Thursday, about sensitive Social Security Administration information, is that in June
the Supreme Court overruled the 4th Circuit’s decision
and lifted an injunction against DOGE’s use of the data.
By an apparent 6-3 vote, the justices went further, saying that no matter what the appeals court decided, the injunction would remain on hold until the case returned to the Supreme Court. Yet, the high court’s majority offered no substantive rationale for the lower court to parse.
That left many of the 15 4th Circuit judges on hand for Thursday’s unusual en banc arguments puzzling at their role. One even suggested the appeals court should simply issue a one-line opinion saying the injunction is lifted and kick the case back to the Supreme Court to resolve.
However, some judges said they were obliged to consider and rule on the case like any other, even if the high court has already decided that any such ruling would be of no immediate effect.
“It sounds like some of my colleagues think that there’s no work to be done, that we’re done because the Supreme Court has told us what the answer is,” said Judge Albert Diaz, an Obama appointee.
Judge Robert King said punting on the case would be a mistake.
“We each have a commission and we have a robe and we have an oath to abide by,” said King, a Clinton appointee.
Diaz opened Thursday’s court session by observing the anniversary of the Sept. 11, 2001, terrorist attacks and offering a prayer for the family of Turning Point founder Charlie Kirk, who was
shot dead in Utah
Wednesday, as well as those involved in a school shooting in Colorado the same day that left two students wounded and the apparent shooter dead.
“We pray for them and also more importantly hope and pray that somehow this country will find a way to end the senseless violence that seems to have overtaken us in a way that has become almost unimaginable,” Diaz said.
Josh Gerstein
is POLITICO’s Senior Legal Affairs Reporter. Gerstein covers the intersection of law and politics, including Special Counsel Robert Mueller’s investigation of President Donald Trump and his associates, as well as ensuing counter-investigations into the origins of the FBI’s initial inquiry into the Trump-Russia saga.
Kyle Cheney
is a senior legal affairs reporter for POLITICO. Cheney joined the Congress team after covering the 2016 presidential election on POLITICO's politics team. He covered the Republican primary field with a focus on the national GOP, the Republican National Convention and the internal machinations of the party as it adjusted to the emergence of Donald Trump.
I host a bunch of hobby code on my server. I would think it’s really only interesting to me, but it turns out every day, thousands of people from all over the world are digging through my code, reviewing years old changesets. On the one hand, wow, thanks, this is very flattering. On the other hand, what the heck is wrong with you?
This has been building up for a while, and I’ve been intermittently developing and deploying
countermeasures
. It’s been a lot like solving a sliding block puzzle. Lots of small moves and changes, and eventually it starts coming together.
My primary principle is that I’d rather not annoy real humans more than strictly intended. If there’s a challenge, it shouldn’t be too difficult, but ideally, we want to minimize the number of challenges presented. You should never suspect that I suspected you of being an enemy agent.
First measure is we only challenge on the deep URLs. So, for instance, I can link to the
anticrawl
repo no problem, or even the source for
anticrawl.go
, and that’ll be served immediately. All the pages any casual browser would visit make up less than 1% of the possible URLs that exist, but probably contain 99% of the interesting content.
Also, these pages get cached by the reverse proxy first, so anticrawl doesn’t even evaluate them. We’ve already done the work to render the page, and we’re trying to shed load, so why would I want to increase load by generating challenges and verifying responses? It annoys me when I click a seemingly popular blog post and immediately get challenged, when I’m 99.9% certain that somebody else clicked it two seconds before me. Why isn’t it in cache? We must have different objectives in what we’re trying to accomplish. Or who we’re trying to irritate.
The next step is that anybody loading
style.css
gets marked friendly. Big Basilisk doesn’t care about my artisanal styles, but most everybody else loves them. So if you start at a normal page, and then start clicking deeper, that’s fine, still no challenge. (Sorry lynx browsers, but don’t worry, it’s not game over for you yet.)
And then let’s say somebody directly links to a changeset like
/r/vertigo/v/b5ea481ff167
The first visitor will probably hit a challenge, but then we record that URL as in use. The bots are shotgun crawling all over the place, but if a single link is visited more than once, I'll assume it's human traffic, and bypass the challenge. No promises, but clicking that link will most likely just return content, no challenge.
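To make that decision flow concrete, here is a minimal sketch in Rust of the three heuristics described so far: shallow URLs always pass, clients that fetched the stylesheet are marked friendly, and a deep URL seen more than once stops being challenged. This is not the actual anticrawl code (which is written in Go), every name is illustrative, and the "/v/" test for deep changeset URLs is just an assumption drawn from the log samples further down.

use std::collections::HashSet;

// Illustrative state; the real anticrawl presumably tracks this differently.
struct Anticrawl {
    // Clients that recently fetched `style.css`.
    friendly_clients: HashSet<String>,
    // Deep URLs that have already been visited at least once.
    seen_urls: HashSet<String>,
}

enum Decision {
    Serve,
    Challenge,
}

impl Anticrawl {
    fn decide(&mut self, client: &str, path: &str) -> Decision {
        // Loading the stylesheet marks the client friendly.
        if path.ends_with("/style.css") {
            self.friendly_clients.insert(client.to_string());
            return Decision::Serve;
        }

        // Only deep URLs (assumed here to be changeset pages under `/v/`)
        // are ever challenged; everything else is served immediately.
        let is_deep = path.contains("/v/");
        if !is_deep || self.friendly_clients.contains(client) {
            return Decision::Serve;
        }

        // A deep URL visited more than once is assumed to be human traffic,
        // so the second and later visitors skip the challenge.
        if !self.seen_urls.insert(path.to_string()) {
            return Decision::Serve;
        }

        // First hit on a deep URL from an unknown client: challenge it.
        Decision::Challenge
    }
}

fn main() {
    let mut ac = Anticrawl {
        friendly_clients: HashSet::new(),
        seen_urls: HashSet::new(),
    };
    // First visitor to a deep changeset URL gets challenged…
    assert!(matches!(
        ac.decide("46.183.108.190", "/r/vertigo/v/b5ea481ff167"),
        Decision::Challenge
    ));
    // …but a repeat visit to the same URL is served without a challenge.
    assert!(matches!(
        ac.decide("203.0.113.7", "/r/vertigo/v/b5ea481ff167"),
        Decision::Serve
    ));
}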
The very first version of anticrawl relied on a weak POW challenge (find a SHA hash with first byte 42), just to get something launched, but this does seem
counterintuitive
. Why are we making humans solve a challenge optimized for machines? Instead I have switched to a much more diabolical challenge. You are asked how many Rs in strawberry. Or maybe something else. To be changed as necessary. But really, the key observation is that any challenge, anything at all, easily sheds like 99.99% of the crawling load.
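For what it's worth, a challenge like that original one is trivial to express. The sketch below is a guess at the shape of it, not the code that shipped: verify that the SHA-256 of the challenge string plus the visitor's answer starts with byte 42, a 1-in-256 target that a machine clears almost instantly (it assumes the sha2 crate).

use sha2::{Digest, Sha256};

// Verify a proof-of-work response: the hash of the challenge string plus the
// client's nonce must start with byte 42. The exact input format is assumed.
fn verify_pow(challenge: &str, nonce: &str) -> bool {
    let digest = Sha256::digest(format!("{challenge}{nonce}").as_bytes());
    digest[0] == 42
}

// Brute-force a nonce; on average this takes about 256 attempts, which is
// exactly why such a challenge counts as weak.
fn solve_pow(challenge: &str) -> String {
    (0u64..)
        .map(|n| n.to_string())
        .find(|nonce| verify_pow(challenge, nonce))
        .expect("a 1-in-256 target is always satisfiable")
}

fn main() {
    let nonce = solve_pow("anticrawl-challenge");
    assert!(verify_pow("anticrawl-challenge", &nonce));
    println!("solved with nonce {nonce}");
}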
Notably, because the challenge does not include its own javascript solver, even a smart crawler isn’t going to solve it automatically. If you include the solution on the challenge page, at least some bots are going to use it. All anticrawl challenges now require some degree of contemplation, not just blind interpretation.
It took a few iterations because the actual deployment involves a few pieces. I had to reduce the
style.css
cache time, so that visitors would periodically refresh it (and thus their humanity). And then exclude it from the caching proxy, so that the request would be properly observed. Basically, a few minutes tinkering now and then while I wait for my latte to arrive, and now I think I’ve gotten things to the point where it’s unlikely to burden anybody except malignant crawlers.
elsewhere
I have focused my bot detection efforts on humungus because the ratio of crawler to legit traffic was out of control. But now that I know what to look for, I see the same patterns scraping everywhere else. Seems really unlikely a worldwide collective of Opera users is suddenly interested in my old honks. I'm starting to deploy similar countermeasures.
appendix
Some log samples. There's always somebody who insists these could be real humans, and that I have somehow misjudged them. Make your own decision.
The big brain solution is you just cache all these requests, but unfortunately I have slightly less than 2TB RAM. Dealing with these relentless scans renders the cache for actual content less useful because everything gets LRUd out.
Take a look at this guy. Apparently a pro starcraft player has taken up speed browsing as a side hustle, middle clicking ten times per second. These links aren’t even on the same page, so he’s switching tabs between clicks, too. Amazing. But he doesn’t solve a single challenge. I thought gamers liked puzzles? Maybe that’s why this totally real human quit gaming.
logs
46.183.108.190 1.361957ms humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/azorius/v/a0824eb087c7 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 2.450003ms humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/azorius/v/1a4d35ff94ef HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 3.049854ms humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/azorius/v/9ca7ed390641 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 3.215804ms humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/humungus/v/67e77258e203 HTTP/1.1" 402 1809 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 3.26703ms humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/honk/v/2fc6f904deaa HTTP/1.1" 402 1805 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 830.924µs humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/humungus/v/1437b0d26457 HTTP/1.1" 402 1809 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 1.23004ms humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/honk/v/945572a3b51d HTTP/1.1" 402 1805 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 789.496µs humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/humungus/v/63f9c1f17606 HTTP/1.1" 402 1809 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 841.414µs humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/humungus/v/b5711a883e66 HTTP/1.1" 402 1809 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 916.233µs humungus.tedunangst.com [2025/07/08 13:41:32] "GET /r/humungus/v/a5307922b3f5 HTTP/1.1" 402 1809 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 565.518µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/honk/v/ab1e84cac5e6 HTTP/1.1" 402 1805 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 574.083µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/honk/v/a9043d011e41 HTTP/1.1" 402 1805 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 602.026µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/azorius/v/1cc1393b6832 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 497.831µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/azorius/v/4d53be2bdbd5 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 516.365µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/honk/v/302e58335796 HTTP/1.1" 402 1805 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 614.239µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/azorius/v/1a4d35ff94ef HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 530.161µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/azorius/v/a0824eb087c7 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 625.91µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/azorius/v/9ca7ed390641 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 757.497µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/humungus/v/67e77258e203 HTTP/1.1" 402 1809 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
46.183.108.190 841.494µs humungus.tedunangst.com [2025/07/08 13:41:33] "GET /r/vertigo/v/6b3ffb3b21f5 HTTP/1.1" 402 1808 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0"
I think there’s a common perception that 10 req/s just isn’t that much, based on some simple benchmarks. But that doesn’t account for TLS, etc. 10 handshakes/s requires a bit more juice than GET hello. I’ve worked to keep response times in the low millisecond range, just seems like good sense, but I think people should be allowed to program in slower languages and frameworks. You shouldn’t need a fairly substantial EPYC server like I have, either.
And there’s always stuff happening in the background. Mastodon hits me once a second every time anybody deletes something. Lemmy hits me twice a second every time somebody likes anything. There’s a bunch of nitwits with misconfigured RSS readers.
A 4x overhead in one area doesn't matter, but combine a CPU 1/4 as powerful, with 1/4 as many cores, and a language 1/4 as fast, all of which are entirely realistic, and pretty soon we're running close to the edge.
Abstract:
K2-Think is a reasoning system that achieves state-of-the-art performance with a 32B parameter model, matching or surpassing much larger models like GPT-OSS 120B and DeepSeek v3.1. Built on the Qwen2.5 base model, our system shows that smaller models can compete at the highest levels by combining advanced post-training and test-time computation techniques. The approach is based on six key technical pillars: Long Chain-of-thought Supervised Finetuning, Reinforcement Learning with Verifiable Rewards (RLVR), Agentic planning prior to reasoning, Test-time Scaling, Speculative Decoding, and Inference-optimized Hardware, all using publicly available open-source datasets. K2-Think excels in mathematical reasoning, achieving state-of-the-art scores on public benchmarks for open-source models, while also performing strongly in other areas such as Code and Science. Our results confirm that a more parameter-efficient model like K2-Think 32B can compete with state-of-the-art systems through an integrated post-training recipe that includes long chain-of-thought training and strategic inference-time enhancements, making open-source reasoning systems more accessible and affordable. K2-Think is freely available at
this http URL
, offering best-in-class inference speeds of over 2,000 tokens per second per request via the Cerebras Wafer-Scale Engine.
Submission history
From: Taylor Killian
[v1] Tue, 9 Sep 2025 11:25:55 UTC (3,106 KB)
Behind the Blog: 'Free Speech' and Open Dialogue
404 Media
www.404media.co
2025-09-12 18:00:59
This week, we discuss "free speech," keeping stupid thoughts in one's own head, and cancel culture....
This is Behind the Blog, where we share our behind-the-scenes thoughts about how a few of our top stories of the week came together. This week, we discuss "free speech," keeping stupid thoughts in one's own head, and cancel culture.
JASON:
In August 2014,
I spoke to Drew Curtis
, the founder of Fark.com, a timeless, seminal internet website, about a decision he had just made. Curtis banned misogyny from his website, partially in the name of facilitating free speech.
“We don't want to be the He Man Woman Hater's Club. This represents enough of a departure from pretty much how every other large internet community operates that I figure an announcement is necessary,” Curtis
wrote
when he announced the rule. “Adam Savage once described to me the problem this way: if the Internet was a dude, we'd all agree that dude has a serious problem with women.”
QGIS is a full-featured, user-friendly, free-and-open-source (FOSS) geographical information system (GIS) that runs on Unix platforms, Windows, and macOS.
Data abstraction framework, with local files, spatial databases (PostGIS, SpatiaLite, SQL Server, Oracle, SAP HANA), and web services (WMS, WCS, WFS, ArcGIS REST) all accessed through a unified data model and browser interface, and as flexible layers in user-created projects
Spatial data creation via visual and numerical digitizing and editing, as well as georeferencing of raster and vector data
On-the-fly reprojection between coordinate reference systems (CRS)
Nominatim (OpenStreetMap) geocoder access
Temporal support
Example: Temporal animation
Example: 3D map view
2. Beautiful cartography
Large variety of rendering options in 2D and 3D
Fine control over symbology, labeling, legends and additional graphical elements for beautifully rendered maps
Respect for embedded styling in many spatial data sources (e.g. KML and TAB files, Mapbox-GL styled vector tiles)
In particular, near-complete replication (and significant extension) of symbology options that are available in proprietary software by ESRI
Advanced styling using data-defined overrides, blending modes, and draw effects
500+ built-in color ramps (cpt-city, ColorBrewer, etc.)
Create and update maps with specified scale, extent, style, and decorations via saved layouts
Generate multiple maps (and reports) automatically using QGIS Atlas and QGIS Reports
Display and export elevation profile plots with flexible symbology
Flexible output direct to printer, or as image (raster), PDF, or SVG for further customization
On-the-fly rendering enhancements using geometry generators (e.g. create and style new geometries from existing features)
Preview modes for inclusive map making (e.g. monochrome, color blindness)
Powerful processing framework with 200+ native processing algorithms
Access to 1000+ processing algorithms via providers such as GDAL, SAGA, GRASS, OrfeoToolbox, as well as custom models and processing scripts
Geospatial database engine (filters, joins, relations, forms, etc.), as close to datasource- and format-independent as possible
Immediate visualization of geospatial query and geoprocessing results
Model designer and batch processing
Example: Travel isochrones
Example: Model designer
4. Powerful customization and extensibility
Fully customizable user experience, including user interface and application settings that cater to power-users and beginners alike
Rich
expression engine
for maximum flexibility in visualization and processing
Broad and varied
plugin ecosystem
that includes data connectors, digitizing aids, advanced analysis and charting tools,
in-the-field data capture, conversion of ESRI style files, etc.
Style manager for creating, storing, and managing styles
Python and C++ API for standalone (headless) applications as well as in-application comprehensive scripting (PyQGIS)
Example: Style manager
Example: Plugins
5. QGIS Server
Headless map server -- running on Linux, macOS, Windows, or in a docker container -- that shares the same code base as QGIS.
Industry-standard protocols (WMS, WFS, WFS3/OGC API for Features and WCS) allow plug-n-play with any software stack
Works with any web server (Apache, nginx, etc) or standalone
All beautiful QGIS cartography is supported with best-in-class support for printing
Fully customizable with Python scripting support
Example: QGIS server WMS response
Example: QGIS server WFS response
Under the hood
QGIS is developed using the
Qt toolkit
and C++, since 2002, and has a pleasing, easy-to-use graphical
user interface with multilingual support. It is maintained by an active developer team and supported by a vibrant
community of GIS professionals and enthusiasts, as well as geospatial data publishers and end-users.
Versions and release cycle
QGIS development and releases follow a
time based schedule/roadmap
. There are three main branches of QGIS that users can install. These are the
Long Term Release (LTR)
branch, the
Latest Release (LR)
branch, and the
Development (Nightly)
branch.
Every month, there is a
Point Release
that provides bug-fixes to the LTR and LR.
Free and Open Source
QGIS is released under the GNU General Public License (GPL) Version 2 or any later version.
Developing QGIS under this license means that you can (if you want to) inspect
and modify the source code, and guarantees that you, our happy user, will always
have access to a GIS program that is free of cost and can be freely
modified.
QGIS is part of the Open-Source Geospatial Foundation (
OSGeo
), offering a range of complementary open-source GIS software projects.
Installing and using QGIS
Precompiled binaries for QGIS are available at
the QGIS.org download page
.
Please follow the installation instructions carefully.
The
building guide
can be used to get started with building QGIS from source.
Chat with other users in real time.
Please wait around for a response to your question as many folks on the channel are doing other things and it may take a while for them to notice your question. The following paths all take you to the same chat room:
Using an IRC client and joining the
#qgis
channel on irc.libera.chat.
At the
GIS stackexchange
or
r/QGIS reddit
, which are not maintained by the QGIS team, but where the QGIS and broader GIS community provides lots of advice
We are excited to announce Vectroid, a serverless vector search solution that delivers exceptional accuracy and low latency in a cost effective package. Vectroid is not just
another vector search
solution—it’s a search engine that performs and scales in all scenarios.
Why we built Vectroid
Talk to any team working with large, low latency vector workloads and you’ll hear a familiar story: something always has to give. Vector databases often make significant tradeoffs between speed, accuracy, and cost. That’s how the mathematical underpinnings of vector search work—taking algorithmic
shortcuts
to get near-perfect results in a short amount of time.
There are some common permutations of these tradeoffs:
Very high accuracy, but very expensive and slow
Fast speed and tolerable accuracy, but very expensive
Cheap and fast, but inaccurate to a disqualifying degree
Based on the existing vector database landscape, it would seem that building a cost effective vector database requires sacrificing either speed or accuracy at scale. We believe that’s a false premise: building a cost-efficient database
is
possible with high accuracy and low latency. We just need to rethink our underlying mechanism.
Our “aha” moment
Query speed and recall are largely a function of the chosen ANN algorithm. Algorithms which are both fast
and
accurate like HNSW (Hierarchical Navigable Small Worlds) are memory intensive and expensive to index. The traditional assumption is that these types of algorithms are untenable for a cost-conscious system.
We had two major realizations which challenged this assumption.
Demand for in-memory HNSW is not static.
Real world usage patterns are bursty and uneven. A cost efficient database can optimize for this reality by making resource allocation dynamic and by individually scaling system components as needed.
HNSW’s memory footprint is tunable.
It can easily be flattened (e.g., by compressing vectors using quantization) and expanded (e.g., by increasing layer count), which gives us the flexibility to experiment with different configurations to find a goldilocks setup.
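As a rough illustration of that second point, here is a toy scalar quantizer in Rust. It is not Vectroid's code, and the scheme is deliberately naive (a single global range, one byte per dimension); real systems use per-dimension ranges or product quantization. It only shows the trade being made: 4 bytes per dimension shrink to 1, at the cost of some precision.

// A toy scalar quantizer: map each f32 component into a u8 bucket, shrinking
// a vector from 4 bytes per dimension to 1 (a 4x memory reduction).
struct ScalarQuantizer {
    min: f32,
    max: f32,
}

impl ScalarQuantizer {
    fn quantize(&self, v: &[f32]) -> Vec<u8> {
        let range = self.max - self.min;
        v.iter()
            .map(|&x| (((x - self.min) / range).clamp(0.0, 1.0) * 255.0).round() as u8)
            .collect()
    }

    fn dequantize(&self, q: &[u8]) -> Vec<f32> {
        let range = self.max - self.min;
        q.iter()
            .map(|&b| self.min + (b as f32 / 255.0) * range)
            .collect()
    }
}

fn main() {
    let q = ScalarQuantizer { min: -1.0, max: 1.0 };
    let v = vec![0.12_f32, -0.87, 0.5, 0.99];
    let compressed = q.quantize(&v);
    let restored = q.dequantize(&compressed);
    // The round-trip loses a little precision but keeps the vector's shape.
    println!("{v:?} -> {compressed:?} -> {restored:?}");
}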
What is Vectroid?
Vectroid is a serverless vector database with premium performance. It delivers the same or stronger balance of speed and recall promised by high-end offerings, but costs less than competitors.
Performant vector search: HNSW for ultra fast, high recall similarity search.
Near real-time search capabilities: Newly ingested records are searchable almost instantly.
Massive scalability: Seamlessly handles billions of vectors in a single namespace.
The core philosophy of Vectroid is that optimizing for one metric at any cost to the others doesn’t make for a robust system. Instead, vector search should be designed for balanced performance across recall, latency, and cost so users don’t have to make painful tradeoffs as workloads grow.
When tested against other state-of-the-art vector search offerings, Vectroid is not only competitive but the most consistent across the board. Across all of our tests, Vectroid is the only database that was able to maintain over 90% recall while scaling to 10 query threads per second—all while keeping good latency scores.
Some early benchmarks:
Indexed 1B vectors (Deep1B) in ~48 minutes
Achieved P99 latency of 34ms on the MS Marco 138M vector / 1024 dimensions dataset
We’ll be releasing the full benchmark suite (with setup details so anyone can reproduce them) in an upcoming post. For now, these numbers highlight the kind of scale and performance we designed Vectroid to handle.
How Vectroid works
Vectroid is composed of two independently scalable microservices for writes and reads.
As the diagram shows, index state, vector data, and metadata are persisted to cloud object storage (GCS for now, S3 coming soon). Disk, cache, and in-memory storage layers each employ a usage-aware model for index lifecycle in which indexes are lazily loaded from object storage on demand and evicted when idle.
For fast, high-recall ANN search, we chose the
HNSW algorithm
. It offers excellent latency and accuracy tradeoffs, supports incremental updates, and performs well across large-scale workloads. To patch its limitations, we added a number of targeted optimizations:
Slow indexing speed ⇒ in-memory write buffer to ensure newly inserted vectors are immediately searchable
High indexing cost ⇒ batched, highly concurrent and partitioned indexing
High memory usage ⇒ vector compression via quantization
Final Thoughts
We’re just getting started. If you’re building applications that rely on fast, scalable vector search (or you’re running up against the limits of your current stack), we’d love to hear from you. Start using Vectroid today or sign up for our newsletter to follow along as we continue building.
The Other End of the Weight Spectrum: Very Thin People
Portside
portside.org
2025-09-12 17:46:40
The Other End of the Weight Spectrum: Very Thin People
barry
Fri, 09/12/2025 - 12:46
...
Before weight coach Bella Barnes consults with new clients, she already knows what they’ll say. The women struggle with their weight, naturally. But they don’t want to lose pounds. They want to gain
them.
Her clients find themselves too thin, and they’re suffering. “Last week, I signed up a client who wears leggings that have bum pads in them,” says Barnes, who lives in Great Britain. “I’ve had another client recently that, in summer, wears three pairs of leggings just to try and make herself look a bit bigger.”
These women belong to a demographic group that has been widely overlooked. As the world focuses on its
billion-plus obese citizens
, there remain people at the other end of the spectrum who are skinny, often painfully so, but don’t want to be. Researchers estimate that around 1.9 percent of the population are “constitutionally thin,” with 6.5 million of these people in the United States alone.
Constitutionally thin individuals often eat as much as their peers and don’t exercise hard. Yet their body mass index is below 18.5 — and sometimes as low as 14, which translates to 72 pounds on a five-foot frame — and they don’t easily gain weight. The condition is “
a real enigma
,” write the authors of a recent paper in the
Annual Review of Nutrition
. Constitutional thinness, they say, challenges “basic dogmatic knowledge about energy balance and metabolism.” It is also understudied: Fewer than 50 clinical studies have looked at constitutionally thin people, compared with thousands on
unwanted weight gain
.
Recently, researchers have started to investigate how naturally thin bodies are different. The scientists hope to unlock metabolic insights that will help constitutionally thin people gain weight. The work may also help overweight people lose pounds, since constitutional thinness appears to be “a mirror model” of obesity, says Mélina Bailly, a coauthor of the recent review and a physiological researcher at AME2P, a metabolism research lab at the University Clermont Auvergne in France.
Genetic and metabolic factors
Individuals who eat heartily but remain inexplicably skinny were first reported in the scientific literature in 1933. Decades later, a landmark 1990 experiment demonstrated how profoundly people differ in regulating their weight.
Twelve pairs of identical twins were fed 1,000 surplus calories for six days a week. After three months of such overfeeding — equivalent to an extra Big Mac and medium fries daily —
the young men had gained an average of almost 18 pounds, mostly fat
, but within a large range: One gained almost 30 pounds and another fewer than 10. The latter had somehow dissipated around 60 percent of the extra energy.
The study also found that the variation of weight gain was three times greater between twin pairs than within them — indicating a genetic influence on the tendency to add pounds when overfed.
Other studies confirmed that constitutionally thin people largely “resist” weight gain, particularly when eating fatty foods. Whatever pounds they do gain through overfeeding rapidly vanish once they resume normal eating.
This aligns with current thinking to some extent. Many researchers believe that our bodies have a preprogrammed weight “set point” or “set range” to which they try to return. That’s one reason few dieters manage to keep off lost weight long-term. Their metabolism slows down, burning fewer calories and making weight regain easier, particularly once the dieter stops restricting calories. (The system displays some flexibility, explaining why many of us put on inches around our midsections over the years.)
‘Skinny shaming’
As a group, lean individuals are probably as heterogeneous as overweight people. Some may stay thin because they have smaller appetites or feel full sooner. Others consume just as many calories as heavier individuals. One study found that constitutionally thin people
eat 300-plus calories more per day than their metabolism needs
. “They have a positive energy balance and they still resist weight gain,” says Bailly, a collaborator on NUTRILEAN, a project focused on constitutional thinness, at University Clermont Auvergne in France.
Like obese people, constitutionally thin people face their own social stigma. Thin men may feel too scrawny to satisfy masculine ideals. Skinny women often lament lacking curves. People might suspect they are hiding
eating disorders
. They get “comments from random people on the street,” says Jens Lund, a postdoc in metabolic research at the Novo Nordisk Foundation Center for Basic Metabolic Research at the University of Copenhagen. “These people feel like they can’t go to toilet after a family dinner … because they are afraid that people would look at them as if they’re going out to puke, like having bulimia.”
Weight gain coach Barnes was never technically all the way in the constitutionally thin category, but she experienced plenty of such “skinny shaming” firsthand. Family members commented on her weight but dismissed her distress. “I felt like I could never speak about it,” she says. “People would be like, ‘That’s not a real problem,’ or ‘Just take some weight from me.’”
Where do the calories in constitutionally thin people go? Researchers have started eliminating possibilities.
A 2021 meta-analysis offered some surprises. When Bailly and colleagues compiled data on thin people’s body composition, they discovered something unexpected: Constitutionally thin individuals
carry nearly normal amounts of fat throughout their bodies
. “It’s really unusual to have such low body weight combined with quite normal fat mass,” says Bailly.
What seems to be lacking is muscle mass. Constitutionally thin people have less of it — research has found that they have muscle fibers that are on average about 20 percent smaller than those of normal-weight people. Constitutionally thin people may also have reduced
bone
mass.
These facts suggest that there are health costs to leanness. Though studies are lacking, Bailly suspects that as they age, especially thin women might run a higher risk of osteoporosis, a dangerous weakening of the bones. The reduced muscle mass could also make everyday tasks, like opening jars or carrying groceries, more arduous.
And it could mean fewer protein reserves during illness, says Julien Verney, a physiological researcher at Clermont Auvergne’s metabolic lab and coauthor of the
Annual Review of Nutrition
paper.
They may also excrete more calories than others. While this hasn’t been explored specifically for lean people, it’s known that some people lose up to 10 percent of ingested calories through feces (and to a lesser extent, urine), compared to just 2 percent in others. In one study,
a woman excreted 200 calories daily
— equivalent to half a liter of soda.
Additional metabolic idiosyncrasies of constitutionally thin people may still await discovery. “We recently found some clues that may suggest more metabolic activity of their fat mass tissues,” says Bailly. “This is really surprising.” Other studies have already suggested that naturally thin people have more “brown fat” — a calorie-burning tissue that generates body heat.
To find more specific answers, Lund plans to launch an inpatient study at the University of Copenhagen. The study will use a metabolic chamber to track energy intake, expenditure and all routes of energy loss — including feces, urine and exhaled gases — in constitutionally thin people. Since 2020, Lund’s team has assembled a network of Danes who self-report as naturally lean, providing a unique pool for future research.
Constitutional thinness, as the 1990 twin study showed, has a strong genetic component: Research shows that 74 percent of very lean people have relatives with similar stature. As researchers identify gene variants, they realize that many of these — with names like
FTO
,
MC4R
and
FAIM2
— are also involved in processes leading to obesity. Although they don't yet understand the specifics, scientists suspect that people with constitutional thinness may have unique activity patterns in genes related to energy production.
One such gene that has drawn researchers’ attention is
ALK
(anaplastic lymphoma kinase). When scientists deleted this gene in mice,
the animals became resistant to weight gain
when fed high-fat diets — even in mouse strains genetically prone to obesity. The
ALK
gene seems to act in the brain, which then sends signals affecting the rate at which fat cells burn energy.
Understanding genetic mechanisms like these could lead to new treatments for both unhappily thin and unhappily obese people, says Lund. “If you can figure out what protects them from developing overweight, then whatever that mechanism is, you can then try to turn that into a drug,” he says. “There are so many signaling molecules in the body that we don’t even know exist.” The dream is to find a breakthrough as transformative as the latest obesity medications.
While researchers hunt for biological clues, Bella Barnes navigates the complexities of weight gain on her own. After years of trial and error, she gradually gained about 40 pounds by combining strength training with careful, intentional eating. At first, if she hadn’t reached her calories for the day, she’d just grab a packet of cookies — anything to get the numbers up. But she found more balance over time. “Not all calories are the same. You want to be eating whole foods,” she says. And a lot of them.
Today, Barnes has coached more than a hundred women on her weight gain techniques and has a strong TikTok following; she says that she’s proud of the strong body she’s built.
Maybe five more pounds, she adds, “would make me at my happiest.”
From 19k to 4.2M events/sec: story of a SQLite query optimisation
Sit down comfortably. Take a cushion if you wish. This is,
clears its throat, the story of a funny performance quest. The
Matrix Rust SDK
is a
set of crates aiming at providing all the necessary tooling to develop robust
and safe
Matrix
clients. Of course, it involves databases to persist some
data. The Matrix Rust SDK supports multiple databases: in-memory,
SQLite
, and
IndexedDB
. This story is about the SQLite database.
The structure we want to persist is a novel type we have designed specifically
for the Matrix Rust SDK: a
LinkedChunk
. It's the underlying structure that
holds all events manipulated by the Matrix Rust SDK. It is somewhat similar to
a
linked list
; the differences are subtle and the goal of this article is
not
to present all the details. We have developed many APIs around this type
to make all operations fast and efficient in the context of the Matrix protocol.
What we need to know is that in a
LinkedChunk<_, Item, Gap>
, each node
contains a
ChunkContent<Item, Gap>
defined as:
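(A sketch of that definition follows; the actual SDK code may differ in its derives and payload details.)

pub enum ChunkContent<Item, Gap> {
    // The chunk is a gap between known events.
    Gap(Gap),
    // The chunk holds a set of items (Matrix events, in practice).
    Items(Vec<Item>),
}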
Put differently: each node can contain a gap, or a set of items (that is, Matrix events).
Le Comte
May I recapitulate?
Each Matrix
room
contains a
LinkedChunk
, which is a set of
chunks
. Each
chunk
is either a
gap
or a set of
events
. It seems to map fairly easily to
SQL tables, doesn't it?
You're right: it's pretty straightforward! Let's see the first table:
linked_chunks
which contains all the chunks. (Note that the schemas are
simplified for the sake of clarity).
CREATE TABLE "linked_chunks" (
    -- Which linked chunk does this chunk belong to?
    "linked_chunk_id" BLOB NOT NULL,
    -- Identifier of the chunk, unique per linked chunk.
    "id" INTEGER NOT NULL,
    -- Identifier of the previous chunk.
    "previous" INTEGER,
    -- Identifier of the next chunk.
    "next" INTEGER,
    -- Our enum for the content of the chunk: `E` for events, `G` for a gap.
    "type" TEXT CHECK("type" IN ('E', 'G')) NOT NULL,
    -- … other things …
);
Alrighty. Next contenders: the
event_chunks
and the
gap_chunks
tables, which
store the
ChunkContent
s of each chunk, respectively for
ChunkContent::Items
and
ChunkContent::Gap
. In
event_chunks
, each row corresponds to an event. In
gap_chunks
, each row corresponds to a gap.
CREATE TABLE "event_chunks" (
    -- Which linked chunk does this event belong to?
    "linked_chunk_id" BLOB NOT NULL,
    -- Which chunk does this event refer to?
    "chunk_id" INTEGER NOT NULL,
    -- The event ID.
    "event_id" BLOB NOT NULL,
    -- Position (index) in the **chunk**.
    "position" INTEGER NOT NULL,
    -- … other things …
);
CREATE TABLE "gap_chunks" (
    -- Which linked chunk does this gap belong to?
    "linked_chunk_id" BLOB NOT NULL,
    -- Which chunk does this gap refer to?
    "chunk_id" INTEGER NOT NULL,
    -- … other things …
);
Last contender,
events
. The assiduous reader may have noted that
event_chunks
doesn't contain the content of the events: only their IDs and positions,
rolls its eyes
… let's digress a bit, shall we? Why is that? To
handle out-of-band events. In the Matrix protocol, we can receive events via:
the
/sync
endpoint
, it's the main source of inputs, we get most
of the events via this API,
the
/messages
endpoint
, when we need to get events around
a particular event; this is helpful if we need to paginate backwards or
forwards around an event.
When an event is fetched but cannot be positioned regarding other events, it is
considered
out-of-band
: it belongs to zero linked chunk, but we keep it in the
database. Maybe we can attach it to a linked chunk later, or we want to keep it
for saving future network requests. Anyway. You're a great digression companion.
Let's jump back to our tables.
The
events
table contains
all
the events: in-band
and
out-of-band.
-- Events and their content.
CREATE TABLE "events" (
    -- The ID of the event.
    "event_id" BLOB NOT NULL,
    -- The JSON encoded content of the event (it's an encrypted value).
    "content" BLOB NOT NULL,
    -- … other things …
);
At some point, we need to fetch metadata about a
LinkedChunk
. A certain
algorithm needs these metadata to work efficiently. We don't need to load all
events, however we need:
to know all the chunks that are part of a linked chunk,
for each chunk, the number of events: 0 in case of a
ChunkContent::Gap
(
G
), or the number of events in case of a
ChunkContent::Items
(
E
).
A first implementation has landed in the Matrix Rust SDK. All good. When
suddenly…
Incredibly slow sync
A power-user
1
was
experiencing slowness
. It's always
a delicate situation. How do we find the reason for the slowness? Is it the device?
The network? The asynchronous runtime? A lock contention? The file system? …
The database?
We don't have the device within easy reach. Fortunately, Matrix users are always
nice and willing to help! We added a bunch of logs, then the user
reproduced the problem and shared their logs (via a rageshake) with us. Logs are
never trivial to analyse. However, here is a tip we use in the Matrix Rust SDK:
we have a special tracing type that logs the time spent in a portion of the
code, called
TracingTimer
.
Basically, when a
TracingTimer
is created, it keeps its creation time in
memory. And when the
TracingTimer
is dropped, it emits a log containing the
elapsed time since its creation. It looks like this (it uses
the
tracing
library
):
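(A sketch of the idea; the SDK's actual TracingTimer also carries a log level, target, and callsite metadata, and this version assumes the tracing crate.)

use std::time::Instant;

// A sketch only: the SDK's actual `TracingTimer` is richer than this.
pub struct TracingTimer {
    message: String,
    start: Instant,
}

impl TracingTimer {
    pub fn new(message: impl Into<String>) -> Self {
        Self { message: message.into(), start: Instant::now() }
    }
}

impl Drop for TracingTimer {
    fn drop(&mut self) {
        // Emit the elapsed time when the value goes out of scope.
        tracing::debug!(elapsed = ?self.start.elapsed(), "{}", self.message);
    }
}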
And with that, let's use its companion macro
timer!
(I won't copy-paste it
here, it's pretty straightforward):
{
    let _timer = timer!("built something important");

    // … build something important …

    // `_timer` is dropped here, and will emit a log.
}
With this technique, we were able to inspect the logs and see immediately what
was slow… assuming we had added
timer!
s at the right places! It's not magic,
it doesn't find performance issues for you. You have to probe the correct places
in your code, and refine if necessary.
Le Comte
I don't know if you heard about
sampling profilers
, but those are programs
far superior at analysing performance problems, compared to your… rustic
TracingTimer
(pun intended!). Such programs can provide flamegraphs, call
trees etc.
I'm personally a regular user of
samply
, a command line CPU profiler relying
on the
Firefox profiler
for its UI. It works on macOS, Linux and Windows.
I do also use
samply
pretty often! But you need access to the processes
to use such tools. Here, the Matrix Rust SDK is used and embedded inside Matrix
clients. We have no access to it. It lives on devices everywhere around the
world. We may use better log analysers to infer “call trees”, but supporting
asynchronous logs (because the code is asynchronous) makes it very difficult.
And I honestly don't know if such a thing exists.
So. Yes.
We found the culprit
. With
ripgrep
, we were able to scan
megabytes of logs and find the culprit pretty quickly. I was looking for lags of
the order of a second. I wasn't disappointed:
107 seconds. That is, 1 minute and 47 seconds. Hello, sweetie.
The slow query
load_all_chunks_metadata
is a method that runs this SQL query:
SELECT lc.id,
       lc.previous,
       lc.next,
       COUNT(ec.event_id) as number_of_events
FROM linked_chunks as lc
LEFT JOIN event_chunks as ec
    ON ec.chunk_id = lc.id
WHERE lc.linked_chunk_id = ?
GROUP BY lc.id
For each chunk of the linked chunk, it counts the number of events associated to
this chunk. That's it.
Do you remember that a chunk can be of two kinds:
ChunkContent::Items
if it
contains a set of events, or
ChunkContent::Gap
if it contains a gap, so, no
event.
This query does the following:
if the chunk is of kind
ChunkContent::Items
, it does count all events
associated to itself (via
ec.chunk_id = lc.id
),
otherwise, the chunk is of kind
ChunkContent::Gap
, so it will try to count
but… no event is associated to it: it's impossible to get
ec.chunk_id = lc.id
to be true for a gap. This query will scan
all events
for each
gap… for no reason whatsoever! This is a linear scan. If there are 300 gaps for this linked chunk, and 5000 events, 1.5 million events will be scanned for
no reason
!
How lovingly inefficient.
faster
Let's use an
INDEX
I hear you say (let's pretend
you're saying that, please, for the sake of the narrative!).
A database index provides rapid lookups after all. It has become a reflex
amongst the developer community.
Le Procureur
Indexes are designed to quickly locate data without scanning the full table. An
index contains a copy of the data, organised in a way enabling very efficient
search. Behind the scenes, it uses various data structures, involving trade-offs between lookup performance and index size. Most of the time, an index makes it possible to transform a linear lookup, O(n), into a logarithmic lookup, O(log n).
See
Database index
to learn more.
That's correct. But we didn't want to use an index here. The reason is twofold:
More space
. Remember that
Le Procureur
said an index contains a
copy
of the data. Here, the data is
the event ID
. It's not heavy, but
it's not nothing. Moreover, that is not even counting the key that associates the copied data with the row containing the real data in the source table.
Still extra useless time
. We would still need to traverse the index
for gaps, which is pointless.
SQLite implements indexes as
B-Trees
, which is really efficient, but still, we already know that a gap has zero events because… it's… a gap between events!
Do you remember that the
linked_chunks
table has a
type
column? It contains
E
when the chunk is of kind
ChunkContent::Items
—it represents a set of
events—, and
G
when of kind
ChunkContent::Gap
—it represents a gap—. Maybe…
stares into the void
Le Factotum
May I interrupt?
Do you know that SQLite provides
a
CASE
expression
? I know it's
unusual. SQL designers prefer to think in terms of sets, sub-sets, joins,
temporal tables, partial indexes… but honestly, as far as I'm concerned, in
our case, it's simple enough and it can be powerful. It's a maddeningly
pragmatic
match
statement.
Moreover, the
type
column is already typed as an enum with the
CHECK("type" IN ('E', 'G'))
constraint. Maybe the SQL engine can run some even smarter
optimisations for us.
Oh, that would be brilliant! If
type
is
E
, we count the number of events,
otherwise we conclude it's
de facto
zero, right? Let's try. The SQL query
then becomes:
SELECT lc.id,
       lc.previous,
       lc.next,
       CASE lc.type
           WHEN 'E' THEN (
               SELECT COUNT(ec.event_id)
               FROM event_chunks as ec
               WHERE ec.chunk_id = lc.id
           )
           ELSE 0
       END as number_of_events
FROM linked_chunks as lc
WHERE lc.linked_chunk_id = ?
Once we had spotted the problem, we wrote a benchmark to measure the solutions. The benchmark simulates 10'000 events, with 1 gap every 80 events. A set of data we consider reasonably realistic for a normal user (not for a power-user though, because a power-user usually has more gaps than events). Here are the before/after results.
            Lower bound       Estimate          Upper bound
Throughput  19.832 Kelem/s    19.917 Kelem/s    19.999 Kelem/s
R²          0.0880234         0.1157540         0.0857823
Mean        500.03 ms         502.08 ms         504.24 ms
Std. Dev.   2.2740 ms         3.6256 ms         4.1963 ms
Median      498.23 ms         500.93 ms         506.25 ms
MAD         129.84 µs         4.1713 ms         6.1184 ms

Benchmark's results for the original query with COUNT and LEFT JOIN.
(Figures: Benchmark's Probability Distribution Function and Iteration Times for the LEFT JOIN approach.)
            Lower bound       Estimate          Upper bound
Throughput  251.61 Kelem/s    251.84 Kelem/s    251.98 Kelem/s
R²          0.9999778         0.9999833         0.9999673
Mean        39.684 ms         39.703 ms         39.726 ms
Std. Dev.   8.8237 µs         35.948 µs         47.987 µs
Median      39.683 ms         39.691 ms         39.725 ms
MAD         1.9369 µs         13.000 µs         50.566 µs

Benchmark's results for the new query with the CASE expression.
(Figures: Benchmark's Probability Distribution Function and Linear Regression for the CASE approach.)
It's clearly better, but we couldn't stop ourselves. Having spotted the problem,
and having found this solution, we got creative! We noticed that
we are still running one query per chunk of kind
ChunkContent::Items
. If the linked
chunk contains 100 chunks, it will run 101 queries.
Then suddenly,
hits forehead with the palm of the hand, an idea pops! What if we could use only 2 queries for all scenarios?
The first query would count all events for each chunk in
event_chunks
in
one pass, and would store that in a
HashMap
,
The second query would fetch all chunks also in one pass,
Finally, Rust will fill the number of events for each chunk based on the data
in the
HashMap
.
The first query translates like so in Rust:
// The first query.
let number_of_events_by_chunk_ids = transaction
    .prepare(
        r#"
        SELECT
            ec.chunk_id,
            COUNT(ec.event_id)
        FROM event_chunks as ec
        WHERE ec.linked_chunk_id = ?
        GROUP BY ec.chunk_id
        "#,
    )?
    .query_map((&hashed_linked_chunk_id,), |row| {
        Ok((row.get::<_, u64>(0)?, row.get::<_, usize>(1)?))
    })?
    .collect::<Result<HashMap<_, _>, _>>()?;
Only two queries. All tests are passing. Now let's see what the benchmark has to say!
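Before the numbers, here is a rough sketch of what the second query and the merge step could look like, continuing from the snippet above (transaction, hashed_linked_chunk_id, and number_of_events_by_chunk_ids come from there); ChunkMetadata and the exact column list are illustrative assumptions rather than the SDK's actual code.

// A sketch only: the real SDK code differs in naming and detail.
struct ChunkMetadata {
    identifier: u64,
    previous: Option<u64>,
    next: Option<u64>,
    num_items: usize,
}

// The second query: fetch all chunks of this linked chunk in one pass…
let all_chunks = transaction
    .prepare(
        r#"
        SELECT lc.id, lc.previous, lc.next
        FROM linked_chunks as lc
        WHERE lc.linked_chunk_id = ?
        "#,
    )?
    .query_map((&hashed_linked_chunk_id,), |row| {
        let chunk_id = row.get::<_, u64>(0)?;

        Ok(ChunkMetadata {
            identifier: chunk_id,
            previous: row.get(1)?,
            next: row.get(2)?,
            // … and let Rust fill in the number of events from the map built
            // by the first query. A gap simply isn't in the map, so it
            // defaults to 0: no scan of `event_chunks` ever happens for it.
            num_items: number_of_events_by_chunk_ids
                .get(&chunk_id)
                .copied()
                .unwrap_or(0),
        })
    })?
    .collect::<Result<Vec<_>, _>>()?;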
            Lower bound       Estimate          Upper bound
Throughput  4.1490 Melem/s    4.1860 Melem/s    4.2221 Melem/s
R²          0.9961591         0.9976310         0.9960356
Mean        2.3670 ms         2.3824 ms         2.3984 ms
Std. Dev.   16.065 µs         26.872 µs         31.871 µs
Median      2.3556 ms         2.3801 ms         2.4047 ms
MAD         3.8003 µs         36.438 µs         46.445 µs

Benchmark's results for the two queries approach.
(Figures: Benchmark's Probability Distribution Function and Linear Regression for the two queries approach.)
It is roughly 17 times faster than the previous solution, and so roughly 210 times faster than the first query!
We went from 502ms to 2ms. That's mental! From a throughput of 19.9 Kelem/s
to 4.2 Melem/s!
You can see the patches containing the improvement
.
The throughput is measured by
element
, where an
element
here represents
a Matrix event. Consequently, 4 Melem/s means 4 million events per second,
which means that
load_all_chunks_metadata
can do its computation at a rate of
4 million events per second.
I think we can stop here. Performance is finally acceptable.
Notice how the SQL tables layout didn't change. Notice how the
LinkedChunk
implementation didn't change. Only the SQL queries have changed, and it has
dramatically improved the situation.
In the hours immediately after the conservative activist Charlie Kirk
was shot and killed
in front of a large crowd of students at a Utah university on Wednesday, there was no word on who had actually done it and no explanation for why it had happened. But, in Washington, those who profess certainty no longer need much in the way of facts: partisans come equipped with preëxisting truths, and events are slotted into narratives that existed long before the events occurred. Even before Kirk’s death had been confirmed, Nancy Mace, a Republican congresswoman from South Carolina, spoke to reporters outside the Capitol. “Democrats own what happened today,” she told them. When Ryan Nobles, the chief Capitol Hill correspondent for NBC News, asked her if, by that logic, Republicans would own the shooting this summer of two Minnesota Democratic lawmakers, she replied, “Are you kidding me? . . . Some raging leftist lunatic put a bullet through his neck and you want to talk about Republicans right now? No. . . . Democrats own this a hundred per cent.”
In a different time, it might have been easier to dismiss Mace as just playing to the cameras, and to take heart instead from the many statements rejecting political violence and expressing shock, horror, and solidarity that were already rolling in from Democrats and Republicans alike. Vice-President J.D. Vance offered a heartfelt eulogy on X, calling the thirty-one-year-old political provocateur, who had been his close friend, an exemplar of “a foundational virtue of our Republic: the willingness to speak openly and debate ideas.” But the visceral rage channelled by Mace was not an outlier. On the House floor, when Speaker Mike Johnson called for a moment of silent prayer for Kirk, members from both parties rose from their seats and the brief hush suggested that at least some of the old habits of ritual bipartisanship in a crisis might still be intact. Then a shouting match erupted, with
Lauren Boebert
, a Colorado Republican, loudly demanding more than a silent prayer and various Democrats objecting that there had been no prayer offered for students in a mass shooting that same day in Colorado. Anna Paulina Luna, a Florida Republican, shouted back at the Democrats, “You all caused this.”
A few hours later,
Donald Trump
reacted to Kirk’s death, in a four-minute Oval Office
video
that he posted on his social-media feed. There would be no Joe Biden-esque lectures about “the need for us to lower the temperature in our politics,” or about how, while “we may disagree, we are not enemies.” (Which was what Biden actually
said
when
Trump was grazed
by a would-be assassin’s bullet in the summer of 2024.) Instead, Trump explicitly laid blame for what he called a “heinous assassination” on his and Kirk’s political opponents. He neither cited any evidence nor seemed to think that any was necessary. He made no mention of any of the political attacks in recent years that have claimed Democratic victims, including, earlier this summer,
the shooting of two Minnesota state legislators
, one of whom died.
“For years, those on the radical left have compared wonderful Americans like Charlie to Nazis and the world’s worst mass murderers and criminals. This kind of rhetoric is directly responsible for the terrorism that we are seeing in the country today, and it must stop right now,” Trump said, before offering a list of other victims of “radical-left political violence,” including himself. He promised swift action to take down the perpetrators of such violence as well as “organizations” that fund and promote it. Trump’s remarkable threat somehow did not get much attention. It should have. Not only was the President not even trying to unite the country but he seemed to be blaming the large chunk of the nation that reviles his racially divisive policies and those promoted by Kirk as surely as if they had pulled the trigger.
Some of Trump’s most influential allies and advisers were clarifying what this could mean by explicitly calling for a crackdown on the American left—hardly consistent with the spirit of free expression that Kirk used as his rallying cry for recruiting a new generation of young conservatives. “It’s time for the Trump administration to shut down, defund, & prosecute every single Leftist organization,” Laura Loomer, a far-right conspiracy theorist who has successfully pushed Trump to fire a number of senior national-security officials, wrote on X. “We must shut these lunatic leftists down. Once and for all. The Left is a national security threat.”
Christopher Rufo
, another influential Trumpist, who led the move against diversity initiatives that eventually became a
core tenet
of the second Trump Administration, invoked the political convulsions of
the nineteen-sixties
. “The last time the radical Left orchestrated a wave of violence and terror, J. Edgar Hoover shut it all down within a few years,” he wrote. “It is time, within the confines of the law, to infiltrate, disrupt, arrest, and incarcerate all of those who are responsible for this chaos.”
And in case there was any mistaking the official view of such pronouncements, Trump’s deputy chief of staff Stephen Miller on Thursday joined in from the West Wing, promising in a lengthy post on X to wage war on the “wicked ideology” that had killed Kirk and the proponents of it who, he claimed, were online cheering Kirk’s death. “The fate of our children, our society, our civilization hinges on it,” Miller added. Dialing it down, they were not.
It was purely a sad coincidence that Kirk’s killing happened to fall just a day before September 11th, when Trump would be marking the twenty-fourth anniversary of the attacks on the United States. The destruction of the Twin Towers in New York by Osama bin Laden and his band of Islamic extremists brought forth the George W. Bush Administration’s “global war on terror”—another war against an ism that first motivated Miller and many other young conservatives to become politically active in the early two-thousands. Back in his student days, Miller launched a project to warn against the threat of “Islamofascism,” and portrayed the United States as having been forced into a worldwide conflict with radical Islamic jihadist ideology.
How striking it is, then, to read Miller’s manifesto about what he considers to be today’s chief threat, which, like much of Trump and his
MAGA
movement’s current rhetoric, is focussed not against external adversaries such as Russia and China but on the scary prospect of a violent enemy within, “an ideology that has been steadily growing in this country which hates everything that is good, righteous and beautiful and celebrates everything that is warped, twisted and depraved,” as Miller called it.
Although it’s fair to point out that much of what Miller wrote about today’s leftists in response to Kirk’s death is similar to what he might have said about Islamic terrorists a couple of decades ago, it’s not Miller’s lack of creativity that stands out, so much as the speed and explicitness with which he—and Trump—chose to exploit the shooting of one of their most important allies in service of a sweeping attack on the American political left.
While others were praying for a sane conversation around how to end the rapidly escalating problem of violence across the political spectrum, the President and his close adviser defined the crisis differently: it was about the American right under siege—and what Trump was going to do about it. The point here was clear for those who chose to listen: the President doesn’t care one bit about all those sanctimonious calls for healing. It is not a dialogue about the crisis of political violence in America that he wants right now but an aggressive new policy of political vengeance.
CISA warns of actively exploited Dassault RCE vulnerability
Bleeping Computer
www.bleepingcomputer.com
2025-09-12 17:19:39
The U.S. Cybersecurity and Infrastructure Security Agency (CISA) is warning of hackers exploiting a critical remote code execution flaw in DELMIA Apriso, a manufacturing operations management (MOM) and execution (MES) solution from French company Dassault Systèmes. [...]...
The U.S. Cybersecurity and Infrastructure Security Agency (CISA) is warning of hackers exploiting a critical remote code execution flaw in DELMIA Apriso, a manufacturing operations management (MOM) and execution (MES) solution from French company Dassault Systèmes.
The agency added the vulnerability, tracked as
CVE-2025-5086
and rated with a critical severity score (CVSS v3: 9.0), to the Known Exploited Vulnerabilities (KEV) catalog.
DELMIA Apriso is used to digitalize and monitor production processes. Enterprises worldwide rely on it to schedule production, manage quality, allocate resources, run warehouses, and integrate production equipment with business applications.
It is typically deployed in the automotive, aerospace, electronics, high-tech, and industrial machinery sectors, where strict quality control, traceability, compliance, and a high level of process standardization are critical.
The flaw is a deserialization of untrusted data vulnerability that may lead to remote code execution (RCE).
The vendor
disclosed the issue
on June 2, noting that it impacts all versions of DELMIA Apriso from Release 2020 through Release 2025, without sharing many details.
On September 3, threat researcher
Johannes Ullrich
published a post on SANS ISC disclosing observation of active exploitation attempts leveraging CVE-2025-5086.
The observed exploit involves sending a malicious SOAP request to vulnerable endpoints that loads and executes a Base64-encoded, GZIP-compressed .NET executable embedded in the XML.
The actual payload is a Windows executable tagged as malicious by
Hybrid Analysis
and flagged only by one engine in
VirusTotal
.
The malicious requests were observed originating from the IP 156.244.33[.]162, likely associated with automated scans.
CISA has not linked to the Ullrich report, so it is unclear if this is the report that prompted them to
add CVE-2025-5086 to KEV
, or if they had a separate source confirming exploitation.
CISA is now giving federal agencies until October 2 to apply available security updates or mitigations, or stop using DELMIA Apriso.
Although the BOD 22-01 guidance is binding only for federal agencies, private organizations around the world should also consider CISA’s warning and take appropriate action.
Windows 11 23H2 Home and Pro reach end of support in 60 days
Bleeping Computer
www.bleepingcomputer.com
2025-09-12 17:15:22
Microsoft has reminded customers today that devices running Home and Pro editions of Windows 11 23H2 will stop receiving updates in November. [...]...
Microsoft has reminded customers today that devices running Home and Pro editions of Windows 11 23H2 will stop receiving updates in November.
Enterprise and Education editions will continue to receive mainstream support for an additional year, until November 10, 2026, as stated on the
Windows release health dashboard
.
"On November 11, 2025, Windows 11, version 23H2 (Home and Pro editions) will reach end of servicing. The November 2025 monthly security update will be the last update available for these editions,"
Microsoft said
in a message center update on Friday.
"After this date, devices running these editions will no longer receive monthly security and preview updates containing protections from the latest security threats."
Windows 11 23H2 users are advised to upgrade their systems to Windows 11 24H2 (also known as the Windows 11 2024 Update), the latest version of Windows 11, which became generally available
to eligible Windows 11 22H2/23H2 devices
in October 2024,
following its rollout
to Windows Insider enterprise customers in May 2024.
In July, Microsoft
also reminded users
that the last supported Windows 11 22H2 editions will reach their end of servicing on October 14.
You can find more details about Windows end-of-service dates on the Windows Lifecycle FAQ page or using the Lifecycle Policy search tool. Microsoft also provides a list of
products that will be retired
or reach the end of support soon.
As AI becomes more integrated into our lives, building it with privacy at its core is a critical frontier for the field.
Differential privacy
(DP) offers a mathematically robust solution by adding calibrated noise to prevent memorization. However, applying DP to LLMs introduces trade-offs that are crucial to understand: adding DP noise alters traditional
scaling laws
— rules describing performance dynamics — by reducing training stability (the model's ability to learn consistently without experiencing catastrophic events like loss spikes or divergence) and significantly increasing batch size (a collection of input prompts sent to the model simultaneously for processing) and computation costs.
Our new research, “
Scaling Laws for Differentially Private Language Models
”, conducted in partnership with Google DeepMind, establishes laws that accurately model these intricacies, providing a complete picture of the compute-privacy-utility trade-offs. Guided by this research, we’re excited to introduce VaultGemma, the largest (1B-parameter) open model trained from scratch with differential privacy. We are releasing the weights on
Hugging Face
and
Kaggle
, alongside a
technical report
, to advance the development of the next generation of private AI.
Understanding the scaling laws
With a carefully thought-out experimental methodology, we aimed to quantify the benefit of increasing model sizes, batch sizes, and iterations in the context of DP training. Our work required making some simplifying assumptions to overcome the exponential number of combinations one might consider trying. We assumed that how well the model learns depends mostly on the "noise-batch ratio” which compares the amount of random noise we add for privacy to the size of the data groups (batches) we use for training. This assumption works because the privacy noise we add is much greater than any natural randomness that comes from sampling the data.
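To make the core assumption concrete, a rough formalisation (the paper may normalise it slightly differently) is:

\[ \text{noise-batch ratio} \;=\; \frac{\sigma}{B} \]

where \(\sigma\) is the standard deviation of the Gaussian noise that DP-SGD adds to the clipped, summed gradients at each step and \(B\) is the expected batch size. Under the assumption above, training loss is then modelled primarily as a function of model size, number of iterations, and this ratio, rather than of \(\sigma\) and \(B\) separately.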
To establish a DP scaling law, we conducted a comprehensive set of experiments to evaluate performance across a variety of model sizes and noise-batch ratios. The resulting empirical data, together with known deterministic relationships between other variables, allows us to answer a variety of interesting scaling-laws–style queries, such as, “For a given compute budget, privacy budget, and data budget, what is the optimal training configuration to achieve the lowest possible training loss?”
Key findings: A powerful synergy
Before diving into the full scaling laws, it’s useful to understand the dynamics and synergies between the compute budget, privacy budget, and data budget from a privacy accounting perspective — i.e., understand how these factors influence the noise-batch ratio for a fixed model size and number of iterations. This analysis is significantly cheaper to do as it does not require any model training, yet it yields a number of useful insights. For instance, increasing the privacy budget in isolation leads to diminishing returns, unless coupled with a corresponding increase in either the compute budget (
FLOPs
) or data budget (tokens).
To explore this synergy further, the visualization below shows how the optimal training configuration changes based on different constraints. As the privacy and compute budgets change, notice how the recommendation shifts between investing in a larger model versus training with larger batch sizes or more iterations.
This data provides a wealth of useful insights for practitioners. All of the insights are reported in the paper, but a key finding is that one should train a much smaller model with a much larger batch size than would be used without DP. That will not surprise a DP expert, given the importance of large batch sizes, but while the insight holds across many settings, the optimal training configurations do change with the privacy and data budgets. Understanding the exact trade-off is crucial to ensure that both the compute and privacy budgets are used judiciously in real training scenarios. The visualizations also reveal that there is often wiggle room in the training configurations — i.e., a range of model sizes might provide very similar utility if paired with the correct number of iterations and/or batch size.
Applying the scaling laws to build VaultGemma
The
Gemma
models are designed with responsibility and safety at their core. This makes them a natural foundation for developing a production-quality, DP-trained model like VaultGemma.
Algorithmic advancements: Training at scale
The scaling laws we derived above represent an important first step towards training a useful Gemma model with DP. We used the scaling laws to determine both how much compute we needed to train a compute-optimal 1B parameter Gemma 2-based model with DP, and how to allocate that compute among batch size, iterations, and sequence length to achieve the best utility.
One prominent gap between the research underlying the scaling laws and the actual training of VaultGemma was our handling of
Poisson sampling
, which is a central component of
DP-SGD
. We initially used a straightforward method of loading data in uniform batches but then switched to Poisson sampling to get the best privacy guarantees with the least amount of noise. This method posed two main challenges: it created batches of different sizes, and it required a specific, randomized order for processing the data. We solved this by using our recent work on
Scalable DP-SGD
, which allows us to process data in fixed-size batches — either by adding extra padding or trimming them — while still maintaining strong privacy protections.
Results
Armed with our new scaling laws and advanced training algorithms, we built VaultGemma, to date the largest (1B-parameter) open model fully pre-trained with differential privacy, using an approach that yields high-utility models.
From training VaultGemma, we found our scaling laws to be highly accurate. The final training loss of VaultGemma was remarkably close to what our equations predicted, validating our research and providing the community with a reliable roadmap for future private model development.
We also compare downstream performance of our model against its non-private counterpart across a range of standard academic benchmarks (i.e., HellaSwag, BoolQ, PIQA, SocialIQA, TriviaQA, ARC-C, ARC-E). To put this performance in perspective and quantify the current resource investment required for privacy, we also include a comparison to an older similar-sized GPT-2 model, which performs similarly on these benchmarks. This comparison illustrates that today’s private training methods produce models with utility comparable to that of non-private models from roughly 5 years ago, highlighting the important gap our work will help the community systematically close.
Finally, the model comes with strong theoretical and empirical privacy protections.
Formal privacy guarantee
In general, both the privacy parameters (ε, δ) and the privacy
unit
are important considerations when doing DP training, as these together determine what the trained model can learn. VaultGemma was trained with a
sequence
-level DP guarantee of (ε ≤ 2.0, δ ≤ 1.1e-10), where a sequence consists of 1024 consecutive tokens extracted from heterogeneous data sources. Specifically, we used the same training mixture that was used to train the
Gemma 2
model, consisting of a number of documents of varying lengths. During pre-processing, long documents are split up and tokenized into multiple sequences, and shorter documents are packed together into a single sequence. While the sequence-level privacy unit was a natural choice for our training mixture, in situations where there is a clear mapping between data and users,
user-level differential privacy
would be a better choice.
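For readers who want the formal statement behind those numbers: a randomized training mechanism \(\mathcal{M}\) satisfies sequence-level \((\varepsilon, \delta)\)-differential privacy if, for any two training datasets \(D\) and \(D'\) differing in a single sequence and any set \(S\) of possible output models,

\[ \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta . \]

This is the standard definition; VaultGemma's reported guarantee plugs in \(\varepsilon \le 2.0\) and \(\delta \le 1.1\times 10^{-10}\), and the precise neighbouring relation (add/remove versus replace one sequence) is a detail left to the technical report.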
What does this mean in practice? Informally speaking, because we provide protection at the sequence level, if information relating to any (potentially private) fact or inference occurs in a single sequence, then VaultGemma essentially does not know that fact: the response to any query will be statistically similar to the result from a model that never trained on the sequence in question. However, if many training sequences contain information relevant to a particular fact, then in general VaultGemma will be able to provide that information.
Empirical memorization
Sequence-level DP provably bounds the influence of any single training sequence (example) on the final model. We prompted the model with a 50-token prefix from a training document to see if it would generate the corresponding 50-token suffix. VaultGemma 1B shows no detectable memorization of its training data and successfully demonstrates the efficacy of DP training.
Conclusion
VaultGemma represents a significant step forward in the journey toward building AI that is both powerful and private by design. By developing and applying a new, robust understanding of the scaling laws for DP, we have successfully trained and released the largest open, DP-trained language model to date.
While a utility gap still exists between DP-trained and non-DP–trained models, we believe this gap can be systematically narrowed with more research on mechanism design for DP training. We hope that VaultGemma and our accompanying research will empower the community to build the next generation of safe, responsible, and private AI for everyone.
Acknowledgements
We'd like to thank the entire Gemma and Google Privacy teams for their contributions and support throughout this project, in particular, Peter Kairouz, Brendan McMahan and Dan Ramage for feedback on the blog post, Mark Simborg and Kimberly Schwede for help with visualizations, and the teams at Google that helped with algorithm design, infrastructure implementation, and production maintenance. The following people directly contributed to the work presented here (ordered alphabetically): Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Lynn Chua, Prem Eruvbetine, Badih Ghazi, Steve He, Yangsibo Huang, Armand Joulin, George Kaissis, Pritish Kamath, Ravi Kumar, Daogao Liu, Ruibo Liu, Pasin Manurangsi, Thomas Mesnard, Andreas Terzis, Tris Warkentin, Da Yu, and Chiyuan Zhang.
Corporations are trying, and now failing, to hide job openings from US citizens
We’re thrilled to announce that Ankit Gupta is joining YC as our newest General Partner.
Ankit has already worked with dozens of YC founders as a visiting partner in recent batches. He has a rare blend of deep machine learning expertise and firsthand startup experience that make him an ideal mentor for founders building at the bleeding edge of AI. Though he'll frequently commute to SF, Ankit will primarily be based out of Cambridge, Massachusetts — re-establishing a YC foothold at the site of our original office!
Ankit’s YC journey began in the Winter 2018 batch, when he co-founded Reverie Labs, a biotech company applying machine learning to drug discovery. The team at Reverie developed machine learning models for small-molecule design, partnered with pharmaceutical companies, and advanced their own medicines before eventually being acquired by Ginkgo Bioworks in 2024.
Before founding Reverie, Ankit earned a B.A. and M.S. in Computer Science at Harvard and published research at ICML and other conferences on deep learning and large-scale model training. It’s that blend of research and founder experience that makes Ankit uniquely equipped to help the next generation of AI startups succeed.
We are super excited to have him and are certain his expertise will continue to support YC founders at every stage. Welcome, Ankit!
Every day, thousands of researchers race to solve the
AI alignment problem
. But they struggle to coordinate on the basics, like whether a misaligned superintelligence will seek to destroy humanity, or just enslave and torture us forever. Who, then, aligns the aligners?
We do.
We are the world's first AI alignment alignment center, working to subsume the countless other AI centers, institutes, labs, initiatives and forums into one final AI center singularity.
completely un-
affiliated with these AI alignment organizations:
but our design agency said their logos would look good on our site
Until our next prediction of when AGI is coming
Updates
250,000 AI agents and 3 humans read our newsletter
Join them:
Reports
About CAAAC
At the forefront of AI thought thought leadership leadership
We think constantly about AI so that politicians and journalists don't have to.
Fiercely independent, we are backed by philanthropic funding from some of the world's biggest AI companies who also form a majority on our board.
This allows us to deliver policy solutions and legislation that can be implemented rapidly by lawmakers without the delay of democratic scrutiny, unless Trump has told them to stop regulating AI in which case our work is totally pointless.
New
Start your own AI center in under 60 seconds
We're excited to open source
CenterGen-4o
, the powerful tool behind the creation of CAAAC.
CenterGen uses generative AI to set you up as the Executive Director of a brand new AI center in less than a minute, zero AI knowledge required.
Choose an organization type:
Every second you don't subscribe, another potential future human life is lost
Stop being a mass murderer:
[$] Creating a healthy kernel subsystem community
Linux Weekly News
lwn.net
2025-09-12 16:50:51
Creating welcoming communities within open-source projects is a recurring
topic at conferences; those projects rely on contributions from others, so
making them welcome is important. The kernel has, rather infamously
over the years, been an oft-cited example of an unwelcoming project, though
there h...
This post isn’t about the virtues of some editors versus others:
that's already been written by somebody else
(and it’s really good) – if you want to know
why
I use emacs, I suggest reading that instead.
This
post will help you understand why "extensibility" and "introspectability" are such prominent emacs features even without an emacs lisp background.
Bridging the gap from
spacemacs
or
doom emacs
to a bespoke configuration wasn't easy for me because I didn’t know
how
to learn emacs, so I'm going to stumble through one of my own use cases to demonstrate how this process goes if you're peeking in from outside the emacs ecosystem,
horrified (okay, curious) about how this all works.
At my day job I write
our user documentation
using
Sphinx
.
It expects my stilted prose in
.rst
format, which is kind of like Markdown if you squint.
I do an awful lot of cross-referencing between
references
(or
refs
) to link concepts across the documentation.
You define a reference like this:
ReST
.. _code_example:

.. code::

   echo "HELP I'M TRAPPED IN A CODE EXAMPLE"
…and then link to it later like this:
ReST
This :ref:`doesn't look like anything to me <code_example>`.
…or like this (if the
ref
is associated with a title of some sort):
ReST
Don't say :ref:`code_example`.
My problem is that I have an
assload
of references across all of the documentation and my brain cannot recall them on the spot.
What I really need is the ability to call up the list of references to easily discover and select from that list – this is basically auto-completion but for documentation headers (or titles).
Before we dig into emacs' guts, here are some principles that I learned after my first elisp experiments that might help somebody digging into this ecosystem for the first time:
1. Emacs Wants You to Extend It
I haven't written plugins for other editors extensively, but I
can
tell you this: emacs doesn't just make deep customization available, but it actively
encourages
you to make an absolute customization mess, er, masterpiece.
Core editor functions aren't just documented, but often include tidbits about "you probably want to see this other variable" or "here's how you should use this".
Not only that, but emacs happily hands you functions shaped like nuclear warheads like
advice-add
(that let you override any function) that can absolutely obliterate your editor if you hold it the wrong way.
Of course, this also grants you
unlimited power
.
Remember that emacs is designed to be torn apart and rearranged.
2. Geriatric Software
The first public release of GNU emacs happened in 1985.
Literally
40 years
of development sits inside of emacs and its developers are
still
adding non-trivial features (native language server support landed in version 29 in 2023).
The ecosystem is vast and the language has evolved for a long time.
There's nearly always something useful if you need a particular piece of functionality, so even more so than with other ecosystems: remember to do your homework first.
3. Lisp for the un-Lisped
The syntax is polarizing, I know.
Gurus will wince when I get this wrong, but:
Writing lisp is like writing any other code, just with the parentheses wrapping
everything
instead of just
arguments
.
print("Macrodata Refinement") becomes (print "Macrodata Refinement")
Sometimes you don't get functions, you get macros that behave in special ways.
For example,
let
sets variables for an inner block of code.
Like this:
(let ((name "Mark S."))
  (print name))
Lispers say "this is actually data and not calling code" by doing this with single quotes:
'("list" "of" "strings")
I'm out of my depth in lisp, but if you're a novice, those notes might help.
Extensible MACroS
With that prelude out of the way, let's begin.
Inside of emacs you can call up a list of potential
completions
by using the keyboard shortcut
M-.
(that’s "hit the meta key along with period", where "meta" is the
Alt
key for me).
This applies in a wide variety of scenarios, like when completing class names or variables.
If we want to ask emacs to hand us a list of potential references, then the system we want to hook into is this
completions
system.
(This is the only time I'll assume we know where to go without crawling through documentation. You could discover it yourself looking for "
completion
" or similar string in emacs docs).
To start our hero’s journey, we figure out what the hell
M-.
actually
does
.
We can ask emacs this by calling the function
describe-key
, which is bound to
C-h k
.
Hitting
Ctrl-h
, then
k
, then
M-.
drops us into a help buffer that looks like this:
M-. runs the command completion-at-point (found in evil-insert-state-map), which is
an interactive native-compiled Lisp function in ‘minibuffer.el’.

It is bound to M-..

(completion-at-point)

Perform completion on the text around point.
The completion method is determined by ‘completion-at-point-functions’.

Probably introduced at or before Emacs version 23.2.
We have the next breadcrumb to follow, which is the variable
completion-at-point-functions
.
Running
completion-at-point
by hitting
M-.
consults that variable to hand us completion candidates, so we
describe-variable
it with
C-h v
and then choose
completion-at-point-functions
from the list of variables:
completion-at-point-functions is a variable defined in ‘minibuffer.el’.

Its value is (cape-dict cape-file tags-completion-at-point-function)

Special hook to find the completion table for the entity at point.
Each function on this hook is called in turn without any argument and
should return either nil, meaning it is not applicable at point,
or a function of no arguments to perform completion (discouraged),
or a list of the form (START END COLLECTION . PROPS)
…and it goes on from there.
You can see some existing completion functions in there: I use a package called
cape
to offer helpful suggestions like file paths if I start typing in something like
./filename
.
The description for this variable instructs us about how to add our own functions (scary!)
You’ll note that emacs calls this a "hook", which is most often just a term used to describe a variable that is a list of functions that get called at a specific time (hooks show up everywhere).
I elided the full description for
completion-at-point-functions
– which is lengthy! – but if you parse it all out, you learn the following:
Your completion at point function should return either
nil
(the elisp "null") – which means your completion function doesn’t apply right now – or another function (which emacs discourages), or a list, which is what we’ll do because it sounds like the most-correct thing to do.
The list we return is (START END COLLECTION . PROPS):
START
and
END
should be positions in the buffer between which emacs will replace the completed symbol with our candidate.
That is, if your cursor is calling a method on a Python object like
file.ope|
(where the bar is your cursor), emacs will replace just
ope
when you select
open
from a list of completions and not the entire
file.ope
string.
COLLECTION
is the juicy bit. The documentation calls it a completion "table", and there’s probably hidden meaning there, but you can just return a list of candidates and move on with your day, which is what I'll do.
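As a minimal sketch of that contract before we build the real thing (the candidate list here is hard-coded purely for illustration and is not part of this post's eventual backend):

;; Toy completion-at-point function: offer three fixed candidates for
;; the word-ish text before point. Returning nil when nothing applies
;; lets the other entries in the hook take their turn.
(defun my/demo-capf ()
  (let ((start (save-excursion (skip-syntax-backward "w_") (point)))
        (end (point)))
    (when (< start end)
      (list start                         ; START
            end                           ; END
            '("alpha" "beta" "gamma"))))) ; COLLECTION

Adding a function shaped like this to completion-at-point-functions is all it takes for those candidates to show up when completion is triggered.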
Okay, so we need to write something to find the bounds of a string to replace and a function that returns that list.
Completions Abound
I fooled around with some regular expressions for a while until I did the
right
thing and examined how other completion backends do it.
If you have the package installed, the aforementioned
cape-file
function gives us a hint: hit
M-x
, then choose
find-function
, select
cape-file
, and poke around. You’ll find the use of a function called
bounds-of-thing-at-point
.
Describing it with
C-h f
bounds-of-thing-at-point
gives us:
Determine the start and end buffer locations for the THING at point.

THING should be a symbol specifying a type of syntactic entity.
Possibilities include ‘symbol’, ‘list’, ‘sexp’, ‘defun’, ‘number’,
‘filename’, ‘url’, ‘email’, ‘uuid’, ‘word’, ‘sentence’, ‘whitespace’,
‘line’, and ‘page’.
And
that
is useful for our
START
and
END
needs.
You can take it for a test drive at any time with
M-:
(bounds-of-thing-at-point 'word)
to see where emacs thinks the
word
at your cursor starts and ends.
This is a common theme when developing elisp: try out functions all the time within the editor since they’re near at hand.
The argument to
bounds-of-thing-at-point
is a symbol for a literal
thing
that is predefined by the function
define-thing-chars
.
We pass
define-thing-chars
a name for our "thing" and a regex, and we can call
bounds-of-thing-at-point
with it from that point on.
The function documentation in
thingatpt.el
that emacs refers you to explains more if you’re interested.
define-thing-chars
expects a string with characters to put into a regex character class (like
[...]
) - just any valid character.
This is a pretty standard character class and we can start with something super simple.
I can’t be bothered to look up whatever the reStructuredText spec is for references, but let’s start with "word characters, dashes, and underscores".
That expressed as a "thing" looks like this:
ELisp
(define-thing-chars rst-ref "[:alpha:]_-")
Now we have a
thing
called
rst-ref
we can use with
bounds-of-thing-at-point
.
In typical emacs fashion, we can run elisp ad-hoc in our editor just to tinker, so let’s do that now.
Remember: we’re trying to write a function to give us the
start
and
end
of whatever piece of text we intend for a completion to replace.
Let’s try it out: in any sort of buffer, put a piece of fake
.rst
text with a reference, like this:
ReST
This is a :ref:`other-reference`.
Place your point somewhere within "
other-reference
" and try out your
thing
:
M-:
(bounds-of-thing-at-point 'rst-ref)
You’ll see something like
(number . number)
in the echo area (the little minibuffer at the bottom of the emacs window frame).
Congratulations!
We’ve got the first part of the problem solved.
Gathering Completions
Recall the structure of what our "completion backend" needs to return to emacs:
ELisp
(START END COLLECTION . PROPS)
We can construct
START
and
END
with
bounds-of-thing-at-point
, now we just need
COLLECTION
, which is a list of potential candidates.
Conceptually the task isn’t hard: we should find all instances of strings of the form:
ReST
.. _my-reference:
in our document and capture
my-reference
.
Where do we start?
Once again you can rely on discovery mechanisms like searching for functions that
sound
related (by browsing
describe-function
) or look at existing code.
Personally, I found this:
(re-search-forward REGEXP &optional BOUND NOERROR COUNT)

Search forward from point for regular expression REGEXP.
The documentation refers you to some other related functions, like this one:
(match-beginning SUBEXP)

Return position of start of text matched by last search.
SUBEXP, a number, specifies which parenthesized expression in the last regexp.
So we can
(re-search-forward) for something then invoke (match-beginning 1), for example, if we used a regex capture group to grab the reference’s label.
Cool: we can start there.
As you get deeper into elisp you’ll find that regular expressions are
everywhere
, and this case is no different.
We need a solid regex to search through a reStructuredText buffer (and honor any quirks in emacs’ regular expression engine), so we’ll use this opportunity to kick the tires on
interactively
developing regular expressions in emacs.
Regexes
Geriatric millennial software engineers like myself grew up on
https://regexr.com/
when it was still a Flash application.
Unless you’re a masochist that lives and breathes regular expressions, it’s kind of hard to develop a good regex without live feedback, which sites like
https://regexr.com/
provide.
Little did I know that emacs comes with its own live regular expression builder and it's goooood.
Within any emacs buffer, run
M-x
re-builder
to open the regex builder window split alongside the current buffer.
If I then enter the string
"re-
\\
(
builder
\\
)
"
into that buffer, that string a) gets highlighted in my original buffer and b) the capture group gets highlighted in its own unique group color.
You can do this all day long to fine-tune a regular expression, but there’s yet
another
trick when writing regular expressions, which is to use the
rx
macro.
My previous example regular expression
"re-
\\
(
builder
\\
)
"
works, but the quirks when writing emacs regular expressions pile up quickly: escaping characters is one example but there are more, too.
Instead, the
rx
macro will let you define a regular expression in lisp-y form and evaluate it into a typical string-based regular expression you can use normally, so it works any place emacs expects a string-based regular expression.
For example, if you evaluate this with
M-:
:
ELisp
(rx "re-" (group "builder"))
This is what emacs returns:
ELisp
"re-\\(builder\\)"
Identical!
The
rx
documentation
explains all the constructs available to you.
Jumping back to
re-builder
, with the
re-builder
window active, invoke
M-x
reb-change-syntax
and choose
rx
.
Now you can interactively build regular expressions with the
rx
macro!
In the
re-builder
window, you’ve got to enter a weird syntax to get it to take
rx
constructs (I’m… not sure why this is), but you end up with the same outcome:
ELisp
'(: "re-" (group "builder"))
Watch the regex get highlighted live just as it was in the string-based regex mode.
To bring this full circle, hop into a buffer with an example
.rst
document like this one:
ReST
A Heading
=========

.. _my-reference:

Link to me!
Using our newfound
re-builder
knowledge, let’s build a regex interactively to make short work of it:
Invoke
M-x
re-builder
Change the engine to something easier with
M-x
reb-change-syntax
and choose
rx
Start trying out solutions
I’ll refer here to the
rx constructs documentation
which lists out all the possibilities that you can plug into the
rx
macro.
Here’s a recorded example of what developing it looks like from start to finish, ending up with a functional
rx
construct:
Live-highlighting regex development.
Nice.
If you add more groups, more colors show up.
In this example the
rx constructs
I’m using are:
Any strings end up as literal matches
Special symbols
bol
and
eol
for "beginning of line" and "end of line", respectively
Symbols like
+
behave like their regex counterparts ("at least one")
Some symbols like
not
are nice little shortcuts (in this case, to negate the next form)
Because
rx
is a macro, we don’t ever actually
need
to compile its regular expressions to use elsewhere - we can always just use
rx
when we need a regex.
Gathering Completions: Continued
Okay, we've cut our teeth on emacs regular expressions.
Let's use 'em.
(Not our teeth. Regexes.)
To start, let's save our reStructuredText regular expression to find a
ref
so we can easily grab it later.
I'll save the one I came up with to the name
tmp/re
(this name is arbitrary, I drop temporary variables into
tmp/<name>
out of habit)
ELisp
(setq tmp/re (rx bol ".." (+ blank) "_" (group (+ (not ":"))) ":" eol))
Now we can reference it easily.
I mentioned before that
re-search-forward
accepts a regex, so let's hop into a reStructuredText buffer and rev up the regex.
Here's my sample text that I'll work with:
ReST
A Title
=======

Beware the Jabberwock, my son.

.. _my-reference:

You are like a little baby. Watch this.

.. _code-sample:

.. code:: python

   print("emacs needs telemetry")

The end?
The
re-search-forward
documentation indicates that it starts at the
point
's current position, so head to the start of the buffer, hit
M-:
to enter the elisp
Eval
prompt, and try:
ELisp
(re-search-forward tmp/re)
This is anticlimactic because you'll just see the point move to the end of one of the references.
BUT.
This means that the search succeeded.
So… what now?
More reading in the
re-search-forward
documentation will educate you about emacs
global match data
.
In non-functional-programming style, functions like
match-beginning
and
match-end
serve to interrogate a global state that functions like
re-search-forward
will modify.
In concise terms, our regular expression defines one match group and we can grab it with
(match-string-no-properties 1)
to get the first group match (
match-string
will return a string with "properties", which is a bunch of data like font styling that we don't want).
Within our example buffer, executing this after the regex search should return our match:
ELisp
(match-string-no-properties 1)
I see
"my-reference"
from this command.
Now we're cooking like it's 1985, baby.
You can enter the minibuffer again with
M-:
, press
↑
to find the
re-search-forward
command again, and repeat this process again to watch the point move to the next match, after which you can see the matched string with
match-string-no-properties
.
Note that running this a few times will eventually error out after no matches exist past your point.
We'll address this.
If you're a human (or Claude) at this point, you can see the path ahead – we need to write some elisp that will:
Move the point to the beginning of the buffer (important, remember that
re-search-forward
relies upon the current position of your point)
Iteratively execute an
re-search-forward
command to aggregate reference targets
Conclude when there aren't any more matches
I'll start with the code and then explain which demons the parentheses are summoning afterward:
ELisp
;; This function will save the current position of the cursor and then
;; return it to this position once the code that it wraps has finished
;; executing, which lets us hop around the buffer without driving the
;; programmer insane. Important for any functions that move the point
;; around.
(save-excursion
  ;; progn is a simple function that just executes each lisp form
  ;; step-by-step.
  (progn
    ;; Step one: go to the beginning of the buffer.
    (goto-char (point-min))
    ;; Step two: loop
    ;;
    ;; cl-loop is a macro with a long and venerable heritage stemming
    ;; from the common lisp family of macros, which it mimics the
    ;; behavior of. You could spend hours honing your ability to wield
    ;; the common lisp `loop` macro, but we'll just explain the parts
    ;; we're using:
    ;;
    ;; `while` runs the loop until its argument evaluates to a falsy
    ;; value. We can overload our use of `re-search-forward` here: we
    ;; can use it to step our loop forward each time and also rely
    ;; upon it returning `nil` once it stops matching substrings in
    ;; the buffer and we should finish up.
    (cl-loop while (re-search-forward
                    (rx bol ".." (+ blank) "_" (group (+ (not ":"))) ":" eol)
                    ;; The aforementioned `while` termination case
                    ;; relies upon this `t` parameter, which says
                    ;; "don't error out with no matches, just return
                    ;; nil". Once no more matches are found, the loop
                    ;; exits.
                    nil t)
             ;; The `collect` keyword instructs `cl-loop` how to form
             ;; its return value. We can helpfully summarize the regex
             ;; match item by pulling out the global match data.
             collect (match-string-no-properties 1))))
Without belaboring the point, you can – like I did – discover most of these functions by skimming existing elisp code and using it as a launch pad.
Many of these functions are bog standard and show up all over the place in emacs packages (
save-excursion
,
progn
,
goto-char
…)
Here's the result when I run this code against our example
.rst
file:
ELisp
("my-reference" "code-sample")
Looks good!
Completing the Completion Backend
We're now armed with the ability to:
Identify the bounds of the string we want to replace, and
Collect a list of targets for completion candidates
We are
so close
.
Recall the description of the variable we need to modify:
completion-at-point-functions is a variable defined in ‘minibuffer.el’.

Its value is (cape-dict cape-file tags-completion-at-point-function)

Special hook to find the completion table for the entity at point.
Each function on this hook is called in turn without any argument and
should return either nil, meaning it is not applicable at point,
or a function of no arguments to perform completion (discouraged),
or a list of the form (START END COLLECTION . PROPS)
To return the list that
completion-at-point-functions
expects, we already have the ability to identify the bounds of a
thing
and sweep up a list of candidates in our buffer.
Note the comment about returning
nil
: we probably don't
always
want to run our backend, so we should short-circuit our function to eagerly return nil to avoid tying up emacs with a regex loop we don't need.
;; Our reStructuredText reference "thing"
(define-thing-chars rst-ref "[:alpha:]_-")

(defun my/rst-internal-reference-capf ()
  "Completion backend for buffer reStructuredText references"
  ;; Only applies when we're within a reference - outside of a
  ;; reference, we bail out with nil.
  (when (looking-back (rx ":ref:`" (* (not "`"))) (point-at-bol))
    ;; Get potential bounds for the string to replace
    (let* ((bounds (or (bounds-of-thing-at-point 'rst-ref)
                       ;; Fallback to the current position
                       (cons (point) (point))))
           (start (car bounds))
           (end (cdr bounds))
           ;; Collect all reference candidates
           (candidates
            ;; Our previously-noted reference collector
            (save-excursion
              (progn
                (goto-char (point-min))
                (cl-loop while (re-search-forward
                                (rx bol ".." (+ blank) "_" (group (+ (not ":"))) ":" eol)
                                nil t)
                         collect (match-string-no-properties 1))))))
      ;; Return value suitable for `completion-at-point-functions`
      (list start end candidates))))
We're following some naming conventions by calling this a "
capf
" (a "completion-at-point function) and prefixing with
my/
(a habit to namespace your own functions)
Our short-circuit takes the form of using
looking-back
to ask, "are we inside of a reStructuredText reference"?
Note the use of
rx
here again to clean up our lisp.
We use our
rst-ref
thing
to easily snag the
start
and
end
of the string to replace – note our fallback to
just
the immediate point if we can't find the bounds of our
thing
.
We wrap it all up with
list
.
Personally, even as somebody relatively new to writing Lisps, I find the code pleasant to read and self-evident.
We did a lot in 17 lines of code!
Inside of our test
.rst
buffer, we can test drive this function.
First, invoke
M-x
eval-defun
with your cursor somewhere in the function to evaluate it, which makes
my/rst-internal-reference-capf
available.
Then run:
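(The exact command didn't survive in this copy of the post, but evaluating something like the following with M-: registers the backend for the current buffer; the trailing t is what makes the change buffer-local.)

;; Hedged reconstruction, not the author's verbatim line: push our capf
;; onto the buffer-local completion-at-point-functions hook.
(add-hook 'completion-at-point-functions #'my/rst-internal-reference-capf nil t)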
Huzzah!
Our function is now live in emacs' completion framework.
You can trigger the completion by calling
completion-at-point
at a relevant spot in a buffer.
Many batteries-included emacs distributions like spacemacs or doom emacs slap nice-looking porcelain on top of the completion framework; here's an example that uses the
corfu
package:
Congratulations, you've extended emacs for the first time!
Dressing Up the Bones
Okay, this is a pretty basic setup.
You could improve it in
many
ways, but here are a few ideas about potential directions:
Mode Hooks
Manually adding your custom completion function to the
completion-at-point-functions
hook is tedious, but there's a way to automate it.
Recall that in emacs parlance, a "hook" is usually a
variable
that holds a
list of functions
that get called at a
specific time
.
If you use
rst-mode
, then opening an
.rst
file will drop you into
rst-mode
and implicitly call the
rst-mode-hook
functions.
That means that this line is sufficient to integrate our completion function:
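(The line itself is missing from this copy; judging from the explanation that follows, it was presumably something close to this sketch.)

;; Reconstructed sketch, not necessarily the author's exact code: when
;; rst-mode starts, add our capf to completion-at-point-functions for
;; that buffer only (the trailing t makes the inner add-hook buffer-local).
(add-hook 'rst-mode-hook
          (lambda ()
            (add-hook 'completion-at-point-functions
                      #'my/rst-internal-reference-capf nil t)))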
This says: "when I open an
.rst
file, run this lambda that modifies
completion-at-point-functions
only
for this buffer by adding my internal reference completion function".
It's a little nested which makes it less obvious with the two
add-hook
calls.
Other Files
Okay, our example works for references in the
same buffer
but this is sort of pointless for uses
across
files.
You can solve this too, although my post is already too long so we won't solve this step-by-step.
However, here's how
I
solved it:
Turn my
capf
into a minor mode that manages the completion variables
Doesn't search the buffer every time but instead does so once and then rebuilds it with a hook in
after-change-functions
, saving it to a hash cache
Walk all
.rst
files in the current project and run the reference collection function for each, storing the results into a hash cache for all files that don't have live buffers
When it comes time to call the completion function, combine the hash for completions for files without buffers along with each
.rst
buffer's cached list of references
It sounds complicated, but it works!
Functions like
with-temp-buffer
make this pretty easy by aggregating reference targets for files using the exact same function we do for live buffers.
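As a rough illustration of that last point (a sketch under my own assumptions, not the author's actual code), a helper that runs the same regex walk over a file with no live buffer might look like this:

;; Hypothetical helper: collect reference targets from FILE without
;; visiting it, reusing the same rx form and cl-loop as the in-buffer
;; collector shown earlier.
(require 'cl-lib)

(defun my/rst-references-in-file (file)
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (cl-loop while (re-search-forward
                    (rx bol ".." (+ blank) "_" (group (+ (not ":"))) ":" eol)
                    nil t)
             collect (match-string-no-properties 1))))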
Emacs' long history includes
company-mode
, which is a third-party completion framework that integrates with the
completion-at-point
set of functions.
Some
company-mode
features include additional metadata about completion candidates, and I found two that were useful:
company-kind
and
company-doc-buffer
.
company-kind
is a simple key that just tells the completion caller what the completion candidate
is
.
In our case we can add some eye candy by indicating it's
'text.
company-doc-buffer
lets us add additional context to a completion candidate.
I leveraged this to include a couple of lines following the reference line to help me figure out what exactly the link refers to.
It's easier to show what this looks like rather than tell:
Notes:
I'm using GUI emacs here for the nicer completion popup with
corfu
which displays a transparent, floating frame
My completion candidate "context" is a real excerpt from the text around the reference, complete with styling, etc.
The small icon to the left of each candidate comes from the
company-kind
attribute.
Completion candidate context is an extra frill but very helpful.
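For reference, those two properties travel in the PROPS tail of the (START END COLLECTION . PROPS) return value. A sketch of what that could look like (not the post's exact code; my/rst-reference-context-buffer is a made-up helper that would render the surrounding lines into a buffer):

;; Sketch: the earlier return value, extended with extra properties
;; that company-capf and corfu understand.
(list start end candidates
      ;; Tell the frontend each candidate is plain text (drives the icon).
      :company-kind (lambda (_candidate) 'text)
      ;; Hand back a buffer of context for the highlighted candidate;
      ;; the helper named here is assumed, not defined in this post.
      :company-doc-buffer #'my/rst-reference-context-buffer)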
Summary
My experience extending a core emacs function was an instructive and interesting exercise.
I don't know what the future of emacs looks like in an increasingly LLM-crazed world, but I hope that future includes an open and powerful way to extend and customize the tools we use to write software.
Show HN: An MCP Gateway to block the lethal trifecta
Connect AI to your data/software securely without risk of data exfiltration. Gain visibility, block threats, and get alerts on the data your agent is reading/writing.
OpenEdison solves the
lethal trifecta problem
, which can cause agent hijacking & data exfiltration by malicious actors.
Join our Discord
for feedback, feature requests, and to discuss MCP security for your use case:
discord.gg/tXjATaKgTV
📧 To get visibility, control, and exfiltration blocking for AI's interactions with your company software, systems of record, and DBs,
Contact us
to discuss.
Features ✨
🛑
Data leak blocker
- Edison automatically blocks any data leaks, even if your AI gets jailbroken
🕰️
Deterministic execution
- Guaranteed data exfiltration blocking through deterministic execution
🗂️
Easily configurable
- Easy to configure and manage your MCP servers
📊
Visibility into agent interactions
- Track and monitor your agents and their interactions with connected software/data via MCP calls
🔗
Simple API
- REST API for managing MCP servers and proxying requests
🐳
Docker support
- Run in a container for easy deployment
About Edison.watch 🏢
Edison helps you gain observability, control, and policy enforcement for all AI interactions with systems of record, existing company software, and data. Prevent AI from causing data leakage, with lightning-fast setup for cross-system governance.
Quick Start 🚀
The fastest way to get started:
# Installs uv (via Astral installer) and launches open-edison with uvx.
# Note: This does NOT install Node/npx. Install Node if you plan to use npx-based tools like mcp-remote.
curl -fsSL https://raw.githubusercontent.com/Edison-Watch/open-edison/main/curl_pipe_bash.sh | bash
Run locally with uvx:
uvx open-edison
That will run the setup wizard if necessary.
⬇️ Install Node.js/npm (optional for MCP tools)
If you need
npx
(for Node-based MCP tools like
mcp-remote
), install Node.js as well:
uv:
curl -fsSL https://astral.sh/uv/install.sh | sh
Node/npx:
brew install node
Open Edison includes a comprehensive security monitoring system that tracks the "lethal trifecta" of AI agent risks, as described in
Simon Willison's blog post
:
Private data access
- Access to sensitive local files/data
Untrusted content exposure
- Exposure to external/web content
External communication
- Ability to write/send data externally
The configuration allows you to classify these risks across
tools
,
resources
, and
prompts
using separate configuration files.
In addition to the trifecta, we track an Access Control Level (ACL) for each tool call: each tool has an ACL level (one of PUBLIC, PRIVATE, or SECRET), and we track the highest ACL level reached in each session.
If a write operation is attempted at a lower ACL level, it is blocked.
🧰 Tool Permissions (
tool_permissions.json
)
Defines security classifications for MCP tools. See full file:
tool_permissions.json
. Wildcard patterns take these forms:
Tools
:
server_name/*
(e.g.,
filesystem/*
matches all filesystem tools)
Resources
:
scheme:*
(e.g.,
file:*
matches all file resources)
Prompts
:
type:*
(e.g.,
template:*
matches all template prompts)
Security Monitoring 🕵️
All items must be explicitly configured
- unknown tools/resources/prompts will be rejected for security.
Use the
get_security_status
tool to monitor your session's current risk level and see which capabilities have been accessed. When the lethal trifecta is achieved (all three risk flags set), further potentially dangerous operations are blocked.
Comcast Executives Warn Workers To Not Say The Wrong Thing About Charlie Kirk
404 Media
www.404media.co
2025-09-12 16:18:03
An email sent to NBCUniversal employees, including journalists at NBC, MSNBC, CNBC, Bravo and more, eulogizes Charlie Kirk as an "advocate for open debate" and reminds staff that even milquetoast statements about Kirk's death can result in their firing....
A company-wide email from Comcast executives, sent to everyone working at NBCUniversal on Friday morning, mourns right-wing pundit Charlie Kirk’s death and reminds employees that saying the wrong thing about Kirk’s legacy can get you fired swiftly.
The email, obtained by 404 Media and first
reported by Variety
, has the subject line “A message from Brian Roberts, Mike Cavanagh, and Mark Lazarus.” In it, the executives eulogize Kirk, calling him an “advocate for open debate, whose faith was important to him.”
Roberts is the Chairman and CEO of Comcast Corporation, Cavanagh is the president, and Lazarus is the prospective CEO of VERSANT,
Comcast’s new spinoff
that will include the majority of its NBCUniversal cable network portfolio. NBC, MSNBC, CNBC, Bravo and more journalistic and entertainment properties are under the NBCUniversal umbrella.
“You may have seen that MSNBC recently ended its association with a contributor who made an unacceptable and insensitive comment about this horrific event,” the executives wrote. “That coverage was at odds with fostering civil dialogue and being willing to listen to the points of view of those who have differing opinions.”
💡
Do you have information about how your company is speaking to employees about Charlie Kirk's death, or political speech in general? I would love to hear from you. Using a non-work device, you can message me securely on Signal at sam.404. Otherwise, send me an email at sam@404media.co.
Political analyst Matthew Dowd was fired from MSNBC on Wednesday after speaking about Kirk’s death on air. During a broadcast on Wednesday following the shooting, anchor Katy Tur asked Dowd about “the environment in which a shooting like this happens,”
according to Variety
. Dowd answered: “He’s been one of the most divisive, especially divisive younger figures in this, who is constantly sort of pushing this sort of hate speech or sort of aimed at certain groups. And I always go back to, hateful thoughts lead to hateful words, which then lead to hateful actions. And I think that is the environment we are in. You can’t stop with these sort of awful thoughts you have and then saying these awful words and not expect awful actions to take place. And that’s the unfortunate environment we are in.”
MSNBC president Rebecca Kutler
issued an apology
in response, calling Dowd’s words “inappropriate, insensitive and unacceptable.” Dowd also apologized publicly,
posting on Bluesky
: “On an earlier appearance on MSNBC I was asked a question on the environment we are in. I apologize for my tone and words. Let me be clear, I in no way intended for my comments to blame Kirk for this horrendous attack.”
MSNBC is a division of NBCUniversal. The letter from Comcast executives reiterates to current employees that their jobs are on the line if they stray from bland, milquetoast statements about a man who spent his life fomenting hate. The entire mainstream media environment has been working overtime to sanitize Kirk’s legacy since his murder—
a legacy that includes
targeted harassment of professors at schools across the country and normalizing the notion that basic human rights are up for “debate.”
The full email is below.
Dear Comcast NBCUniversal Team,
The tragic loss of Charlie Kirk, a 31-year-old father, husband, and advocate for open debate, whose faith was important to him, reminds us of the fragility of life and the urgent need for unity in our nation. Our hearts are heavy, as his passing leaves a grieving family and a country grappling with division. There is no place for violence or hate in our society.
You may have seen that MSNBC recently ended its association with a contributor who made an unacceptable and insensitive comment about this horrific event. That coverage was at odds with fostering civil dialogue and being willing to listen to the points of view of those who have differing opinions. We should be able to disagree, robustly and passionately, but, ultimately, with respect. We need to do better.
Charlie Kirk believed that "when people stop talking, really bad stuff starts." Regardless of whether you agreed with his political views, his words and actions underscore the urgency to maintain a respectful exchange of ideas, a principle we must champion. We believe in the power of communication to bring us together. Today, that belief feels more vital than ever. Something essential has fractured in our public discourse, and as a company that values the power of information, we have a responsibility to help mend it.
As employees, we ask you to embody our values in your work and communities. We should engage with respect, listen, and treat people with kindness.
About the author
Sam Cole is writing from the far reaches of the internet, about sexuality, the adult industry, online culture, and AI. She's the author of How Sex Changed the Internet and the Internet Changed Sex.
Hi @susam, I primarily know you as a Lisper, what other things do you use?
Yes, I use Lisp extensively for my personal projects, and much of what
I do in my leisure is built on it. I ran a mathematics pastebin for
close to thirteen years. It was quite popular on some IRC channels.
The pastebin was written in Common Lisp. My personal website and blog
are generated using a tiny static site generator written in Common
Lisp. Over the years I have built several other personal tools in it
as well.
I am an active Emacs Lisp programmer too. Many of my software tools
are in fact Emacs Lisp functions that I invoke with convenient key
sequences. They help me automate repetitive tasks as well as improve
my text editing and task management experience.
I use plenty of other tools as well. In my early adulthood, I spent
many years working with C, C++, Java, and PHP. My first substantial
open source contribution was to the Apache Nutch project which was in
Java, and one of my early original open source projects was Uncap, a C
program to remap keys on Windows.
These days I use a lot of Python, along with some Go and Rust, but
Lisp remains important to my personal work. I also enjoy writing
small standalone tools directly in HTML and JavaScript, often with all
the code in a single file in a readable, unminified form.
How did you first discover computing, then end up with Lisp, Emacs and mathematics?
As I mentioned earlier while discussing what makes computing fun for
me, I got introduced to computers through the Logo programming
language as a kid. Using simple arithmetic, geometry, logic, and code
to manipulate a two-dimensional world had a lasting effect on me.
I still vividly remember how I ended up with Lisp. It was at an
airport during a long layover in 2007. I wanted to use the time to
learn something, so I booted my laptop running Debian GNU/Linux 4.0
(Etch) and then started
GNU CLISP
2.41. In those days, Wi-Fi
in airports was uncommon. Smartphones and mobile data were also
uncommon. So it was fortunate that I had CLISP already installed on
my system and my laptop was ready for learning Common Lisp. I had it
installed because I had wanted to learn Common Lisp for some time. I
was especially attracted by its simplicity, by the fact that the
entire language can be built up from a very small set of special
forms. I use
SBCL
these days, by the way.
I discovered Emacs through Common Lisp. Several sources recommended
using the
Superior Lisp Interaction Mode for Emacs (SLIME)
for Common Lisp programming, so that’s where I began. For many years
I continued to use Vim as my primary editor, while relying on Emacs
and SLIME for Lisp development. Over time, as I learnt more about
Emacs itself, I grew fond of Emacs Lisp and eventually made Emacs my
primary editor and computing environment.
I have loved mathematics since my childhood days. What has always
fascinated me is how we can prove deep and complex facts using first
principles and clear logical steps. That feeling of certainty and
rigour is unlike anything else.
Over the years, my love for the subject has been rekindled many times.
As a specific example, let me share how I got into number theory. One
day I decided to learn the RSA cryptosystem. As I was working through
the
RSA paper
, I stumbled upon the Euler totient function
φ(n) which gives the number of positive integers not exceeding n that
are relatively prime to n. The paper first states that φ(p) = p - 1
for prime numbers p. That was obvious since p has no factors other
than 1 and itself, so every integer from 1 up to p - 1 must be
relatively prime to it. But then it presents φ(pq) = φ(p) · φ(q) =
(p - 1)(q - 1) for primes p and q. That was not immediately obvious
to me back then. After a few minutes of thinking, I managed to prove
it from scratch. By the inclusion-exclusion principle, we count how
many integers from 1 up to pq are not divisible by p or q. There are
pq integers in total. Among them, there are q integers divisible by
p, and p integers divisible by q. So we need to subtract p + q from
pq. But since one integer (pq itself) is counted in both groups, we
add 1 back. Therefore φ(pq) = pq - (p + q) + 1 = (p - 1)(q - 1).
Next I could also obtain the general formula for φ(n) for an arbitrary
positive integer n using the same idea. There are several other
proofs too, but that is how I derived the general formula for φ(n)
when I first encountered it. And just like that, I had begun to learn
number theory!
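(Not part of the interview: for readers who want to see the identity
in action, here is a tiny brute-force check in Python that compares
φ(pq) against (p - 1)(q - 1) for a few small primes.)

from math import gcd

def phi(n):
    # Brute-force Euler totient: count 1 <= k <= n with gcd(k, n) == 1.
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for p, q in [(3, 5), (5, 7), (11, 13)]:
    # The identity derived above: phi(pq) = (p - 1)(q - 1) for primes p, q.
    assert phi(p * q) == (p - 1) * (q - 1)
    print(p, q, phi(p * q))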
You’ve said you prefer computing for fun. What is fun to you? Do you have an
idea of what makes something fun or not?
For me, fun in computing began when I first learnt IBM/LCSI PC Logo at
the age of nine. I had very limited access to computers back then,
perhaps only about two hours per month in the computer
laboratory at my primary school. Most of my Logo programming happened
with pen and paper at home. I would “test” my programs by tracing the
results on graph paper. Eventually I would get about thirty minutes
of actual computer time in the lab to run them for real.
So back then, most of my computing happened without an actual
computer. But even with that limited access to computers, a whole new
world opened up for me: one that showed me the joy of computing, and
more importantly, the joy of sharing my little programs with my
friends and teachers. One particular Logo program I still remember
very well drew a house with animated dashed lines, where the dashes
moved around the outline of the house. Everyone around me loved it,
copied it, and tweaked it to change the colours, alter the details,
and add their own little touches.
For me, fun in computing comes from such exploration and sharing. I
enjoy asking “what happens if” and then seeing where it leads me. My
Emacs package devil-mode comes from such exploration. It came from
asking, "What happens if we avoid using the Ctrl and Meta modifier
keys and use the comma key (,) or another suitable key as a leader key
instead? And can we still have a non-modal editing experience?"
Sometimes computing for fun may mean crafting a minimal esoteric
drawing language, making a small game, or building a tool that solves
an interesting problem elegantly. It is a bonus if the exploration
results in something working well enough that I can share with others
on the World Wide Web and others find it fun too.
How do you choose what to investigate? Which most interest you, with what commonalities?
For me, it has always been one exploration leading to another.
For example, I originally built
MathB
for my friends and myself
who were going through a phase in our lives when we used to challenge
each other with mathematical puzzles. This tool became a nice way to
share solutions with each other. Its use spread from my friends to
their friends and colleagues, then to schools and universities, and
eventually to IRC channels.
Similarly, I built
TeXMe
when I was learning neural networks and
taking a lot of notes on the subject. I was not ready to share the
notes online, but I did want to share them with my friends and
colleagues who were also learning the same topic. Normally I would
write my notes in LaTeX, compile them to PDF, and share the PDF, but
in this case, I wondered, what if I took some of the code from MathB
and created a tool that would let me write plain Markdown (GFM) +
LaTeX (MathJax) in a .html file and have the tool render the file as
soon as it was opened in a web browser? That resulted in TeXMe, which
has surprisingly become one of my most popular projects, receiving
millions of hits in some months according to the CDN statistics.
Another example is Muboard, which is a bit like an interactive
mathematics chalkboard. I built this when I was hosting an analytic
number theory book club and I needed a way to type LaTeX snippets live
on screen and see them immediately rendered. That made me wonder:
what if I took TeXMe, made it interactive, and gave it a chalkboard
look-and-feel? That led to Muboard.
So we can see that sharing mathematical notes and snippets has been a
recurring theme in several of my projects. But that is only a small
fraction of my interests. I have a wide variety of interests in
computing. I also engage in random explorations, like writing IRC
clients (NIMB, Tzero), ray tracing (POV-Ray, Java), writing Emacs
guides (Emacs4CL, Emfy), developing small single-HTML-file games
(Andromeda Invaders, Guess My RGB), purely recreational programming
(FXYT, may4.fs, self-printing machine code, prime number grid
explorer), and so on. The list goes on. When it comes to hobby
computing, I don't think I can pick just one domain and say it
interests me the most. I have a lot of interests.
What is computing, to you?
Computing, to me, covers a wide range of activities: programming a
computer, using a computer, understanding how it works, even building
one. For example, I once built a tiny 16-bit CPU along with a small
main memory that could hold only eight 16-bit instructions, using VHDL
and a Xilinx CPLD kit. The design was based on the Mano CPU
introduced in the book Computer System Architecture (3rd ed.) by
M. Morris Mano. It was incredibly fun to enter instructions into the
main memory, one at a time, by pushing DIP switches up and down and then
watch the CPU I had built myself execute an entire program. For
someone like me, who usually works with software at higher levels of
abstraction, that was a thrilling experience!
Beyond such experiments, computing also includes more practical and
concrete activities, such as installing and using my favourite Linux
distribution (Debian), writing software tools in languages like Common
Lisp, Emacs Lisp, Python, and the shell command language, or
customising my Emacs environment to automate repetitive tasks.
To me, computing also includes the abstract stuff like spending time
with abstract algebra and number theory and getting a deeper
understanding of the results pertaining to groups, rings, and fields,
as well as numerous number-theoretic results. Browsing the On-Line
Encyclopedia of Integer Sequences (OEIS), writing small programs to
explore interesting sequences, or just thinking about them is
computing too. I think many of the interesting results in computer
science have deep mathematical foundations. I believe much of
computer science is really discrete mathematics in action.
And if we dive all the way down from the CPU to the level of
transistors, we encounter continuous mathematics as well, with
non-linear voltage-current relationships and analogue behaviour that
make digital computing possible. It is fascinating how, as a
relatively new species on this planet, we have managed to take sand
and find a way to use continuous voltages and currents in
electronic circuits built with silicon, and convert them into
the discrete operations of digital logic.
We have machines that can simulate themselves!
To me, all of this is fun. To study and learn about these things, to
think about them, to understand them better, and to accomplish useful
or amusing results with this knowledge is all part of the fun.
How do you view programming vs. domains?
I focus more on the domain than the tool. Most of the time it is a
problem that catches my attention, and then I explore it to understand
the domain and arrive at a solution. The problem itself usually
points me to one of the tools I already know.
For example, if it is about working with text files, I might write an
Emacs Lisp function. If it involves checking large sets of numbers
rapidly for patterns, I might choose C++ or Rust. But if I want to
share interactive visualisations of those patterns with others, I
might rewrite the solution in HTML and JavaScript, possibly with the
use of the Canvas API, so that I can share the work as a
self-contained file that others can execute easily within their web
browsers. When I do that, I prefer to keep the HTML neat and
readable, rather than bundled or minified, so that people who like to
‘View Source’ can copy, edit, and customise the code themselves, and
immediately see their changes take effect.
Let me share a specific example. While working on a game, I first
used CanvasRenderingContext2D.fillText() to display text on the game.
However, dissatisfied with the text rendering quality, I began looking
for IBM PC OEM fonts and similar retro fonts online. After
downloading a few font packs, I wrote a little Python script to
convert them to bitmaps (arrays of integers), and then used the
bitmaps to draw text on the canvas using JavaScript, one cell at a
time, to get pixel-perfect results! These tiny Python and JavaScript
tools were good enough that I felt comfortable sharing them together
as a tiny toolkit called PCFace. This toolkit offers JavaScript
bitmap arrays and tiny JavaScript rendering functions, so that someone
else who wants to display text on their game canvas using PC fonts and
nothing but plain HTML and JavaScript can do so without having to
solve the problem from scratch!
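(An aside, not from the interview: the core idea of packing a glyph
into an array of integers can be sketched in a few lines of Python.
This is only an illustration of the approach, not the actual PCFace
code or its data format.)

# Hypothetical 6x6 glyph drawn with 'X' for set pixels and '.' for blanks.
GLYPH_A = [
    "..XX..",
    ".X..X.",
    "X....X",
    "XXXXXX",
    "X....X",
    "X....X",
]

def glyph_to_bitmap(glyph):
    # Pack each row into one integer; set bits mark filled pixels,
    # with the leftmost pixel in the most significant bit.
    return [int(row.replace("X", "1").replace(".", "0"), 2) for row in glyph]

print([hex(row) for row in glyph_to_bitmap(GLYPH_A)])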
Has the rate of your making new Emacs functions diminished over time
(as if everything's covered) or do the widening domains lead to more?
I'm curious how applicable old functionality is for new problems and
how that impacts the APIs!
My rate of making new Emacs functions has definitely decreased. There
are two reasons. One is that over the years my computing environment
has converged into a comfortable, stable setup I am very happy with.
The other is that at this stage of life I simply cannot afford the
time to endlessly tinker with Emacs as I did in my younger days.
More generally, when it comes to APIs, I find that well-designed
functionality tends to remain useful even when new problems appear.
In Emacs, for example, many of my older functions continue to serve me
well because they were written in a composable way. New problems can
often be solved with small wrappers or combinations of existing
functions. I think APIs that consist of functions that are simple,
orthogonal, and flexible age well. If each function in an API does
one thing and does it well (the Unix philosophy), it will have
long-lasting utility.
Of course, new domains and problems do require new functions and
extensions to an API, but I think it is very important to not give in
to the temptation of enhancing the existing functions by making them
more complicated with optional parameters, keyword arguments, nested
branches, and so on. Personally, I have found that it is much better
to implement new functions that are small, orthogonal, and flexible,
each doing one thing and doing it well.
What design methods or tips do you have, to increase composability?
For me, good design starts with good vocabulary. Clear vocabulary
makes abstract notions concrete and gives collaborators a shared
language to work with. For example, while working on a network events
database many years ago, we collected data minute by minute from
network devices. We decided to call each minute of data from a single
device a “nugget”. So if we had 15 minutes of data from 10 devices,
that meant 150 nuggets.
Why “nugget”? Because it was shorter and more convenient than
repeatedly saying “a minute of data from one device”. Why not
something less fancy like “chunk”? Because we reserved “chunk” for
subdivisions within a nugget. Perhaps there were better choices, but
“nugget” was the term we settled on, and it quickly became shared
terminology between the collaborators. Good terminology naturally
carries over into code. With this vocabulary in place, function names
like collect_nugget(), open_nugget(), parse_chunk(), index_chunk(),
skip_chunk(), etc. immediately become meaningful to everyone involved.
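(A hypothetical illustration, not code from that project: with the
vocabulary fixed, even a skeletal Python module reads naturally, and
the higher-level function is built from the simpler chunk-level ones.)

def open_nugget(path):
    # A "nugget" is one minute of data from a single device.
    with open(path) as f:
        return f.read()

def parse_chunk(chunk):
    # A "chunk" is a subdivision within a nugget; the format is made up here.
    device, minute, payload = chunk.split("|", 2)
    return {"device": device, "minute": minute, "payload": payload}

def parse_nugget(nugget):
    # Layered design: the complex operation is composed from simpler ones.
    return [parse_chunk(line) for line in nugget.splitlines() if line]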
Thinking about the vocabulary also ensures that we are thinking about
the data, concepts, and notions we are working with in a deliberate
manner, and that kind of thinking also helps when we design the
architecture of software.
Too often I see collaborators on software projects jump straight into
writing functions that take some input and produce some desired
effect, with variable names and function names decided on the fly. To
me, this feels backwards. I prefer the opposite approach. Define the
terms first, and let the code follow from them.
I also prefer developing software in a layered manner, where complex
functionality is built from simpler, well-named building blocks. It
is especially important to avoid
layer violations
, where one complex
function invokes another complex function. That creates tight
coupling between two complex functions. If one function changes in
the future, we have to reason carefully about how it affects the
other. Since both are already complex, the cognitive burden is high.
A better approach, I think, is to identify the common functionality
they share and factor that out into smaller, simpler functions.
To summarise, I like to develop software with a clear vocabulary,
consistent use of that vocabulary, a layered design where complex
functions are built from simpler ones, and by avoiding layer
violations. I am sure none of this is new to the Lobsters community.
Some of these ideas also occur in
domain-driven design
(DDD).
DDD defines the term
ubiquitous language
to mean, “A language
structured around the domain model and used by all team members within
a bounded context to connect all the activities of the team with the
software.” If I could call this approach of software development
something, I would simply call it “vocabulary-driven development”
(VDD), though of course DDD is the more comprehensive concept.
Like I said, none of this is likely new to the Lobsters community. In
particular, I suspect Forth programmers would find it too obvious. In
Forth, it is very difficult to begin with a long, poorly thought-out
monolithic word and then break it down into smaller ones later. The
stack effects quickly become too hard to track mentally with that
approach. The only viable way to develop software in Forth is to
start with a small set of words that represent the important notions
of the problem domain, test them immediately, and then compose
higher-level words from the lower-level ones. Forth naturally
encourages a layered style of development, where the programmer thinks
carefully about the domain, invents vocabulary, and expresses complex
ideas in terms of simpler ones, almost in a mathematical fashion. In
my experience, this kind of deliberate design produces software that
remains easy to understand and reason about even years after it was
written.
Not enhancing existing functions but adding new small ones seems
quite lovely, but how do you come back to such a codebase later with
many tiny functions? At points, I’ve advocated for very large
functions, particularly traumatized by Java-esque 1000 functions in
1000 files approaches. When you had time, would you often
rearchitecture the conceptual space of all of those functions?
The famous quote from Alan J. Perlis comes to mind:
“It is better to have 100 functions operate on one data structure
than 10 functions on 10 data structures.”
Personally, I enjoy working with a codebase that has thousands of
functions, provided most of them are small, well-scoped, and do one
thing well. That said, I am not dogmatically opposed to large
functions. It is always a matter of taste and judgement. Sometimes
one large, cohesive function is clearer than a pile of tiny ones.
For example, when I worked on parser generators, I often found that
lexers and finite state machines benefited from a single top-level
function containing the full tokenisation logic or the full state
transition logic in one place. That function could call smaller
helpers for specific tasks, but we still need the overall switch-case
or if-else or cond ladder somewhere. I think trying to split that
ladder into smaller functions would only make the code harder to
follow.
So while I lean towards small, composable functions, the real goal is
to strike a balance that keeps code maintainable in the long run.
Each function should be as small as it can reasonably be, and no
smaller.
Like you, I use programming as a tool to explore domains. Which do you know the most about?
For me too, the appeal of computer programming lies especially in how it
lets me explore different domains. There are two kinds of domains in
which I think I have gained good expertise. The first comes from
years of developing software for businesses, which has included
solving problems such as network events parsing, indexing and
querying, packet decoding, developing parser generators, database
session management, and TLS certificate lifecycle management. The
second comes from areas I pursue purely out of curiosity or for hobby
computing. This is the kind I am going to focus on in our
conversation.
Although computing and software are serious business today, for me, as
for many others, computing is also a hobby.
Personal hobby projects often lead me down various rabbit holes, and I
end up learning new domains along the way. For example, although I am
not a web developer, I learnt to build small, interactive single-page
tools in plain HTML, CSS, and JavaScript simply because I needed them
for my hobby projects over and over again. An early example is
QuickQWERTY, which I built to teach myself and my friends touch-typing
on QWERTY keyboards. Another example is CFRS[], which I created
because I wanted to make a total (non-Turing complete) drawing
language that has turtle graphics like Logo but is absolutely minimal
like P′′.
How do you approach learning a new domain?
When I take on a new domain, there is of course a lot of reading
involved from articles, books, and documentation. But as I read, I
constantly try to test what I learn. Whenever I see a claim, I ask
myself, “If this were wrong, how could I demonstrate it?” Then I
design a little experiment, perhaps write a snippet of code, or run a
command, or work through a concrete example, with the goal of checking
the claim in practice.
Now I am not genuinely hoping to prove a claim wrong. It is just a
way to engage with the material. To illustrate, let me share an
extremely simple and generic example without going into any particular
domain. Suppose I learn that Boolean operations in Python
short-circuit. I might write out several experimental snippets like
the following:
def t(): print('t'); return True
def f(): print('f'); return False
f() or t() or f()
And then confirm that the results do indeed demonstrate short-circuit
evaluation (f followed by t in this case).
At this point, one could say, “Well, you just confirmed what the
documentation already told you.” And that’s true. But for me, the
value lies in trying to test it for myself. Even if the claim holds,
the act of checking forces me to see the idea in action. That not
only reinforces the concept but also helps me build a much deeper
intuition for it.
Sometimes these experiments also expose gaps in my own understanding.
Suppose I didn’t properly know what “short-circuit” means. Then the
results might contradict my expectations. That contradiction would
push me to correct my misconception, and that’s where the real
learning happens.
Now this method cannot always be applied, especially if it is very
expensive or unwieldy to do so. For example, if I am learning
something in the finance domain, it is not always possible to perform
an actual transaction. One can sometimes use simulation software,
mock environments, or sandbox systems to explore ideas safely. Still,
it is worth noting that this method has its limitations.
In mathematics, though, I find this method highly effective. When I
study a new branch of mathematics, I try to come up with examples and
counterexamples to test what I am learning. Often, failing to find a
counterexample helps me appreciate more deeply why a claim holds and
why no counterexamples exist.
Do you have trouble not getting distracted with so much on your plate? I’m curious how you balance the time commitments of everything!
Indeed, it is very easy to get distracted. One thing that has helped
over the years is the increase in responsibilities in other areas of
my life. These days I also spend some of my free time studying
mathematics textbooks. With growing responsibilities and the time I
devote to mathematics, I now get at most a few hours each week for
hobby computing. This automatically narrows down my options. I can
explore perhaps one or at most two ideas in a month, and that
constraint makes me very deliberate about choosing my pursuits.
Many of the explorations do not evolve into something solid that I can
share. They remain as little experimental code snippets or notes
archived in a private repository. But once in a while, an exploration
grows into something concrete and feels worth sharing on the Web.
That becomes a short-term hobby project. I might work on it over a
weekend if it is small, or for a few weeks if it is more complex.
When that happens, the goal of sharing the project helps me focus.
I try not to worry too much about making time. After all, this is
just a hobby. Other areas of my life have higher priority. I also
want to devote a good portion of my free time to learning more
mathematics, which is another hobby I am passionate about. Whatever
little spare time remains after attending to the higher-priority
aspects of my life goes into my computing projects, usually a couple
of hours a week, most of it on weekends.
How does blogging mix in? What’s the development like of a single piece of curiosity through wrestling with the domain, learning and sharing it etc.?
Maintaining my personal website is another aspect of computing that I
find very enjoyable. My website began as a loose collection of pages
on a LAN site during my university days. Since then I have been
adding pages to it to write about various topics that I find
interesting. It acquired its blog shape and form much later when
blogging became fashionable.
I usually write a new blog post when I feel like there is some piece
of knowledge or some exploration that I want to archive in a
persistent format. Now what the development of a post looks like
depends very much on the post. So let me share two opposite examples
to describe what the development of a single piece looks like.
One of my most frequently visited posts is Lisp in Vim.
It started when I was hosting a Common Lisp programming club for
beginners. Although I have always used Emacs and SLIME for Common
Lisp programming myself, many in the club used Vim, so I decided to
write a short guide on setting up something SLIME-like there. As a
former long-time Vim user myself, I wanted to make the Lisp journey
easier for Vim users too. I thought it would be a 30-minute exercise
where I write up a README that explains how to install Slimv and how
to set it up in Vim. But then I discovered a newer plugin called
Vlime that also offered SLIME-like features in Vim! That detail
sent me down a very deep rabbit hole. Now I needed to know how the
two packages were different, what their strengths and weaknesses were,
how routine operations were performed in both, and so on. What was
meant to be a short note turned into a nearly 10,000-word article. As
I was comparing the two SLIME-like packages for Vim, I also found a
few bugs in Slimv and contributed fixes for them (#87, #88, #89, #90).
Writing this blog post turned into a month-long project!
At the opposite extreme is a post like Elliptical Python Programming.
I stumbled upon Python Ellipsis while reviewing someone's code. It
immediately caught my attention. I
wondered if, combined with some standard obfuscation techniques, one
could write arbitrary Python programs that looked almost like Morse
code. A few minutes of experimentation showed that a genuinely Morse
code-like appearance was not possible, but something close could be
achieved. So I wrote what I hope is a humorous post demonstrating
that arbitrary Python programs can be written using a very restricted
set of symbols, one of which is the ellipsis. It took me less than an
hour to write this post. The final result doesn’t look quite like
Morse code as I had imagined, but it is quite amusing nevertheless!
What draws you to post and read online forums? How do you balance
or allot time for reading technical articles, blogs etc.?
The exchange of ideas! Just as I
enjoy sharing my own computing-related thoughts, ideas, and projects,
I also find joy in reading what others have to share.
As I mentioned earlier, other areas of my life take precedence over
hobby projects. Similarly, I treat the hobby projects as higher
priority than reading technical forums.
After I’ve given time to the higher-priority parts of my life and to
my own technical explorations, I use whatever spare time remains to
read articles, follow technical discussions, and occasionally add
comments.
What’re your favorite math textbooks?
I have several favourite mathematics books, but let me share three I
remember especially fondly.
The first is
Advanced Engineering Mathematics
by Erwin Kreyszig. I
don’t often see this book recommended online, but for me it played a
major role in broadening my horizons. I think I studied the 8th
edition back in the early 2000s. It is a hefty book with over a
thousand pages, and I remember reading it cover to cover, solving
every exercise problem along the way. It gave me a solid foundation
in routine areas like differential equations, linear algebra, vector
calculus, and complex analysis. It also introduced me to Fourier
transforms and Laplace transforms, which I found fascinating.
Of course, the Fourier transform has a wide range of applications in
signal processing, communications, spectroscopy, and more. But I want
to focus on the fun and playful part. In the early 2000s, I was also
learning to play the piano as a hobby. I used to record my amateur
music compositions with
Audacity
by connecting my digital piano to
my laptop with a line-in cable. It was great fun to plot the spectrum
of my music on Audacity, apply high-pass and low-pass filters, and
observe how the Fourier transform of the audio changed and then hear
the effect on the music. That kind of hands-on tinkering made Fourier
analysis intuitive for me, and I highly recommend it to anyone who
enjoys both music and mathematics.
The second book is
Introduction to Analytic Number Theory
by Tom M.
Apostol. As a child I was intrigued by the prime number theorem but
lacked the mathematical maturity to understand its proof. Years
later, as an adult, I finally taught myself the proof from Apostol’s
book. It was a fantastic journey that began with simple concepts like
the Möbius function and Dirichlet products and ended with quite clever
contour integrals that proved the theorem. The complex analysis I had
learnt from Kreyszig turned out to be crucial for understanding those
integrals. Along the way I gained a deeper understanding of the
Riemann zeta function ζ(s). The book discusses zero-free regions
where ζ(s) does not vanish, which I found especially fascinating.
Results like ζ(-1) = -1/12, which once seemed mysterious, became
obvious after studying this book.
The third is
Galois Theory
by Ian Stewart. It introduced me to
field extensions, field homomorphisms, and solubility by radicals. I
had long known that not all quintic equations are soluble by radicals,
but I didn’t know why. Stewart’s book taught me exactly why. In
particular, it demonstrated that the polynomial t⁵ - 6t + 3 over the
field of rational numbers is not soluble by radicals. This particular
result, although fascinating, is just a small part of a much larger
body of work, which is even more remarkable. To arrive at this
result, the book takes us through a wonderful journey that includes
the theory of polynomial rings, algebraic and transcendental field
extensions, impossibility proofs for ruler-and-compass constructions,
the Galois correspondence, and much more.
One of the most rewarding aspects of reading books like these is how
they open doors to new knowledge, including things I didn’t even know
that I didn’t know.
How does the newer math jell with or inform past or present
computing, compared to much older stuff?
I don’t always think explicitly about how mathematics informs
computing, past or present. Often the textbooks I pick feel very
challenging to me, so much so that all my energy goes into simply
mastering the material. It is arduous but enjoyable. I do it purely
for the fun of learning without worrying about applications.
Of course, a good portion of pure mathematics probably has no
real-world applications. As G. H. Hardy famously wrote in A
Mathematician's Apology:
I have never done anything ‘useful’. No discovery of mine has made,
or is likely to make, directly or indirectly, for good or ill, the
least difference to the amenity of the world.
But there is no denying that some of it does find applications. Were
Hardy alive today, he might be disappointed that number theory, his
favourite field of “useless” mathematics, is now a crucial part of
modern cryptography. Electronic commerce wouldn’t likely exist
without it.
Similarly, it is amusing how something as abstract as abstract algebra
finds very concrete applications in coding theory. Concepts such as
polynomial rings, finite fields, and cosets of subspaces in vector
spaces over finite fields play a crucial role in error-correcting
codes, without which modern data transmission and storage would not be
possible.
On a more personal note, some simpler areas of mathematics have been
directly useful in my own work. While solving problems for
businesses, information entropy, combinatorics, and probability theory
were crucial when I worked on gesture-based authentication about one
and a half decades ago.
Similarly, when I was developing Bloom filter-based indexing and
querying for a network events database, again, probability theory was
crucial in determining the parameters of the Bloom filters (such as
the number of hash functions, bits per filter, and elements per
filter) to ensure that the false positive rate remained below a
certain threshold. Subsequent testing with randomly sampled network
events confirmed that the observed false positive rate matched the
theoretical estimate quite well. It was very satisfying to see
probability theory and the real world agreeing so closely.
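(For readers who want the arithmetic: a standard approximation for the
false positive rate of a Bloom filter with m bits, n elements, and k
hash functions is p ≈ (1 - e^(-kn/m))^k, and p is minimised near
k ≈ (m/n) ln 2. The Python sketch below is illustrative only; the
numbers are made up and are not the parameters of the system described
above.)

from math import exp, log

def false_positive_rate(m_bits, n_elements, k_hashes):
    # Standard approximation: p ~ (1 - e^(-k*n/m))^k
    return (1 - exp(-k_hashes * n_elements / m_bits)) ** k_hashes

def optimal_k(m_bits, n_elements):
    # The rate above is minimised near k = (m/n) * ln 2.
    return max(1, round((m_bits / n_elements) * log(2)))

m, n = 1_000_000, 100_000      # example sizing only
k = optimal_k(m, n)            # 7 hash functions
print(k, false_positive_rate(m, n, k))  # roughly 0.008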
Beyond these specific examples, studying mathematics also influences
the way I think about problems. Embarking on journeys like analytic
number theory or Galois theory is humbling. There are times when I
struggle to understand a small paragraph of the book, and it takes me
several hours (or even days) to work out the arguments in detail with
pen and paper (lots of it) before I really grok them. That experience
of grappling with dense reasoning teaches humility and also makes me
sceptical of complex, hand-wavy logic in day-to-day programming.
Several times I have seen code that bundles too many decisions into
one block of logic, where it is not obvious whether it would behave
correctly in all circumstances. Explanations may sometimes be offered
about why it works for reasonable inputs, but the reasoning is often
not watertight. The experience of working through mathematical
proofs, writing my own, making mistakes, and then correcting them has
taught me that if the reasoning for correctness is not clear and
rigorous, something could be wrong. In my experience, once such code
sees real-world usage, a bug is nearly always found.
That’s why I usually insist either on simplifying the logic or on
demonstrating correctness in a clear, rigorous way. Sometimes this
means doing a case-by-case analysis for different types of inputs or
conditions, and showing that the code behaves correctly in each case.
There is also a bit of an art to reducing what seem like numerous or
even infinitely many cases to a small, manageable set of cases by
spotting structure, such as symmetries, invariants, or natural
partitions of the input space. Alternatively, one can look for a
simpler argument that covers all cases. These are techniques we
employ routinely in mathematics, and I think that kind of thinking and
reasoning is quite valuable in software development too.
When you decided to stop with MathB due to moderation burdens, I
offered to take over/help and you mentioned
others had too. Did anyone end up forking it, to your knowledge?
I first thought of shutting down the
MathB
-based pastebin website
in November 2019. The website had been running for seven years at
that time. When I announced my thoughts to the IRC communities that
would be affected, I received a lot of support and encouragement. A
few members even volunteered to help me out with moderation. That
support and encouragement kept me going for another six years.
However, the volunteers eventually became busy with their own lives
and moved on. After all, moderating user content for an open pastebin
that anyone in the world can post to is a thankless and tiring
activity. So most of the moderation activity fell back on me.
Finally, in February 2025, I realised that I no longer want to spend
time on this kind of work.
I developed MathB with a lot of passion for myself and my friends. I
had no idea at the time that this little project would keep a corner
of my mind occupied even during weekends and holidays. There was
always a nagging worry. What if someone posted content that triggered
compliance concerns and my server was taken offline while I was away?
I no longer wanted that kind of burden in my life. So I finally
decided to shut it down. I've written more about this in MathB.in Is
Shutting Down.
To my knowledge, no one has forked it, but others have developed
alternatives. Further, the Archive Team has archived all posts from
the now-defunct MathB-based
website. A member of the Archive Team reached out to me over IRC and
we worked together for about a week to get everything successfully
archived.
re: QWERTY touch typing, you use double spaces after periods which
I’d only experienced from people who learned touch typing on
typewriters, unexpected!
Yes, I do separate sentences by double spaces. It is interesting that
you noticed this.
I once briefly learnt touch typing on typewriters as a kid, but those
lessons did not stick with me. It was much later, when I used a Java
applet-based touch typing tutor that I found online about two decades
ago, that the lessons really stayed with me. Surprisingly, that
application taught me to type with a single space between sentences.
By the way, I disliked installing Java plugins into the web browser,
so I wrote QuickQWERTY as a similar touch typing tutor in plain HTML
and JavaScript for myself and my friends.
I learnt to use double spaces between sentences first with Vim and
then later again with Emacs. For example, in Vim, the joinspaces
option is on by default, so when we join sentences with the normal
mode command J, or format paragraphs with gqap, Vim inserts two spaces
after full stops. We need to disable that behaviour with
:set nojoinspaces if we want single spacing.
It is similar in Emacs. In Emacs, the delete-indentation command
(M-^) and the fill-paragraph command (M-q) both insert two spaces
between sentences by default. Single spacing can be enabled with
(setq sentence-end-double-space nil).
Incidentally, I spent a good portion of the README for my Emacs
quick-start DIY kit named Emfy discussing sentence spacing conventions
under the section Single Space for Sentence Spacing. There I explain
how to configure Emacs to use single spaces, although I use double
spaces myself. That's because many new Emacs users prefer single
spacing.
The defaults in Vim and Emacs made me adopt double spacing. The
double spacing convention is also widespread across open source
software. If we look at the Vim help pages, Emacs built-in
documentation, or the Unix and Linux man pages, double spacing is the
norm. Even inline comments in traditional open source projects often
use it. For example, see Vim's :h usr_01.txt, Emacs's
(info "(emacs) Intro"), or the comments in the GCC source code.
How and why do you use reference-style links? I’ve only seen them unrendered on HN with confusion.
I am typing out the reference-style links manually with a little help
from Emacs. For example, if I type the key sequence C-c C-l (actually
it is , c , l with devil-mode), Emacs invokes the
markdown-insert-link command. Then I type the key sequence
[] M-j example RET https://example.com/ RET RET
to have Emacs insert the following for me: example.
I normally use
reference links
in Markdown to save horizontal
space for my text. As you can see, I hard-wrap my paragraphs so that
no line exceeds 70 characters in length. Long URLs can break this
rule, since some are longer than 70 characters, but reference-style
links solve that problem. They let me keep paragraphs neatly wrapped,
and they also collect all URLs together at the bottom of the section.
I like the aesthetics of this style.
Of course, you are welcome to reformat the links however you like
while publishing your post on Lobsters! As a reader on Lobsters, I
don’t think I can tell which style you use. I’d also like to suggest
adding another link: https://oeis.org/ for "On-Line Encyclopedia of
Integer Sequences".
But not all of my setup is in the form of .emacs. Many of my Emacs
Lisp functions are spread out across numerous .org files. Each .org
file is like a little workspace for a specific aspect of my life. For
example, there is one .org file for bookmarks, another for checklists,
another to keep track of my utility bills, another to plan my upcoming
trips, and so on. I have several Emacs Lisp source blocks in these
.org files to perform computations on the data in these files,
generate tables with derived values, and so on.
Doom-ada: Doom Emacs Ada language module with syntax, LSP and Alire support
When Imad Khachan saw the L-shaped Greenwich Village storefront in the fall of 1995, the NYU graduate student took it as a sign that renting it was his next move. The space mirrored how a knight, the board's only wayward traveler, moves on a chess board: two squares in one direction and one to the side.
For Khachan, the son of Palestinian refugees who discovered chess as a child growing up in war-ravaged Lebanon, his own journey had felt more like that of a pawn. Plus, he had just been locked out of the chess shop where he'd worked for six years, and which just happened to be located across the street from that empty storefront; the owner had told him to "get lost" when Khachan asked for his promised stake in the business. Here was his chance to wrest something that felt like control over his life.
Khachan, who had decided he was done with academia, opened his own store, Chess Forum, in that L-shaped space, triggering what locals dubbed the "Civil War on Thompson Street."
Thirty years later, Chess Forum—which has outlived all its rivals—is still at 219 Thompson Street, and Khachan, who turned 60 in August, is still at its helm.
We received multiple reports of a phishing campaign targeting crates.io users
(from the
rustfoundation.dev
domain name), mentioning a compromise of our
infrastructure and asking users to authenticate to limit damage to their crates.
These emails are malicious and come from a domain name not controlled by the
Rust Foundation (nor the Rust Project), seemingly with the purpose of stealing
your GitHub credentials. We have no evidence of a compromise of the crates.io
infrastructure.
We are taking steps to get the domain name taken down and to monitor for
suspicious activity on crates.io. Do not follow any links in these emails if you
receive them, and mark them as phishing with your email provider.
Some 154 million people get health insurance through their employer — and many could see their paycheck deductions surge next year. Some will likely also see co-pays and other out-of-pocket costs rise.
Jeff Chiu/AP
The United States has the
most expensive health care
in the developed world. Now it's about to get even more expensive.
Some
154 million people
get health insurance through their employer — and many could see their paycheck deductions surge next year, by
6% to 7%
on average. Some will likely also see their out-of-pocket costs rise as employers pass along the spiking costs of care.
That's because employers will be paying a lot more — almost 9% more per employee on average, for the same level of coverage — to provide health benefits for their workers. Even after cutting or changing their health care benefits, employers are facing the biggest price increase in 15 years, according to a
new survey of more than 1,700 organizations
by Mercer, a benefits consultancy.
And 59% of those employers told Mercer they plan to pass those higher prices along to their workers in the form of "cost-cutting changes," such as higher deductibles, copays or other out-of-pocket costs, such as prices for filling prescriptions.
"It's almost a perfect storm that's hitting employers right now," says Larry Levitt, executive vice president for health policy at KFF, a health policy research nonprofit.
"The price of health care is going up faster than it has in a long time," he adds. "And typically when an employer is getting a big increase from an insurer, the employer is turning around and trying to pass on some or all of that to its workers."
The surging health benefit costs come at a time when consumers are still
feeling the hangover
of pandemic-era record inflation and are generally
uneasy
about the U.S. economy. Though inflation has cooled considerably in the past two years, prices are
starting to tick up again
, as many of President Trump's sweeping taxes on imports go into effect.
These soaring costs also underline a hidden-in-plain-sight truth about the
broken U.S. health care system
: For the
majority of Americans
under age 65, their employers ultimately decide how much they pay for health insurance and medical care.
Employers themselves are at the mercy of entities that have even more
market power
: Drug companies, pharmacy benefit managers, hospitals and others have collectively driven up the costs of accessing medical care in the United States. Health insurers, some owned by gigantic for-profit conglomerates, often draw the blame for the high costs of U.S. health care — as demonstrated by the
national outpouring
of rage and frustration against
UnitedHealth Group
, one of the world's largest companies, after the head of its health insurance business was
shot and killed
last December.
But when it comes down to determining how much most working Americans pay to stay healthy, the buck stops with employers. And now they're planning on charging a lot more.
"It's kind of hidden, because [premium deductions are] coming out of your paycheck and if you're not paying close attention, it may not be obvious," Levitt says. "But your take-home pay is going down."
The good and bad news about why prices are rising
Some of the reasons for the rise in health care prices are actually good news. For example, pharmaceutical companies have developed more effective
cancer treatments
and
weight-loss drugs
— which they can also charge more for. And after several years when the COVID-19 pandemic and soaring inflation made many people
reluctant to seek
non-urgent care, more people are going to the doctor or other providers. But that surge in demand has also led to a surge in prices.
Other reasons have to do with a loss of competition. Some hospitals, doctors' offices, insurance companies and other businesses within the health care system have
merged or consolidated
, often allowing the remaining businesses to
raise prices
for their services.
"What's missing in health care is: It's not a traditional free market. You don't have those competitive forces," says Sunit Patel, Mercer's chief actuary for health and benefits in the United States.
This isn't the first time employers are facing this problem: The costs they pay to provide health care are steep, and those have been rising for years.
Last year, the average U.S. employer spent more than $19,000 per employee to provide family coverage while the employee kicked in $6,000, according to
KFF
. The total average family premium of $25,572 has
increased 52%
in the past decade.
Beth Umland, Mercer's director of health and benefits research, says that employers have tried to avoid passing on all the recent cost increases to employees, in part to try to retain workers during a tight post-pandemic labor market. But after years of elevated costs, she says, "I think just something had to give."
Employers tend to consider health care benefits as part of the total compensation they pay workers — meaning that if they are spending more on health care, they will probably spend less on traditional salary increases.
And while workers have tried-and-true methods of asking for salary raises, they generally have less opportunity to bargain over the prices their employers set for health care.
"In general for workers, it's kind of take it or leave it," Levitt says. "And they really don't have much of a choice but to take it."
Many Hard LeetCode Problems Are Easy Constraint Problems
In my first interview out of college I was asked the change counter problem:
Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).
I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for "well-behaved" denominations. If the coin values were
[10, 9, 1]
, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (
10+9+9+9
). The "smart" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.
But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like MiniZinc and call it a day.
int: total;
array[int] of int: values = [10, 9, 1];
array[index_set(values)] of var 0..: coins;
constraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;
solve minimize sum(coins);
You can try this online here. It'll give you a prompt to put in total and then give you successively-better solutions:
Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.[1]
[1] Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.
More examples
This was a question in a different interview (which I thankfully passed):
Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later.
It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:
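(The MiniZinc snippet for this one isn't reproduced here. Since, as the footnote says, any generalized solver works, here is a rough equivalent using OR-Tools CP-SAT in Python; the prices list is just the example data used later in this post.)

from ortools.sat.python import cp_model

prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]  # example data only

model = cp_model.CpModel()
n = len(prices)
buy = model.NewIntVar(0, n - 1, "buy")    # index we buy at
sell = model.NewIntVar(0, n - 1, "sell")  # index we sell at
buy_price = model.NewIntVar(min(prices), max(prices), "buy_price")
sell_price = model.NewIntVar(min(prices), max(prices), "sell_price")

model.AddElement(buy, prices, buy_price)    # buy_price = prices[buy]
model.AddElement(sell, prices, sell_price)  # sell_price = prices[sell]
model.Add(sell > buy)                       # must sell after buying
model.Maximize(sell_price - buy_price)

solver = cp_model.CpSolver()
if solver.Solve(model) == cp_model.OPTIMAL:
    print(solver.Value(buy), solver.Value(sell), solver.ObjectiveValue())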
Reminder, link to trying it online here. While working at that job, one interview question we tested out was:
Given a list, determine if three numbers in that list can be added or subtracted to give 0?
This is a satisfaction problem, not an optimization problem: we don't need the "best answer", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver:
include "globals.mzn";
array[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
array[index_set(numbers)] of var {0, -1, 1}: choices;
constraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;
constraint count(choices, -1) + count(choices, 1) = 3;
solve satisfy;
Okay, one last one, a problem I saw last year at Chipy AlgoSIG. Basically they pick some leetcode problems and we all do them. I failed to solve this one:
Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.
The "proper" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:
array[int] of int: numbers = [2,1,5,6,2,3];
var 1..length(numbers): x;
var 1..length(numbers): dx;
var 1..: y;
constraint x + dx <= length(numbers);
constraint forall (i in x..(x+dx)) (y <= numbers[i]);
var int: area = (dx+1)*y;
solve maximize area;
output ["(\(x)->\(x+dx))*\(y) = \(area)"]
Now if I actually brought these questions to an interview the interviewee could ruin my day by asking "what's the runtime complexity?" Constraint solver runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the
capability/tractability tradeoff
. But even so, they'll do way better than a
bad
bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.
The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to
Maximize the profit by buying and selling up to max_sales stocks, but you can only buy or sell one stock at a given time and you can only hold up to max_hold stocks at a time?
That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated:
include "globals.mzn";
int: max_sales = 3;
int: max_hold = 2;
array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
array [1..max_sales] of var int: buy;
array [1..max_sales] of var int: sell;
array [index_set(prices)] of var 0..max_hold: stocks_held;
var int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);
constraint forall (s in 1..max_sales) (sell[s] > buy[s]);
constraint profit > 0;
constraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] <= i) - count(s in 1..max_sales) (sell[s] <= i)));
constraint alldifferent(buy ++ sell);
solve maximize profit;
output ["buy at \(buy)\n", "sell at \(sell)\n", "for \(profit)"];
Most constraint solving examples online are puzzles, like Sudoku or "SEND + MORE = MONEY". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.
Over the past several years, I've enjoyed the hobby of
paper
modeling
(or papercraft), the art
of creating 3D models from cut and glued parts from paper sheets. This hobby is
a superset of origami, in that it allows for cutting and gluing, as well as for
multiple sheets of paper for a single model. The alleviation of these
constraints means that papercraft allows for more complex models that are easier
to assemble.
Over many years, I've built models designed by others as well as designed my
own. In this post, I want to share everything I've learned along the way,
covering the entire process from design to assembly.
I love this hobby for three reasons:
It is extremely accessible.
There is no fancy hardware or software
involved. As we'll see, the core tools are paper, scissors, and glue;
everything else is an addon to make the experience better. All software tools
can be free. Accidentally mess up during assembly and need a replacement
part? Just print out another page. Creating an entire model costs somewhere
in the ballpark of a few cents.
It is equally technical and creative.
As we'll see, many of the problems
faced in papercraft require an engineering-like approach and a willingness to
experiment and iterate on designs. While it may appear outwardly like a craft
project, the end-to-end process involves constraints and optimizing within
them.
There are no limits on what you can make.
What you decide to build is
limited by your patience and imagination. Theoretically, nearly any object can
be represented as a paper model.
Let's dive in. My most recent model is a papercraft plane inspired by the
SR-71
Blackbird
, a
reconnaissance plane that to this day holds many records for being one of the
fastest aircraft ever. It's now one of the most iconic planes ever designed and an
engineering masterpiece. The program was ultimately retired in 1999.
The model we'll be designing and assembling in this post
An actual SR-71 Blackbird, By USAF / Judson Brohmer - Armstrong Photo Gallery, Public Domain, https://commons.wikimedia.org/w/index.php?curid=30816
We're going to walk through the full model design and assembly process, while
referencing specific examples I encountered during creating this SR-71.
Let's set some constraints for how we're allowed to model our creation. These
are self-imposed limitations that fit my preferred style for model design:
All parts in the assembled model must be made of paper.
Each part must be a single, solid color. The parts must not use any printed
textures or designs.
The model must be represented as a
simple
polyhedron
. There may be
no curvatures, holes, two-dimensional surfaces, or surface-to-surface contact.
If the figure we're trying to capture has any of these features, we must find
a way to approximate it using only flat faces. The object must be manifold (an
edge is only shared by 2 faces).
It may feel weird to impose constraints on an art. However, I find that these
constraints encourage a better designed model that can be
assembled easily and
predictably, including by others
.
Features like curvature, printed textures, etc. are shortcuts. For
example, printing textures helps fill in details that aren't captured inherently
by the model; curvatures and 2D surfaces are flimsy and introduce variance in
how a model can be assembled. Simple polyhedral designs with single-color parts
ensure that the 3D form itself captures the object being depicted, and can be
assembled in a structurally sound, predictable way.
In addition to constraints, we also have some goals that we're optimizing for.
These goals will be considered in each step of our design process.
Ease of assembly:
By far the most important goal, our model should be
easy to put together. Given the nature of paper and glue, a model that is
difficult to assemble will almost certainly look bad. A model can have a
well-designed topology, but still be difficult to assemble based on the parts
design we put together.
Aesthetic appeal:
This is an art, after all. The model we design should
be aesthetically pleasing and resemble the object of interest.
Minimal consumption of resources:
We should aim to minimize waste and use
our materials efficiently.
As in engineering, we have to consider trade-offs between these goals, and
optimize for these goals within our constraints.
The process of designing a paper model is iterative. Each iteration consists of
the following steps:
Mesh modeling
- using software to create a 3D polyhedron mesh of our desired form
Mesh unfolding
- unfolding the mesh into a 2D layout of parts
Assembly
- putting the parts together to create the final model
The remainder of this article will be walking through each step in detail. The
discussion of each step will be centered around the goals and constraints
declared from above.
In this phase, we design the mesh for our model. We aim to capture the essence
of an object in a way that can feasibly be built with paper. Depending on how
you approach this, this can easily be the most complicated step.
What do I mean by "feasibly built with paper"? Our mesh is a collection of
polygons that represent a 3D object. The closeness of that representation is
largely determined by how many polygons we use. We could use many
really small
polygons to closely match the subtle curves of our plane, but this would be hard
to assemble in reality. Alternatively, we could simplify our representation
down to a triangular pyramid. This would be trivially easy to assemble, but it
wouldn't look a lot like our plane.
We can now see that our goals of ease of assembly and aesthetic appeal are at
odds. Imagine that we have a continuum, where on the left we have a triangular
pyramid (the simplest possible polyhedron) and on the right we have a mesh of
the SR-71 with an arbitrarily high number (millions) of polygons.
Our mesh can exist anywhere between the simplest polyhedron and a mesh with near perfect resolution. The example on the right is by USSIowa on Thingiverse (Creative Commons - Attribution): https://www.thingiverse.com/thing:5508640
Generally, an "easy" to assemble model will have somewhere around a few hundred
polygons. Thus, our ideal model exists somewhere on the far left of this
spectrum.
The challenge here is what I call "allocation of resolution" - we have a finite
number of polygons to distribute across the features of our object. Certain
features will naturally require more polygons to be accurately captured than
others. For example, curved features require more polygons than flat features -
in this model, the cylindrical engines will require more detail than, say, the
flat wings.
In addition to the number of polygons and their concentrations, the arrangement
of the polygons themselves matters - this is the
topology
of the mesh. Most
discourse on 3D mesh topology is related to shading and animation. For our
purposes, we're concerned with ease of assembly. Certain topologies are easier
to assemble and more structurally sound. Generally, here's some positive
topological qualities for papercraft:
Symmetries: a good mesh design is symmetrical when possible. Symmetrical shapes
are intuitive and easier to reason about when assembling.
No narrow shapes: really narrow shapes are hard to cut out, hard to fold, and
hard to glue. Avoid them at all costs.
Use quads: quad faces have an aesthetic appeal to them.
If all of this is sounding hard, we've got some options, in increasing order of
difficulty:
The easiest way past this step is to find an existing mesh. There's a whole
genre of 3D modeling called "low-poly" that you can find with a quick search on
Thingiverse
or
Printables
. These are
usually designed for video games or 3D printing, but can be adapted for
papercraft.
Sometimes, you can find a high-resolution mesh of your desired object, but not a
low-poly one. In this case, there are tools available to reduce the polygon
count while preserving the overall shape. This is called "mesh simplification"
or "mesh decimation."
This
Instructable
goes over the process of doing this with
Meshlab
,
but there's many other software alternatives out there.
The pitfall of this approach is that automatic mesh decimation typically results
in some nasty topologies, and there's not a lot you can do to control the
output. To get around this, we
could
add an additional refinement step where
we take the raw decimated mesh output and "clean it up" using a mesh editor
software.
As an example, let's try this with a
SR-71 mesh on
Thingiverse
. The original mesh has
more than 1.2 million faces, and we're going to try decimating down to ~1,000.
Here's what we get from Meshlab:
Result of mesh decimation in Meshlab
In this case, the output is not usable - it's wildly asymmetric and is full of
self-intersections. Refining this topology would take just as long (if not
longer) as creating a model from scratch.
The most difficult option is to create your own mesh from scratch. This option
gives you full control over the design, and is what I chose for the SR-71 model.
My software of choice for this is
Blender
. Blender
has a steep learning curve, but the type of mesh design we're doing for this
project doesn't begin to scratch the surface of its full capabilities. I highly
recommend
this low-poly tutorial
if you've never used Blender before and need somewhere to start. Two things I
found very handy were the
mirror
modifier
to enforce symmetry, and the
3D Print
Toolbox
to auto-cleanup the mesh and check for manifoldness.
This process is very tedious. My advice here is: simplify your mesh to the point
where you feel uncomfortable. Recall that we're largely optimizing for ease of
assembly. When modeling, it's very tempting to capture finer details, but fine
details have costs (small parts, hard to glue regions, etc.) that are not worth
it during the assembly phase. Scrutinize every feature, and zoom out once in a
while. When you zoom out, your omissions won't feel as weird.
After many days, here's the initial mesh I created. It contains 732 triangles.
Note the symmetry along the y-axis.
Related goals:
Ease of assembly, minimal consumption of resources
Once we have a mesh, we have to convert it into a 2D template of parts that can
be printed and assembled. This process is called
unfolding
. Each of the
faces of our mesh are grouped into
parts
, and the arrangement of our parts
is a
layout, or template
.
To do this, we're going to turn to software again. The most popular unfolding
tool (and my favorite) is
Pepakura
Designer
. Pepakura is not free (at the
time of this writing, it's a one-time $70 purchase) and it only runs on
Windows. There's also
Unfolder for Mac
, which is
$30. If you can't use either of these, Blender can save the day again with its
free
Paper Model
plugin
.
I believe that the unfolding step is one that does not get as much attention as
it deserves. There is a noticeable difference between a good template and a bad
one. A good template has parts that make intuitive sense, with logical groupings
and clear flow. The faces themselves are grouped into parts that are easy to cut
out and handle. All of this equates to a better building experience, which means
a better looking model.
Part of unfolding is also deciding the scale of your model. You can make your
model as big or small as you want, but again, ease of assembly should be top of
mind when deciding. A model that's too small will end up with parts that are
hard to cut out and fold. Bigger models are easier to assemble, but you're
limited to the point where the faces of your model must fit on a page.
I ended up making this model 25 inches long. With the original SR-71 being about
107 feet long, this puts our model at around a 1:50 ratio.
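To spell out that arithmetic: 107 feet is 1,284 inches, and 1,284 / 25 ≈ 51, so the scale works out to roughly 1:51, which rounds to the 1:50 ballpark.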
Let's start off with the creation of parts. In most unfolding software, the
software will auto-unfold for you, and from there you can regroup faces into
whatever parts you want. Here's Pepakura's auto unfold:
Pepakura default unfolding produces complex parts
The parts it generated are pretty complicated, so we have some work to do.
If you have a mesh with n faces, you can have anywhere from 1 part (all the faces
in a single part) to n total parts (each part is a single face). We want our
model to be easy to assemble, and neither of these extremes is easy.
Rather than trying to fix the number of parts and going from there, I recommend
creating parts that are logical. Identify features that can be captured in a
single part, and go from there. For example, in the SR-71, each engine intake
spike makes sense as a single part. So does the nose cone.
If your mesh has an axis of symmetry, then your parts have symmetrical pairings
as well. The same feature on either side of the axis should be represented with
a mirrored part. In the SR-71, the entire plane is symmetrical on the vertical
axis, so all parts across this axis are mirrored. This is good because once
someone builds one side, they can more easily reason about the other side.
I ended up dividing this model into 42 parts. These parts were carefully divided
in such a way that I felt would make them easier to assemble. If you look at any
part in particular, chances are it'll have a symmetric counterpart.
Finalized parts for the layout
They're arranged pretty haphazardly right now, but we'll clean this up in the
next step.
Again, most software will automatically arrange the parts for you as part of
unfolding. Here's the 14-page arrangement Pepakura decided on for the parts I created:
Auto arrangement of parts by Pepakura
I highlighted all the parts on the first two pages so you can see where they are
on the finished model. Notice that they're scattered throughout different
sections. That's why I typically don't like auto-arrangement: it's designed
to minimize paper usage, but it often results in a less intuitive assembly
process. You can't look at any particular page and loosely know where its parts
will go.
A good part layout reads like a story.
Parts are arranged in a logical
order, with related parts grouped together. I like to arrange mine left to
right, top to bottom on a page. Here's my layout, with the first two pages
highlighted.
Manual arrangement of parts, which now has logical groupings
All the parts that are near each other in the layout are also near each other in
the final assembly. In this case, I was even able to reduce the page count
from the starting 14 down to 12.
Flaps
, or
tabs
, are the appendages on each part that allow for gluing
parts together. Each flap has a singular counterpart edge that it's glued to -
this is known as an
edge/flap pair
. Most software will auto-assign a shared
number between an edge and its flap to make identifying pairs easy during the
assembly process.
Two edge/flap pairs, with arrows pointing to their matching number IDs
For an edge/flap pair, most unfolding software will allow us to swap the flap
across parts. Doing this strategically is critical for creating an easy to
assemble model, and also has implications for the structural integrity of the
final build.
For example, consider the two example parts shown above. These two parts
meet at two shared edges, so they have two edge/flap pairs between them.
We could arrange the flaps so that one part has both of them:
Arranging both flaps on the same side
We could also interlace the flaps, so each part has one flap on each side.
Arranging flaps as interlaced, with one flap on each part
Interlacing flaps between parts can create a more stable structure, since
there's only one way for the parts to meet. If two flaps are on the same side,
they can over-extend when glued to the edge. That being said, same-side flaps
can be easier to work with, especially when reaching the closing stages of a
model.
In general, I like to use interlaced flaps wherever possible to create an
overall stronger model, and use same-side flaps selectively.
Once we have an arrangement we like, we can export our layout as a PDF.
For materials, we'll need:
65 lb (176 gsm) cardstock: This is the ideal paper weight for creating
sturdy models, while still being thin/flexible enough to pass through a normal
printer and be easy to fold.
Adhesive. My recommended adhesive is tacky glue: it's strong, dries clear, but
is forgiving enough to allow for repositioning during assembly. Specifically,
I use
Aleene's Original Tacky
Glue
.
I've also had past success with a glue stick.
We'll also need some tools, which I've listed in order of importance. The
ones with asterisks are essential. Everything else is a nice-to-have.
Printer*
: You'll need access to a printer to print the template on the
cardstock. Laser printers are great because the prints don't smudge.
Cutting tools*
: You'll need a pair of scissors or a craft knife to cut
out the parts. Use sharp tools for clean cuts - it makes a difference.
Ruler*
: Cutting/scoring perfectly straight lines is a must. Steel rulers
are great for their consistent edge, and they don't catch against your tools.
That being said, I used a clear plastic ruler for this model. Being able to
see through the ruler helps with alignment.
Scoring tool*
: This will help you prepare a part for folding. You can use
a bone folder or scoring wheel. I use an embossing tool I found at a dollar
store, but before that, I used a ballpoint pen that ran out of ink. Anything
with a precise (but not too sharp) tip will do.
Toothpicks
: I use toothpicks to spread blobs of glue into thin layers and
get into tight spaces.
Assembly surface
: A cutting mat or piece of cardboard will protect your
work surface and give you a stable surface to cut/score your parts.
Tweezers
: Tweezers are helpful for handling small parts and getting into
tight spaces, especially while holding parts together as glue dries.
If you want to get fancy, you can also purchase an automatic cutting machine,
like a
Cricut
or
Silhouette
. These machines can precisely
cut/score your parts from cardstock. Getting the template into their software
takes some extra effort, but it results in the best quality parts. I did not use
a machine for this project.
To match the real SR-71, I printed my template on black cardstock. Darker
cardstocks are harder to work with because of the low contrast between the ink
and the paper itself. If you're new to the hobby, I would recommend starting
with a lighter color.
Cutting:
Cutting the parts out from the paper with your cutting tool of
choice. Scissors are quicker, but the combination of ruler and craft knife
results in cleaner cuts.
Scoring:
Running a scoring tool over fold lines to get cleaner folds. This
may be tempting to skip, but I cannot emphasize the importance of this step
enough. Scoring is especially important when dealing with thicker paper.
Folding:
Folding the parts in prep for gluing. There are only two types of
folds: mountain folds and valley folds.
Gluing:
Gluing the parts together.
How you decide to batch these steps is up to you. For example, you could cut all
the parts out at once, then score all of them, etc. This approach is effective
because you can develop a rhythm by doing each phase only once, so you're not
constantly switching between tools; the downside is that you only get to start
assembly after a pretty lengthy process. Alternatively, you can do it per part:
cut one part out, score it, fold it, and secure it to the assembly. Here, the
pros and cons are flipped: you get to see the model come together quicker, but
there's a lot of context switching between phases. I've tried both of these
approaches, and find that the latter results in a non-negligible increase in the
assembly time of the model.
To strike a balance, the approach I took for this model was performing the
phases at the granularity of sections (engines, wings, fuselage, etc.) of the
model. This approach has the added final step of assembling all the standalone
sections together into the final model.
Here's some pictures I took during the assembly process. In total, assembly took
6-8 hours.
A part cut from the template — this part is for one of the elevons.
The same part from above, but now scored. Note the visible impressions on the fold lines.
All the parts to make one of the engine/wings, cut and scored.
Beginning to assemble the engine (see completed inlet spike).
Both engines and partial wings, fully assembled.
The assembled nose cone and cockpit.
Bottom view of assembling the engines to the main fuselage.
Top view of assembling the engines to the main fuselage.
Use little glue:
When gluing parts together, apply as little glue as
possible. Using too much glue will result in spillover when the flaps/edges are
put together, and this spillover is hard to wipe away from a porous surface like
paper. Too much glue can even result in subtle paper warping. In the recommended
tools, I suggested a toothpick. I apply a small bead of glue to a flap and use
the toothpick to spread it into a thin film. This prevents any spillage and
keeps the model clean.
Start in complex areas:
As you progress further in gluing parts together,
the degrees of freedom of your model will reduce. This is why I recommend
starting with more complicated areas of your model where you'll need those
degrees of freedom. In this model, this meant starting with precise features,
like the engine inlet spikes or the vertical stabilizers.
Finish in hidden areas:
This goes hand in hand with the tip above. As you
reach the end of your model, gluing the final parts together can be
very
hard, which means the final edges may come out a bit sloppy. Why does this
happen? Any minor imperfections we made throughout the assembly process result
in stresses in our model that will be felt at the end. Gluing the last part may
be challenging because it'll feel misaligned, and it has the added challenge of
attempting to close a 3D object from the outside. That's why I always recommend
choosing an assembly order that results in the last parts being glued in an area
that is out of sight. For the SR-71, that happens to be the underside of the
fuselage.
No matter how much you scrutinize the modeling and layout phases, you will
inevitably find areas for improvement as you assemble. In the case of the SR-71,
I spotted a few minor asymmetries in part tabs, and more importantly, an
opportunity to reduce face count by simplifying the topology of the bottom of
the plane and the nose cone.
I took my mesh back into Blender, and was able to get the triangle count down to
636, which is almost a full 100 faces fewer than the original mesh.
Second mesh iteration in Blender.
Below, you can see the old mesh (left) next to the new mesh (right). It's hard
to tell the difference, yet the new one has almost 15% fewer faces.
First mesh iteration (left) vs. latest mesh iteration (right).
A faster way to iterate is to render the model rather than physically building
it. This allows you to quickly identify and fix visual issues without going
through the hours of assembly. Here's some renders (in Blender) of the final
iteration:
In total, the full cycle of designing the mesh, creating the parts layout,
assembly, and subsequent refinement iterations occurred over the course of a few
months. The process is long, but the results are well worth it.
If you're interested in making this model yourself, you can download the PDFs for
the first iteration of the model below. I've included a template for the stand as well.
The first three things you’ll want during a cyberattack
Bleeping Computer
www.bleepingcomputer.com
2025-09-12 15:02:12
When cyberattacks hit, every second counts. Survival depends on three essentials: clarity to see what's happening, control to contain it, and a lifeline to recover fast. Learn from Acronis TRU how MSPs and IT teams can prepare now for the difference between recovery and catastrophe. [...]...
The moment a cyberattack strikes, the clock starts ticking. Files lock up, systems stall, phones light up and the pressure skyrockets. Every second counts. What happens next can mean the difference between recovery and catastrophe.
In that moment, you need three things above all else: clarity, control and a lifeline. Without them, even the most experienced IT team or managed service provider (MSP) can feel paralyzed by confusion as damage escalates. But with clarity, control and a lifeline, you can move decisively, protect your clients and minimize fallout from the attack.
Learn now how to develop these three critical elements every MSP and IT team should have ready before a breach. Because when chaos strikes, preparation can make the difference between a manageable event and absolute disaster.
1. Clarity: Knowing what’s happening, fast
The first wave of panic in a cyberattack comes from uncertainty. Is it ransomware? A phishing campaign? Insider misuse? Which systems are compromised? Which are still safe?
Without clarity, you’re guessing. And in cybersecurity, guesswork can waste precious time or make the situation worse.
That’s why real-time visibility is the first thing you’ll want when an attack hits. You need solutions and processes that enable you to:
Provide a single, accurate picture
, a unified view of events instead of scattered alerts across different dashboards.
Identify the blast radius
to determine which data, users and systems are affected, as well as how far the attack has spread.
Clarity transforms chaos into a manageable situation. With the right insights, you can quickly decide: What do we isolate? What do we preserve? What do we shut down right now?
The MSPs and IT teams that weather attacks best are the ones who can answer those questions without delays.
2. Control: Stopping the spread
Once you know what’s happening, the next critical need is control. Cyberattacks are designed to spread through lateral movement, privilege escalation and data exfiltration. If you can’t contain an attack quickly, the cost multiplies.
Control means having the ability to:
Isolate compromised endpoints instantly
by cutting them off from the network to stop ransomware or malware from spreading further.
Revoke access rights
on demand to shut credentials down in case attackers have exploited them.
Enforce policies automatically
, from blocking suspicious processes to halting unauthorized file transfers.
Think of it like firefighting: Clarity tells you where the flames are, but control enables you to prevent the blaze from consuming the entire building.
This is also where effective incident response plans matter. It’s not enough to have the tools; you need predefined roles, playbooks and escalation paths so your team knows exactly how to assert control under pressure.
Another essential in this scenario is having a technology stack with integrated solutions that are easy to manage. Running from one system to another during an attack is not only dangerous but also highly inefficient.
3. Lifeline: Bringing everything back
Even with visibility and containment, cyberattacks can leave damage behind. They can encrypt data and knock systems offline. Panicked clients demand answers. At this stage, what you’ll want most is a lifeline you can trust to bring everything back and get the organization up and running again.
That lifeline is your backup and recovery solution. But it has to meet the urgency of a live attack with:
Immutable backups
so ransomware can’t tamper with your recovery data.
Granular restore options
to bring back not just full systems but also critical files and applications in minutes.
Orchestrated disaster recovery
to spin up entire workloads in a secure environment while you remediate.
The best defense is knowing that, no matter how bad the attack, you can get operations back up and running quickly. This assurance restores both systems and trust.
For MSPs, recovery is the lifeline that keeps customers loyal after a breach. For internal IT teams, it’s what keeps business operations from grinding to a halt.
Preparation is everything
Cyberattacks are “when” events, not “if.” And when they happen, you don’t have time to improvise. You’ll need clarity, control and a lifeline already in place and ready to execute.
That means investing in advanced monitoring and detection capabilities, building proven incident response playbooks and deploying a backup and recovery platform purpose-built for resilience.
The truth is that no organization can prevent every attack, but every organization can prepare for one. In the face of cyberthreats, preparation is the single greatest differentiator between recovery and catastrophe.
The TRU team researches emerging threats, provides security insights, and supports IT teams with guidelines, incident response and educational workshops.
Think twice before abandoning X11. Wayland breaks everything!
Wayland breaks everything! It is binary incompatible, provides no clear transition path with 1:1 replacements for everything in X11, and is even philosophically incompatible with X11. Hence, if you are interested in existing applications to "just work" without the need for adjustments, then you may be better off avoiding Wayland.
Wayland solves no issues I have but breaks almost everything I need.
Even the most basic, most simple things (like
xkill
) -
in this case with no obvious replacement
. And usually it stays broken, because the Wayland folks mostly seem to care about Automotive, Gnome, maybe KDE - and alienating everyone else (e.g., people using just an X11 window manager or something like GNUstep) in the process.
Feature comparison
Please do fact-check and suggest corrections/improvements below. Maybe this table should find its home in a Wiki, so that everyone could easily collaborate. I'm just a bit fearful of vandalism... ideas?
✅ Supported
⚠️ Available with limitations
❌ Not available or only available on some systems (requires particular compositors or additional software which may not be present on every system)
Programmatic input injection / remote control: Wayland ❌ Not natively available; requires the Remote Desktop Portal, which may not be present on every system (libei GH, KDE Input). Workaround: /dev/uinput should work everywhere.
Drag and drop: Wayland ⚠️ wl_data_offer, wl_data_device_manager (Wayland Protos, KDE Drag&Drop), but implementations are flaky, especially when dragging between X11 and Wayland applications.
Display/output configuration: Wayland ❌ Only the compositor can set the layout; clients have no access (KDE Dev). Supported by some compositors, which may not be present on every system, via wlr-output-management and associated tools like wlr-randr.
Global menus: X11 ✅ Works | Wayland ❌ Not natively available; requires qt_extended_surface set_generic_property, which may not be present on every system.
Tear-free rendering: Wayland ✅ Guaranteed by compositor; always tear-free (Wayland FAQ).
Security / App Isolation: X11 ⚠️ Via extensions, e.g., Xnamespace extension (The Register) | Wayland ⚠️ Wayland tries to separate applications from each other. As a result, applications can't do many things ("We're treated like hostile threat actors on our own workstations").
Click into a window to terminate the application: X11 ✅ xkill | Wayland ❌ Not natively available; some compositors may have proprietary mechanisms, which may not be present on every system.
Click into a window to see its metadata: X11 ✅ xprop | Wayland ❌ Not supported.
Set and get metadata (properties) on windows to exchange information regarding windows
One window server used by virtually all desktop environments and distributions: X11 ✅ Xorg (and Xlibre) | Wayland ❌ Every desktop environment comes with a different compositor, which behaves differently, supports different features and has different bugs.
Status update
Update 06/2025: X11 is alive and well, despite what Red Hat wants you to believe.
https://github.com/X11Libre/xserver
revitalizes the Xorg X11 server as a community project under new leadership.
For the record, even in the latest Raspberry Pi OS you
still
can't drag a file from inside a zip file onto the desktop for it to be extracted. So drag-and-drop is still broken for me.
And Qt
move()
on a window
still
doesn't work like it does on all other desktop platforms (and the Wayland folks think that is
good
).
And global menus
still
don't work (outside of not universally implemented things like
qt_extended_surface
set_generic_property
).
Wayland issues
The Wayland project seems to operate like they were starting a
greenfield project
, whereas at the same time they try to position Wayland as "the X11 successor", which would clearly require a lot of thought about not breaking, or at least providing a smooth upgrade path for, existing software.
In fact, it is merely an incompatible alternative, and not even one that has (nor wants to have) feature parity (
missing features
). And unlike X11 (the X
Window
System), Wayland protocol designers actively avoid the concept of "windows" (making up incomprehensible words like "
xdg_toplevel
" instead).
DO NOT USE A WAYLAND SESSION!
Let Wayland not destroy everything and then have other people fix the damage it caused. Or force more Red Hat/Gnome components (glib, Portals, Pipewire) on everyone!
Please add more examples to the list.
Wayland seems to be made by people who do not care for existing software. They assume everyone is happy to either rewrite everything or to just use Gnome on Linux (rather than, say, twm with ROX Filer on NetBSD).
Edit:
When I wrote the above, I didn't really realize what Wayland even
was
, I just noticed that some distributions (like Fedora) started pushing it onto me and things didn't work properly there. Today I realize that you can't "install Wayland", because unlike Xorg, there is not one "Wayland display server" but actually every desktop environment has its own. And maybe "the Wayland folks" don't "only care about Gnome", but then, any fix that is done in Gnome's Wayland implementation isn't automatically going to benefit
all
users of Wayland-based software, and possibly isn't even the implementation "the Wayland folks" would necessarily recommend.
Edit 12/2023:
If something wants to replace X11 for desktop computers (such as professional Unix workstations), then it better support all needed features (and key concepts, like
windows
) for that use case. That people also have displays on their fridge doesn't matter the least bit in that context of discussion. Let's propose
the missing Wayland protocols
for full X11 feature parity.
A crash in the window manager takes down all running applications
You cannot run applications as root
You cannot do a lot of things that you can do in Xorg
by design
There is not one
/usr/bin/wayland
display server application that is desktop environment agnostic and is used by
everyone
(unlike with Xorg)
It offloads a lot of work to each and every window manager. As a result, the same basic features get implemented differently in different window managers, with different behaviors and bugs - so what works on desktop environment A does not necessarily work in desktop environment B (e.g., often you hear that something "works in Wayland", even though it only really works on Gnome and KDE, not in all Wayland implementations). This summarizes it very well:
https://gitlab.freedesktop.org/wayland/wayland/-/issues/233
Apparently the Wayland project doesn't even
want
to be "X.org 2.0", and doesn't want to provide a commonly used implementation of a compositor that could be used by everyone:
https://gitlab.freedesktop.org/wayland/wayland/-/issues/233
. Yet this would imho be required if they want to make it into a worthwhile "successor" that would have any chance of ever fixing the many Wayland issues at the core.
Wayland breaks screen recording applications
MaartenBaert/ssr#431
❌ broken since 24 Jan 2016, no resolution ("I guess they use a non-standard GNOME interface for this")
https://github.com/mhsabbagh/green-recorder
❌ ("I am no longer interested in working with things like ffmpeg/wayland/GNOME's screencaster or solving the issues related to them or why they don't work")
vkohaupt/vokoscreenNG#51
❌ broken since at least 7 Mar 2020. ("I have now decided that there will be no Wayland support for the time being. Reason, there is no budget for it. Let's see how it looks in a year or two.") -
This is the key problem. Wayland breaks everything and then expects others to fix the wreckage it caused on their own expense.
obsproject/obs-studio#2471
❌ broken since at least 7 Mar 2020. ("Wayland is unsupported at this time", "There isn't really something that can just be easily changed. Wayland provides no capture APIs")
There is a workaround for OBS Studio that requires an
obs-xdg-portal
plugin (which is known to be Red Hat/Flatpak-centric, GNOME-centric,
"perhaps"
works with other desktops)
phw/peek#1191
❌ broken since 14 Jan 2023. Peek, a screen recording tool, has been abandoned by its developer due to a number of technical challenges, mostly with Gtk and Wayland ("Many of these have to do with how Wayland changed the way applications are being handled")
As of February 2024, screen recording is
still broken utterly
on Wayland with the vast majority of tools.
Proof
Workaround:
Find a Wayland compositor that supports the
wlr-screencopy-unstable-v1
protocol and use
wf-recorder -a
. The default compositor in Raspberry Pi OS (Wayfire) does, but the default compositor in Ubuntu doesn't. (That's the worst part of Wayland: Unlike with Xorg, it always depends on the particular Wayland compositor what works and what is broken. Is there even one that supports
everything
?)
jitsi/jitsi-meet#6389
❌ broken since 24 Jan 2016 ("Closing since there is nothing we can do from the Jitsi Meet side.")
See? Wayland breaks stuff and leaves application developers helpless and unable to fix the breakage, even if they wanted.
NOTE:
As of November 2023, screen sharing in Chromium using Jitsi Meet is
still utterly broken
, both in Raspberry Pi OS Desktop, and in a KDE Plasma installation, albeit with different behavior. Note that Pipewire, Portals and whatnot are installed, and even with them it does not work.
flathub/us.zoom.Zoom#22
Zoom ❌ broken since at least 4 Jan 2019. ("Can not start share, we only support wayland on GNOME with Ubuntu (17, 18), Fedora (25 to 29), Debian 9, openSUSE Leap 15, Arch Linux").
No word about non-GNOME!
probonopd/Zoom.AppImage#8
❌ broken since 1 Oct 2022 Zoom: "You need PulseAudio 1.0 and above to support audio share" on a system that uses pipewire-pulseaudio
Wayland breaks automation software
sudo pkg install py37-autokey
This is an X11 application, and as such will not function 100% on
distributions that default to using Wayland instead of Xorg.
Wayland breaks Gnome-Global-AppMenu (global menus for Gnome)
https://bugs.kde.org/show_bug.cgi?id=424485
❌ Still broken as of late 2023 ("I am also still not seeing GTK global menus on wayland. They appear on the applications themselves, wasting a lot of space.")
Wayland broke global menus with KDE platformplugin
https://blog.broulik.de/2016/10/global-menus-returning/
("it uses global window IDs, which don’t exist in a Wayland world... no global menu on Wayland, I thought, not without significant re-engineering effort"). KDE had to do additional work to work around it. And it still did not work:
Wayland breaks global menus with non-KDE Qt platformplugins
https://blog.broulik.de/2016/10/global-menus-returning/
❌ broke non-KDE platformplugins. As a result, global menus now need
_KDE_NET_WM_APPMENU_OBJECT_PATH
which only the KDE platformplugin sets, leaving everyone else in the dark
Wayland breaks AppImages that don't ship a special Wayland Qt plugin
https://blog.martin-graesslin.com/blog/2018/03/unsetting-qt_qpa_platform-environment-variable-by-default/
❌ broke AppImages that don't ship a special Wayland Qt plugin. "This affects proprietary applications, FLOSS applications bundled as appimages, FLOSS applications bundled as flatpaks and not distributed by KDE and even the Qt installer itself. In my opinion this is a showstopper for running a Wayland session." However, there is a workaround: "AppImages which ship just the XCB plugin will automatically fallback to running in xwayland mode" (see below).
Wayland does not work properly on NVidia hardware?
Apparently Wayland relies on nouveau drivers for NVidia hardware. The nouveau driver has been giving unsatisfactory performance since its inception. Even clicking on the application starter icon in Gnome results in a stuttery animation. Only the proprietary NVidia driver results in full performance.
See below.
Update 2024: The situation might slowly be improving. It remains to be seen whether this will work well also for all existing old Nvidia hardware (that works well in Xorg).
Wayland prevents GUI applications from running as root
https://bugzilla.redhat.com/show_bug.cgi?id=1274451
❌ broken since 22 Oct 2015 ("No this will only fix sudo for X11 applications. Running GUI code as root is still a bad idea." I absolutely detest it when software tries to prevent me from doing what some developer thinks is "a bad idea" but did not consider my use case, e.g., running
truss
for debugging on FreeBSD needs to run the application as root.
https://bugzilla.mozilla.org/show_bug.cgi?id=1323302
suggests it is not possible: "These sorts of security considerations are very much the way that "the Linux desktop" is going these days".)
https://blog.netbsd.org/tnf/entry/wayland_on_netbsd_trials_and
❌ broken since 28 Sep 2020 ("Wayland is written with the assumption of Linux to the extent that every client application tends to #include <linux/input.h> because Wayland's designers didn't see the need to define a OS-neutral way to get mouse button IDs. (...) In general, Wayland is moving away from the modularity, portability, and standardization of the X server. (...) I've decided to take a break from this, since it's a fairly huge undertaking and uphill battle. Right now, X11 combined with a compositor like picom or xcompmgr is the more mature option."
https://blog.martin-graesslin.com/blog/2018/01/server-side-decorations-and-wayland/
❌ FUD since at least 27 January 2018 ("I heard that GNOME is currently trying to lobby for all applications implementing client-side decorations. One of the arguments seems to be that CSD is a must on Wayland. " ... "I’m burnt from it and are not interested in it any more.") Server-side window decorations are what make the title bar and buttons of
all
windows on a system
consistent
. They are a must-have for a consistent system, so that applications written in, e.g., Gtk will not look entirely alien on, e.g., a Qt-based desktop, and to enforce that developers cannot place random controls into window titles where they do not belong. Client-side decorations, on the other hand, are destroying uniformity and consistency, put additional burden on application and toolkit developers, and allow e.g., GNOME developers to put random controls (that do not belong there) into window titles (like buttons), hence making it more difficult to achieve a uniform look and feel for all applications regardless of the toolkit being used.
Red Hat employee Matthias Clasen ("I work at the Red Hat Desktop team... I am actually a manager there... the people who do the actual work work for me") explicitly stated "Client-side everything" as a principle, even though the protocol doesn't enforce it: "Fonts, Rendering, Nested Windows,
Decorations
. "It also gives the design more freedom to use the titlebar space, which is something our designers appreciate" (sic).
Source
Wayland breaks windows raising/activating themselves
Apparently Wayland (at least as implemented in KWin) does not respect EWMH protocols, and breaks other command line tools like wmctrl, xrandr, xprop, etc. Please see the discussion below for details.
Wayland requires JWM, TWM, XDM, IceWM,... to reimplement Xorg-like functionality
Querying of the mouse position, keyboard LED state, active window position or name, moving windows (xdotool, wmctrl)
Global shortcuts
System tray
Input Method support/editor (IME)
Graphical settings management (i.e. tools like xrandr)
Fast user switching/multiple graphical sessions
Session configuration including but not limited to 1) input devices 2) monitors configuration including refresh rate / resolution / scaling / rotation and power saving 3) global shortcuts
HDR/deep color support
VRR (variable refresh rate)
Disabling input devices (xinput alternative)
As it currently stands, minor WMs and DEs do not even intend to support Wayland given the sheer complexity of writing all the code required to support the above features. You do not expect JWM, TWM, XDM or even IceWM developers to implement all the features outlined above.
electron/electron#33226
("skipTaskbar has no effect on Wayland. Currently Electron uses
_NET_WM_STATE_SKIP_TASKBAR
to tell the WM to hide an app from the taskbar, and this works fine on X11 but there's no equivalent mechanism in Wayland." Workarounds are only available for
some
desktops including GNOME and KDE Plasma.) ❌ broken since March 10, 2022
Wayland breaks xclip
xclip
is a command line utility that is designed to run on any system with an X11 implementation. It provides an interface to X selections ("the clipboard"). Apparently Wayland isn't compatible to the X11 clipboard either.
AppImage/AppImageKit#1221 (comment)
❌ broken since 2022-20-15 ("Espanso built for X11 will not work on Wayland due to
xclip
. Wayland asks for
wl-copy
")
This is another example that the Wayland requires everyone to change components and take on additional work just because Wayland is incompatible to what we had working for all those years.
X11 atoms can be used to store information on windows. For example, a file manager might store the path that the window represents in an X11 atom, so that it (and other applications) can know for which paths there are open file manager windows. Wayland is not compatible to X11 atoms, resulting in all software that relies on them to be broken until specifically ported to Wayland (which, in the case of legacy software, may well be never).
Games are developed for X11. And if you run a game on Wayland, performance is subpar due to things like forced vsync. Only recently,
some
Wayland implementations (like KDE KWin) let you disable that.
Wayland breaks xdotool
(Details to be added; apparently no 1:1 drop-in replacement available?)
Wayland breaks xkill
xkill
(which I use on a regular basis) does not work with Wayland applications.
Other platforms (Windows, Mac, other desktop environments) can set the window position on the screen, so all cross-platform toolkits and applications expect to do the same on Wayland, but Wayland can't (doesn't want to) do it.
PCSX2/pcsx2#10179
PCX2 (Playstation 2 Emulator) ❌ broken since 2023-10-25 ("Disables Wayland, it's super broken/buggy in basically every scenario. KDE isn't too buggy, GNOME is a complete disaster.")
Wayland might allow the
compositor
(not: the application) to set window positions, but that means that as an application author, I can't do anything but
wait
for KDE to implement
https://bugs.kde.org/show_bug.cgi?id=15329
- and even then, it will
only
work under KDE, not Gnome or elsewhere. Big step backward compared to X11!
Wayland breaks color management
Apparently color management as of 2023 (well over a decade of Wayland development) is still in the early "thinking" stage, all the while Wayland is already being pushed on people as if it was a "X11 successor".
According to Valve, "DRM leasing is the process which allows SteamVR to take control of your VR headset's display in order to present low-latency VR content".
Qt
setWindowIcon
has no effect on KDE/Wayland
https://bugreports.qt.io/browse/QTBUG-101427
("Resolution: Out of scope", meaning it cannot be fixed in Qt, "Using
QT_QPA_PLATFORM=xcb
works, though", meaning that if you disable Wayland then it works) ❌ broken since 2022-03-03
LibrePCB developer: "Btw it's just one of several problems we have with Wayland, therefore we still enforce LibrePCB to run with Xwayland. It's a shame, but
I feel totally helpless against such decisions made by Wayland
and using XWayland is the only reasonable option (which even works perfectly fine)..."
https://gitlab.freedesktop.org/wayland/wayland-protocols/-/issues/52#note_2155885
Update 6/2024:
Looks like this will get unbroken thanks to
xdg_toplevel_icon_manager_v1
, so that
QWindow::setIcon
will work again
.
If
, and that's a big if, all compositors will support it. At least
KDE
is on it.
Wayland breaks drag and drop
On Raspberry Pi OS (which is running a Wayland session by default), dragging a file out of a zip file onto the desktop fails with "XDirectSave failed."
Many window managers have a
--replace
argument, but Wayland compositors break this convention.
Xpra is an open-source multi-platform persistent remote display server and client for forwarding applications and desktop screens.
Under Xpra a context menu cannot be used: it opens and closes automatically before you can even move the mouse on it. "It's not just GDK, it's the Wayland itself. They decided to break existing applications and expect them to change how they work." (
Xpra-org/xpra#4246
) ❌ broken since 2024-06-01
Wayland breaks multi desktop docks
"Unfortunately Wayland is not designed to support multi desktop dock projects. This is why each DE using Wayland is building their own custom docks. Plus there is a lot of complexity to support Wayland based apps and also merge that data with apps running in Xwayland. A dock isn't useful unless it knows about every window and app running on the system."
zquestz/plank-reloaded#70
❌ broken since 2025-06-10
Users:
Refuse to use Wayland sessions. Uninstall desktop environments/Linux distributions that only ship Wayland sessions. Avoid Wayland-only applications (such as
PreSonus Studio One
) (potential workaround: run in
https://github.com/cage-kiosk/cage
)
Application developers:
Enforce running applications on X11/XWayland (like LibrePCB does as of 11/2023)
Examples of Wayland being forced on users
This is exactly the kind of behavior this gist seeks to prevent.
Alex Karp Insists Palantir Doesn’t Spy on Americans. Here’s What He’s Not Saying.
Intercept
theintercept.com
2025-09-12 15:00:00
Documents from Edward Snowden published by The Intercept in 2017 show the NSA’s use of Palantir technology.
The post Alex Karp Insists Palantir Doesn’t Spy on Americans. Here’s What He’s Not Saying. appeared first on The Intercept....
In an exchange
this week on “All-In Podcast,” Alex Karp was on the defensive. The Palantir CEO used the appearance to downplay and deny the notion that his company would engage in rights-violating surveillance work.
“We are the single worst technology to use to abuse civil liberties, which is by the way the reason why we could never get the NSA or the FBI to actually buy our product,” Karp
said
.
What he didn’t mention was the fact that a tranche of classified documents revealed by Edward Snowden and The Intercept in 2017 showed how Palantir software
helped
the National Security Agency and its allies spy on the entire planet.
Palantir has attracted increased scrutiny as the pace of its business with the federal government has
surged
during the second Trump administration. In May, the New York Times
reported
Palantir would play a central role in a White House plan to boost data sharing between federal agencies, “raising questions over whether he might compile a master list of personal information on Americans that could give him untold surveillance power.” Karp immediately rejected that report in a
June interview on CNBC
as “ridiculous shit,” adding that “if you wanted to use the deep state to unlawfully surveil people, the last platform on the world you would pick is Palantir.”
Karp made the same argument in this week’s podcast appearance, after “All-In” co-host David Sacks — the Trump administration AI and cryptocurrency czar — pressed him on matters of privacy, surveillance, and civil liberties. “One of the criticisms or concerns that I hear on the right or from civil libertarians is that Palantir has a large-scale data collection program on American citizens,” Sacks said.
Karp replied by alleging that he had been approached by a Democratic presidential administration and asked to build a database of Muslims. “We’ve never done anything like this. I’ve never done anything like this,” Karp said, arguing that safeguards built into Palantir would make it undesirable for signals intelligence. That’s when he said the company’s refusal to abuse civil liberties is “the reason why we could never get the NSA or the FBI to actually buy our product.”
Karp later stated: “To your questions, no, we are not surveilling,” taking a beat
before adding
, “uh, U.S. citizens.”
In 2017, The Intercept published documents originally provided by Snowden, a
whistleblower
and former NSA contractor, demonstrating how Palantir software was used in conjunction with a signals intelligence tool codenamed XKEYSCORE, one of the most explosive revelations from the NSA whistleblower’s 2013 disclosures. XKEYSCORE provided the NSA and its foreign partners with a means of
easily searching through immense troves of data and metadata
covertly siphoned across the entire global internet, from emails and Facebook messages to webcam footage and web browsing. A
2008 NSA presentation
describes how XKEYSCORE could be used to detect “Someone whose language is out of place for the region they are in,” “Someone who is using encryption,” or “Someone searching the web for suspicious stuff.”
Later in 2017,
BuzzFeed News reported
Palantir’s working relationship with the NSA had ceased two years prior, citing an internal presentation delivered by Karp. Palantir did not provide comment for either The Intercept’s or BuzzFeed News’ reporting on its NSA work.
The Snowden documents describe how intelligence data queried through XKEYSCORE could be imported straight into Palantir software for further analysis. One document mentions use of Palantir tools in “Mastering The Internet,” a joint NSA/GCHQ mass surveillance
initiative
that included pulling data directly from the global fiber optic cable network that underpins the internet. References inside HTML files from the NSA’s Intellipedia, an in-house reference index, included multiple nods to the company, such as “Palantir Classification Helper,” “[Target Knowledge Base] to Palantir PXML,” and “PalantirAuthService.”
And although Karp scoffed at the idea that Palantir software would be suitable for “deep state” usage, a British intelligence document note also published by The Intercept
quotes
GCHQ saying the company’s tools were developed “through [an] iterative collaboration between Palantir computer scientists and analysts from various intelligence agencies over the course of nearly three years.”
Karp’s carefully worded clarification that Palantir doesn’t participate in the surveillance of Americans specifically would have been difficult if not impossible for the company to establish with any certainty. From the moment of its disclosure, XKEYSCORE presented immense privacy and civil liberties threats, both to Americans and noncitizens alike. But in the United States, much of the debate centered around the question of how much data on U.S. citizens is ingested — intentionally or otherwise — by the NSA’s globe-spanning surveillance capabilities.
Even without the NSA directly targeting Americans, their online speech and other activity is swept up during the agency’s efforts to spy on foreigners: say, if a U.S. citizen were to email a noncitizen who is later targeted by the agency. Even if the public takes the NSA at its word that it does not deliberately collect and process information on Americans through tools like XKEYSCORE, it claims the legal authority under Section 702 of the Foreign Intelligence Surveillance Act to subsequently
share
such data it “incidentally” collects with other U.S. agencies,
including the FBI
.
The legality of such collection remains
contested
. Legal loopholes
created in the name of counterterrorism
and national security leave large gaps through which the NSA and its partner agencies can effectively bypass legal protections against spying on Americans and the 4th Amendment’s guarantee against warrantless searches.
A 2014
report
by The Guardian on the collection of webcam footage explained that GCHQ, the U.K.’s equivalent of the NSA, “does not have the technical means to make sure no images of UK or US citizens are collected and stored by the system, and there are no restrictions under UK law to prevent Americans’ images being accessed by British analysts without an individual warrant.” The report notes “Webcam information was fed into NSA’s XKeyscore search tool.”
In 2021, the federal Privacy and Civil Liberties Oversight Board concluded a five-year investigation into XKEYSCORE. In declassified remarks
reported by the Washington Post
, Travis LeBlanc, a board member who took part in the inquiry, said the NSA’s analysis justifying XKEYSCORE’s legality “lacks any consideration of recent relevant Fourth Amendment case law on electronic surveillance that one would expect to be considered.”
“The former Board majority failed to ask critical questions like how much the program costs financially to operate, how many U.S. persons have been impacted by KEYSCORE,” his statement continued. “While inadvertently or incidentally intercepted communications of U.S. persons is a casualty of modern signals intelligence, the mere inadvertent or incidental collection of those communications does not strip affected U.S. persons of their constitutional or other legal rights.”
Palantir did not respond when asked by The Intercept about the discrepancy between its CEO’s public remarks and its documented history helping spy agencies at home and abroad use what the NSA once
described
as its “widest reaching” tool.
Security updates for Friday
Linux Weekly News
lwn.net
2025-09-12 14:54:10
Security updates have been issued by Debian (cups, imagemagick, libcpanel-json-xs-perl, and libjson-xs-perl), Fedora (checkpointctl, chromium, civetweb, glycin, kernel, libssh, ruff, rust-secret-service, snapshot, and uv), Mageia (curl), Red Hat (kernel), SUSE (cups, curl, perl-Cpanel-JSON-XS, regio...
Now, NRK and
Dossier Center
can reveal how extensive and global the fraud was: over 100 ships have sailed with illegitimate insurance documents from Ro Marine.
“It's very serious and unusual that such a serious fraud happens with the help of a Norwegian company. At worst, it could undermine the trust in the Norwegian maritime industry,” says Thomas Angell Bergh from the Norwegian Maritime Authority.
Thomas Angell Bergh
Most of the fake insurance papers were for ships transporting goods out of Russia, mainly oil.
NRK has contacted dozens of Ro Marine’s clients. Only a few were willing to speak with us.
One of the customers says he was scammed by Ro Marine because he believed the insurance policies he purchased were valid.
Another says that "everyone" knows Ro Marine is fake, but that the ships need Western insurance documents. Such documents can make it easier to sail freely, without Western countries interfering in the transport.
This signature appears time and again on documents issued by Ro Marine.
The search for the owner of the signature led us to a Russian website, where you can download many different signatures. The one used by Ro Marine belongs to a "doctor," says the website.
Behind the global hoax is this man.
His name is Andrey Mochalin, a Russian citizen and resident of St. Petersburg. Mochalin has experience working for a reputable Norwegian insurance company.
In March, he, two Norwegians and a Bulgarian were charged with forging documents and operating an insurance business without a permit. Mochalin is also being investigated for violating international sanctions.
Through an attorney, the Norwegians say they do not understand the charges (see their full response in the fact box later in the article). The Bulgarian tells NRK he is innocent.
For several months, NRK and Dossier Center have tried unsuccessfully to get in touch with Mochalin. At the same time, we investigated Ro Marine's operations. What we found surprised experts.
“Among the worst of the worst”
Sanctions expert David Tannenbaum is shocked by the scale of it. He knows how far Russia is willing to go to protect its oil exports, which are crucial for funding Putin’s illegal war in Ukraine.
Sanctions against Russia can make it difficult for tankers carrying Russian oil to obtain insurance that is approved in the West. But they are finding ways around it.
This is where Ro Marine enters the picture.
“Seems like Ro Marine is popular with sanctions evaders. You don't have this roster by accident,” says David Tannenbaum from Deep Blue Intelligence. The American company specializes in detecting sanctions evasion.
For Tannenbaum, it appears that Ro Marine primarily serves the
shadow fleet
or ships engaged in illegal activities or sanctions evasion.
“Is Ro Marine the worst of the worst? I think they're definitely in contention,” he both asks and answers.
Our documentation shows, for example, that Gatik, known as one of the largest players in the Russian shadow fleet, appears to have placed almost all of their ships with Ro Marine.
In addition, six ships linked to the Russian gas giant Novatek have had fake insurance from the Norwegian company. All six have sailed along the Norwegian coast towards the gas facility Arctic LNG2 in Russia, which is sanctioned by the USA. The ships are sanctioned by the EU.
Ships linked to the sanctioned Iranian oil industry and Iranian military have also been customers of Ro Marine.
Dangerous cargo from Russia
Among the cargo ships that have purchased invalid insurance from Ro Marine, we found "
Agattu
". Here, the vessel is sailing between Denmark and Sweden with explosives in the cargo, bound for Algeria.
Drone footage: Ole Jakobsen
Three tonnes of missiles were transported from St. Petersburg in Russia, according to Russian port records.
The ship joins the ranks of Ro Marine customers who have contributed to Russia's export revenues by transporting goods from Russian ports. This does not apply to all customers, but the vast majority, according to research by NRK and Dossier Center.
Provoked a NATO country
Not long ago, it was difficult to imagine that ordinary shipping in Europe could lead to military confrontation. Today, the situation is different. European countries may intervene in oil shipments that violate Western sanctions, which are intended to oppose Putin's bloody war in Ukraine. Russia has its countermeasures.
An illustrative example happened to a Ro Marine customer in mid-May: The oil tanker "
Blint
".
The Estonian navy suspected the ship was sailing without a flag—a clear violation of international regulations. The navy radioed the vessel, but according to Estonian authorities, the captain refused to cooperate.
Suddenly, a Russian fighter jet came whizzing over them, violating the NATO country’s airspace. Instead of stopping, the sanctioned tanker sailed on to the Russian oil port of Primorsk.
This video of the incident is filmed from inside the ship.
Like many other tankers transporting Russian oil, "Blint" has had fake insurance from Ro Marine, NRK can document.
In this way, Ro Marine has acted in line with the interests of the Russian authorities.
Russia's president, October 2023:
“Thanks to the actions of companies and authorities, the tanker fleet has grown, new mechanisms for payment,
insurance
and reinsurance of our cargo have been created.”
Vladimir Putin
Researcher Åse Gilje Østensen at the Norwegian Naval Academy says that “sometimes, entities act in the Russian interest on their own initiative. Sometimes, Putin or other central figures around him have signaled that certain initiatives are welcome. In such cases, actors will often seek to please the regime.”
Åse Gilje Østensen
“Russia is an authoritarian regime that can force civilian actors to assist the regime. Other times, state bodies may be more directly involved. What is the case regarding Ro Marine is difficult to know.”
The Russian Embassy in Norway does not answer NRK's questions about Ro Marine's operations because the company is Norwegian and refers us to the Norwegian authorities. They also do not respond to whether Ro Marine has acted in accordance with the interests of Russian authorities, or any other statements and findings in this case.
The embassy does however choose to point out that the sanctions against the "shadow fleet" are contrary to international law.
Approved by the largest flag state
For several years, the Norwegian company operated without permission and with fake documents without any authorities noticing — neither in Norway nor abroad.
The earliest objectionable activity was in 2021, according to NRK’s investigation.
At that time, Ro Marine applied to be recognized as an insurance company by the world's largest
flag state
, Panama, despite missing the necessary approval from Norwegian authorities.
NRK has found that most ships among Ro Marine's clients are registered in Panama.
Ro Marine sent the flag state a forged reference, which was originally given to a completely different company, sources tell NRK. With this reference, Ro Marine was recognized by the flag state of Panama in December 2021.
The illegal activity continued unencumbered until NRK alerted the flag states.
Russian owner worked many years for a Norwegian company
The Russian owner of Ro Marine, Andrey Mochalin, has gone underground. Mochalin has not responded to any of the numerous inquiries from NRK and Dossier Center.
For over ten years, he worked for a legitimate Norwegian insurance company. Most of the time, he worked from St. Petersburg.
Occasionally, he visited his employer's office in Oslo. Here he is pictured with his former colleagues.
At this time, two of his Norwegian managers also owned another company that offered insurance. This company later became Ro Marine.
A few weeks after Russia's war against Ukraine began in 2022, Ro Marine passed from Norwegian to Russian ownership.
This is when Mochalin bought Ro Marine from the company of the two Norwegians for almost two million NOK.
These two Norwegians, along with Andrey Mochalin and a Bulgarian citizen, are charged with forging documents and conducting illegal insurance business.
Out of respect for the ongoing investigation, the Norwegians did not want to be interviewed by NRK, according to their lawyer.
Ro Marine claimed its address was here at the Norwegian Shipowners' Association building in Oslo, but according to the association, that was not correct.
Money trail in Russia
Alongside the Norwegian company Ro Marine, Andrey Mochalin runs a company in St. Petersburg with direct links to Ro Marine.
NRK and Dossier Center have obtained access to bank documents for
his Russian company
.
The money transfers are many, and some stand out.
Last year, there were 36 payments totaling approximately five million NOK that we can connect to Ro Marine.
The bank transfers were marked with the name of the ship and the policy number.
The number corresponds to insurance documents issued by Ro Marine.
The company’s account also shows salary payments to Mochalin.
Those charged
Here’s what we can share about the Russian Andrey Mochalin and the others charged in the case (in parentheses is the time period they had official roles in Ro Marine):
A. Mochalin (2022-2025):
Sole owner of Ro Marine during the period when the company's illegal insurance activities were most widespread, according to our documentation.
Majority owner of the Russian company that received payments worth millions of NOK last year marked for Ro Marine.
Norwegian 1 (2016-2023):
Registered as co-owner in the Russian company that last year received payments worth millions of NOK marked for Ro Marine.
Owner and board chairman in 2021 when someone sent a forged reference on behalf of Ro Marine to Panama. At the time the company lacked a permit to sell insurance.
In 2024, a year after he left the board, contributed to ensuring Ro Marine’s continued operations by securing a new board member: the charged Bulgarian.
Norwegian 2 (2017-2023):
Owner and managing director in 2021 when someone sent a forged reference on behalf of Ro Marine to Panama. At the time the company lacked a permit to sell insurance.
Bulgarian (2024-2025):
Car mechanic without any experience in marine insurance.
Tenant of Norwegian 1 for years.
Board member in Ro Marine for over half a year, up to March 2025.
After the Norwegians left the board in 2023, the Norwegian company had a problem. With a Russian as the only board member, the company was in violation of the Companies Act of Norway, which requires at least one board member to be from an EEA country. Since Bulgaria is a member of the EEA, the Bulgarian could solve the issue.
According to the Bulgarian, his Norwegian landlord arranged the board position to help him financially.
However, he never received the money he was promised and left the board because he realized something was wrong, the Bulgarian says. He does not understand the police's suspicion towards him.
“I had no idea what they were doing. I have nothing to hide,” says the Bulgarian to NRK. He has been contacted by the police and says he is fully cooperating with them.
Panama alone has banned 16 ships from sailing because the ships have not shown new, real insurance within the deadline they were given.
The UK has sanctioned Ro Marine. Ro Marine’s website has been taken down. In July, the Oslo District Court forcibly dissolved the company for breach of accounting obligations, because Ro Marine had not submitted annual accounts for 2023.
However, the Russian company in St. Petersburg, which received payments worth millions of NOK marked for Ro Marine, is still active.
One month after NRK's revelation in March, another ship in the Russian shadow fleet presented a fake insurance certificate from Ro Marine. Inspectors at the oil port of Primorsk were presented with a document "signed in Oslo."
The expiration date of the fake insurance?
April next year.
NRK has contacted the companies that operate the mentioned ships "Agattu" and "Blint", with no response. We have not been able to contact Gatik. Novatek has not responded to our questions.
Published 12.09.2025 at 07:00
Justice Department Announces Actions to Combat North Korean Remote IT Workers
Note:
This press release has been updated to reflect new information regarding the guilty plea of one defendant in the District of Massachusetts.
The Justice Department announced today coordinated actions against the Democratic People’s Republic of North Korea (DPRK) government’s schemes to fund its regime through remote information technology (IT) work for U.S. companies. These actions include two indictments, an information and related plea agreement, an arrest, searches of 29 known or suspected “laptop farms” across 16 states, and the seizure of 29 financial accounts used to launder illicit funds and 21 fraudulent websites.
According to court documents, the schemes involve North Korean individuals fraudulently obtaining employment with U.S. companies as remote IT workers, using stolen and fake identities. The North Korean actors were assisted by individuals in the United States, China, United Arab Emirates, and Taiwan, and successfully obtained employment with more than 100 U.S. companies.
As alleged in court documents, certain U.S.-based individuals enabled one of the schemes by creating front companies and fraudulent websites to promote the bona fides of the remote IT workers, and hosted laptop farms where the remote North Korean IT workers could remotely access U.S. victim company-provided laptop computers. Once employed, the North Korean IT workers received regular salary payments, and they gained access to, and in some cases stole, sensitive employer information such as export-controlled U.S. military technology and virtual currency. In another scheme, North Korean IT workers used false or fraudulently obtained identities to gain employment with an Atlanta, Georgia-based blockchain research and development company and stole virtual currency worth over $900,000.
“These schemes target and steal from U.S. companies and are designed to evade sanctions and fund the North Korean regime’s illicit programs, including its weapons programs,” said Assistant Attorney General John A. Eisenberg of the Department’s National Security Division. “The Justice Department, along with our law enforcement, private sector, and international partners, will persistently pursue and dismantle these cyber-enabled revenue generation networks.”
“North Korean IT workers defraud American companies and steal the identities of private citizens, all in support of the North Korean regime,” said Assistant Director Brett Leatherman of FBI’s Cyber Division. “That is why the FBI and our partners continue to work together to disrupt infrastructure, seize revenue, indict overseas IT workers, and arrest their enablers in the United States. Let the actions announced today serve as a warning: if you host laptop farms for the benefit of North Korean actors, law enforcement will be waiting for you.”
“North Korea remains intent on funding its weapons programs by defrauding U.S. companies and exploiting American victims of identity theft, but the FBI is equally intent on disrupting this massive campaign and bringing its perpetrators to justice,” said Assistant Director Roman Rozhavsky of the FBI Counterintelligence Division. “North Korean IT workers posing as U.S. citizens fraudulently obtained employment with American businesses so they could funnel hundreds of millions of dollars to North Korea’s authoritarian regime. The FBI will do everything in our power to defend the homeland and protect Americans from being victimized by the North Korean government, and we ask all U.S. companies that employ remote workers to remain vigilant to this sophisticated threat.”
Zhenxing Wang, et al. Indictment, Seizure Warrants, and Arrest – District of Massachusetts
Today, the United States Attorney’s Office for the District of Massachusetts and the National Security Division announced the arrest of U.S. national Zhenxing “Danny” Wang of New Jersey pursuant to a
five-count indictment
. The indictment describes a multi-year fraud scheme by Wang and his co-conspirators to obtain remote IT work with U.S. companies that generated more than $5 million in revenue. The indictment also charges Chinese nationals Jing Bin Huang (靖斌 黄), Baoyu Zhou (周宝玉), Tong Yuze (佟雨泽), Yongzhe Xu (徐勇哲 and يونجزهي أكسو), Ziyou Yuan (زيو) and Zhenbang Zhou (周震邦), and Taiwanese nationals Mengting Liu (劉 孟婷) and Enchia Liu (刘恩) for their roles in the scheme. A second U.S. national, Kejia “Tony” Wang of New Jersey, has agreed to plead guilty for his role in the scheme and was
charged separately in an information
unsealed today.
“The threat posed by DPRK operatives is both real and immediate. Thousands of North Korean cyber operatives have been trained and deployed by the regime to blend into the global digital workforce and systematically target U.S. companies,” said U.S. Attorney Leah B. Foley for the District of Massachusetts. “We will continue to work relentlessly to protect U.S. businesses and ensure they are not inadvertently fueling the DPRK’s unlawful and dangerous ambitions.”
According to the indictment, from approximately 2021 until October 2024, the defendants and other co-conspirators compromised the identities of more than 80 U.S. persons to obtain remote jobs at more than 100 U.S. companies, including many Fortune 500 companies, and caused U.S. victim companies to incur legal fees, computer network remediation costs, and other damages and losses of at least $3 million. Overseas IT workers were assisted by Kejia Wang, Zhenxing Wang, and at least four other identified U.S. facilitators. Kejia Wang, for example, communicated with overseas co-conspirators and IT workers, and traveled to Shenyang and Dandong, China, including in 2023, to meet with them about the scheme. To deceive U.S. companies into believing the IT workers were located in the United States, Kejia Wang, Zhenxing Wang, and the other U.S. facilitators received and/or hosted laptops belonging to U.S. companies at their residences, and enabled overseas IT workers to access the laptops remotely by, among other things, connecting the laptops to hardware devices designed to allow for remote access (referred to as keyboard-video-mouse or “KVM” switches).
Kejia Wang and Zhenxing Wang also created shell companies with corresponding websites and financial accounts, including Hopana Tech LLC, Tony WKJ LLC, and Independent Lab LLC, to make it appear as though the overseas IT workers were affiliated with legitimate U.S. businesses. Kejia Wang and Zhenxing Wang established these and other financial accounts to receive money from victimized U.S. companies, much of which was subsequently transferred to overseas co‑conspirators. In exchange for their services, Kejia Wang, Zhenxing Wang, and the four other U.S. facilitators received a total of at least $696,000 from the IT workers.
IT workers employed under this scheme also gained access to sensitive employer data and source code, including International Traffic in Arms Regulations (ITAR) data from a California-based defense contractor that develops artificial intelligence-powered equipment and technologies. Specifically, between on or about Jan. 19, 2024, and on or about April 2, 2024, an overseas co-conspirator remotely accessed without authorization the company’s laptop and computer files containing technical data and other information. The stolen data included information marked as being controlled under the ITAR.
Simultaneously with today’s announcement, the FBI and Defense Criminal Investigative Service (DCIS) seized 17 web domains used in furtherance of the charged scheme and further seized 29 financial accounts, holding tens of thousands of dollars in funds, used to launder revenue for the North Korean regime through the remote IT work scheme.
Previously, in October 2024, as part of this investigation, federal law enforcement executed searches at eight locations across three states that resulted in the recovery of more than 70 laptops and remote access devices, such as KVMs. Simultaneously with that action, the FBI seized four web domains associated with Kejia Wang’s and Zhenxing Wang’s shell companies used to facilitate North Korean IT work.
The FBI Las Vegas Field Office, DCIS San Diego Resident Agency, and Homeland Security Investigations San Diego Field Office are investigating the case.
Assistant U.S. Attorney Jason Casey for the District of Massachusetts and Trial Attorney Gregory J. Nicosia, Jr. of the National Security Division’s National Security Cyber Section are prosecuting the case, with significant assistance from Legal Assistants Daniel Boucher and Margaret Coppes. Valuable assistance was also provided by Mark A. Murphy of the National Security Division’s Counterintelligence and Export Control Section and the U.S. Attorneys’ Offices for the District of New Jersey, Eastern District of New York, and Southern District of California.
Kim Kwang Jin et al. Indictment – Northern District of Georgia
Today, the Northern District of Georgia unsealed a
five-count wire fraud and money laundering indictment
charging four North Korean nationals, Kim Kwang Jin (김관진), Kang Tae Bok (강태복), Jong Pong Ju (정봉주) and Chang Nam Il (창남일), with a scheme to steal virtual currency from two companies, valued at over $900,000 at the time of the thefts, and to launder proceeds of those thefts. The defendants remain at large and wanted by the FBI.
“The defendants used fake and stolen personal identities to conceal their North Korean nationality, pose as remote IT workers, and exploit their victims’ trust to steal hundreds of thousands of dollars,” said U.S. Attorney Theodore S. Hertzberg for the Northern District of Georgia. “This indictment highlights the unique threat North Korea poses to companies that hire remote IT workers and underscores our resolve to prosecute any actor, in the United States or abroad, who steals from Georgia businesses.”
According to the indictment, the defendants traveled to the United Arab Emirates on North Korean travel documents and worked as a co-located team. In approximately December 2020 and May 2021, respectively, Kim Kwang Jin (using victim P.S.’s stolen identity) and Jong Pong Ju (using the alias “Bryan Cho”) were hired by a blockchain research and development company headquartered in Atlanta, Georgia, and a virtual token company based in Serbia. Both defendants concealed their North Korean identities from their employers by providing false identification documents containing a mix of stolen and fraudulent identity information. Neither company would have hired Kim Kwang Jin and Jong Pong Ju had they known that they were North Korean citizens. Later, on a recommendation from Jong Pong Ju, the Serbian company hired “Peter Xiao,” who in fact was Chang Nam Il.
After gaining their employers’ trust, Kim Kwang Jin and Jong Pong Ju were assigned projects that provided them access to their employers’ virtual currency assets. In February 2022, Jong Pong Ju used that access to steal virtual currency worth approximately $175,000 at the time of the theft, sending it to a virtual currency address he controlled. In March 2022, Kim Kwang Jin stole virtual currency worth approximately $740,000 at the time of theft by modifying the source code of two of his employer’s smart contracts, then sending it to a virtual currency address he controlled.
To launder the funds after the thefts, Kim Kwang Jin and Jong Pong Ju “mixed” the stolen funds using the virtual currency mixer Tornado Cash and then transferred the funds to virtual currency exchange accounts controlled by defendants Kang Tae Bok and Chang Nam Il but held in the name of aliases. These accounts were opened using fraudulent Malaysian identification documents.
The FBI Atlanta Field Office is investigating the case.
Assistant U.S. Attorneys Samir Kaushal and Alex Sistla for the Northern District of Georgia and Trial Attorney Jacques Singer-Emery of the National Security Division’s National Security Cyber Section are prosecuting the case.
21 Searches of Known or Suspected U.S.-based Laptop Farms – Multi-District
Between June 10 and June 17, 2025, the FBI executed searches of 21 premises across 14 states hosting known and suspected laptop farms. These actions, coordinated by the FBI Denver Field Office, related to investigations of North Korean remote IT worker schemes being conducted by the U.S. Attorneys’ Offices of the District of Colorado, Eastern District of Missouri, and Northern District of Texas. In total, the FBI seized approximately 137 laptops.
Valuable assistance was provided by the U.S. Attorney’s Offices for the District of Connecticut, the Eastern District of Michigan, the Eastern District of Wisconsin, the Middle District of Florida, the Northern District of Georgia, the Northern District of Illinois, the Northern District of Indiana, the District of Oregon, the Southern District of Florida, the Southern District of Ohio, the Western District of New York, and the Western District of Pennsylvania.
***
The Department’s actions to combat these schemes are the latest in a series of law enforcement actions under a joint National Security Division and FBI Cyber and Counterintelligence Divisions effort, the DPRK RevGen: Domestic Enabler Initiative. This effort prioritizes targeting and disrupting the DPRK’s illicit revenue generation schemes and its U.S.-based enablers. The Department previously announced other actions pursuant to the initiative, including in
January 2025
and prior, as well as the filing of a civil forfeiture complaint in
early June 2025
for over $7.74 million tied to an illegal employment scheme.
As the FBI has described in Public Service Announcements published in
May 2024
and
January 2025
, North Korean remote IT workers posing as legitimate remote IT workers have committed data extortion and exfiltrated proprietary and sensitive data from U.S. companies. DPRK IT worker schemes typically involve the use of stolen identities, alias emails, social media, online cross-border payment platforms, and online job site accounts, as well as false websites, proxy computers, and witting and unwitting third parties located in the U.S. and elsewhere.
Other public advisories about the threats, red flag indicators, and potential mitigation measures for these schemes include a
May 2022
advisory released by the FBI, Department of the Treasury, and Department of State; a
July 2023
advisory from the Office of the Director of National Intelligence; and guidance issued in
October 2023
by the United States and the Republic of Korea (South Korea). As described in the May 2022 advisory, North Korean IT workers have been known individually to earn up to $300,000 annually, generating hundreds of millions of dollars collectively each year, on behalf of designated entities, such as the North Korean Ministry of Defense and others directly involved in the DPRK’s weapons programs.
The U.S. Department of State has offered potential
rewards of up to $5 million
in support of international efforts to disrupt the DPRK’s illicit financial activities, including for cybercrimes, money laundering, and sanctions evasion.
The details in the above-described court documents are merely allegations. All defendants are presumed innocent until proven guilty beyond a reasonable doubt in a court of law.
Was pulled into a fun customer issue last Friday around disabling RC4 in Active Directory. What happened was, as you can imagine, not good: RC4 was disabled and half their environment promptly started having a Very Bad Day.
Twitter warning:
Like all good things this is mostly correct, with a few details fuzzier than others for reasons: a) details are hard on twitter; b) details are fudged for greater clarity; c) maybe I'm just dumb.
RC4 is a stream cipher. A stream cipher is kinda sorta like a one-time pad (note: kinda, and sorta). A one-time pad is a cryptographic operation that takes one value and XORs it against a random value. A^B = C. A is your data, B is random noise. C is your ciphertext.
They're incredibly useful because the XOR operation is only reversible when you know A or B, and if B is suitably random that means you have to guess for all combinations of B. In other words you have to brute force it. As far as cryptography goes that's nearly perfection.
However, the trick with one-time pads is that you need as many random bits as you have data. If you have 10 data you need 10 random. If you have 10k data you need 10k random. You cannot repeat the random, lest you introduce a pattern, and code breakers just love patterns.
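A minimal Python sketch of that XOR relationship (illustration only, not a real cryptosystem): the pad is as long as the data and is never reused.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # XOR two equal-length byte strings
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(plaintext))  # one random byte per data byte, never reused

ciphertext = xor_bytes(plaintext, pad)     # A ^ B = C
recovered = xor_bytes(ciphertext, pad)     # C ^ B = A
assert recovered == plaintext
```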
So a stream cipher could take a one-time pad and cut the key down to a fixed length, manipulating the key every operation. Let's say you have 100 data and 10 random. The first 10 data get XOR'ed to the 10 random, then the 10 random get XOR'ed to something else. Repeat 10 times.
This turns out to be incredibly simple to code and is incredibly fast relative to other crypto algorithms. However, the cost is that you're now doing key scheduling which means if you can predict the schedule you've broken the cipher.
RC4 fits the bill here. It's painfully simple to implement. Here it is in its entirety. But it's also irreparably broken.
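(The original screenshot isn't reproduced in this text version; as a stand-in, here's a minimal Python sketch of the textbook key scheduling (KSA) and output generation (PRGA). Illustration only: the cipher is broken, don't use it.)

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4 (KSA + PRGA). Broken cipher -- for illustration only."""
    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]

    # Pseudo-random generation algorithm (PRGA); keystream XORed with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

# Encryption and decryption are the same operation
ct = rc4(b"Key", b"Plaintext")
assert rc4(b"Key", ct) == b"Plaintext"
```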
The thing with RC4 is that if you have enough data transformed by a single key you can eventually predict what the original plaintext is. This became a semi-practical attack in 2013 when some smart folks figured out how to apply this to TLS.
https://isg.rhul.ac.uk/tls/
For this attack to happen it requires observing billions of bytes of data. This is done easily enough with TLS, hence why folks jumped at disabling RC4 cipher suites. TLS isn't the only place RC4 is used, and RC4 is still broken, so it's just good form to disable it everywhere.
So now we have Active Directory and RC4 is enabled by default. In 2021?! How dare. Weeeeeelllll, RC4 isn't quite that bad in this case. Like, it's bad, but not super bad because each key will only ever be used for encrypting small amounts of data.
The current attacks require observing *lots* of data encrypted with a single RC4 key. For Kerberos we're talking maybe 8-16kb at most, not the gigabytes required for the attack. So it's bad, but not end of the world bad.
But that's no excuse. Why is it still there? Well, it turns out the RC4 cipher suite has a unique property: it doesn't require a salt when doing key agreement.
Huh?
So Kerberos has two legs between a client and the KDC: AS (Authentication Service) and TGS (Ticket-Granting Service).
AS is how you authenticate yourself: here's password gimme krbtgt.
TGS is: here's krbtgt gimme service ticket.
Now we don't ever just send the password to the server. What we do is we encrypt a thing using the password as a key and fire that thing over to the server. The server also has your password and can decrypt. If it decrypts then we agree we both know the password.
But Active Directory doesn't store the password itself. It stores a key derived from the password. That is, take the password and hash it, and store that hashed value. You encrypt against this hashed value.
So let's go back in time, circa the mid-'90s, when Active Directory was being built. Back then, in the real world, Windows authentication was NTLM. NTLM kinda sorta worked similarly, where you derived a key from a password and used that key to sign some stuff.
That derivation was and is admittedly very lame. It's
md4(password)
. So the NT Server directory database stored username + md4(password), and that's it.
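For the curious, that NT hash is just MD4 over the UTF-16LE encoding of the password, with no salt and no work factor. A minimal sketch (note: some OpenSSL builds disable MD4, in which case hashlib refuses to construct it):

```python
import hashlib

def nt_hash(password: str) -> bytes:
    # NT hash: MD4 of the UTF-16LE-encoded password -- no salt, no iteration count.
    # Depending on the OpenSSL build, "md4" may be unavailable and raise ValueError.
    return hashlib.new("md4", password.encode("utf-16-le")).digest()

print(nt_hash("hunter2").hex())
```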
And taking the cryptographic properties of NTLM out of the picture for a moment, it still kinda sucked as an authentication protocol. It didn't offer server authentication and it wasn't easily extendable for future changes. So Active Directory switched to Kerberos.
But we wanted upgrades to be seamless. When you left work Friday afternoon logging in to NT and came back Monday morning we wanted you to be able to log in to AD without any fuss, despite the fact that we changed literally everything out from under you.
And Kerberos at the time was using DES (eww). This posed a problem. You can't transform an MD4 key into a DES key. The protocol certainly didn't let you do weird things like DES(MD4(password)), so Windows created the MD4+RC4 crypto system.
The magic of this is that the NTLM key and Kerberos key are interchangeable so no password changes were required. Over time as users changed their passwords Active Directory would derive these newer, stronger, keys and would inform clients to prioritize these stronger keys.
This turns out to be phenomenally powerful because it transparently migrates users to stronger keys without breaking anyone, at a small cost of delaying weeks or months as password changes occur. Is it perfect? No. Does it keep the masses happy? Yes.
But there's another small problem: salts. Salts add randomness to passwords, making their hashed form incomparable to other hashes. hash('password', salt1) <> hash('password', salt2). We do this for all sorts of reasons, but the primary is so it's harder to guess the password.
Important
This lack of salt and the use of MD4 for password-to-key derivation is what makes the RC4 cipher suite in Kerberos dangerous. The RC4 portion itself is kinda meh in the overall scheme of things.
This is because MD4 itself is a pretty lousy hash algorithm, and it's easier to guess the original password when compared to the AES ciphers. Passwords that are longer than 12 characters are generally safe from these kinds of attacks. I go into detail in
Protecting Against Credential Theft in Windows
.
But salts complicate the protocol. Both client and KDC need to take the password and salt and derive the same key, so both need to know them. Seems simple enough? Just use a well known value...?
Ah no, they need to be unique, otherwise two identical passwords with the same salt will form the same hash. Oops. Okay, compute a salt from the username. Aha, here we go, this works. It's unique and both parties know it. And it doesn't change! Oh wait, it can.
Users change their usernames all the time because names change all the time. If the salt were just the username that means we'd have to recompute the hash in AD every time the username changed, which means we'd need to know the password and only admins change usernames so 🤔.
Okay, so what if the KDC only changes the salt on password change and just tells us what it is whenever we authenticate? And so this is how Kerberos works. The client "pings" (no not ICMP) the KDC and the KDC says here ya go.
And so long story short (hahahaha, oh.) we come to the customer issue. RC4 doesn't require a salt. AES requires a salt. You can see in the previous screenshot the ARCFOUR instance doesn't provide a salt because there just isn't one.
If you use RC4 you don't need to ping the KDC. You can just derive the key from the password and go. Conversely, if you disable RC4, you must always know the salt.
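To make the contrast concrete, a simplified sketch: the RC4-HMAC key is the salt-less NT hash, while the AES key runs the password and salt through PBKDF2 (the first stage of the RFC 3962 string-to-key; the real etype applies a further key-derivation step omitted here). The realm and account names below are made up for illustration.

```python
import hashlib

def rc4_kerberos_key(password: str) -> bytes:
    # RC4-HMAC etype: the key is simply the NT hash -- MD4(UTF-16LE(password)), no salt.
    return hashlib.new("md4", password.encode("utf-16-le")).digest()

def aes_key_material(password: str, salt: str, iterations: int = 4096) -> bytes:
    # First stage of the AES string-to-key (RFC 3962): PBKDF2-HMAC-SHA1 over the
    # password and the salt. The real etype applies a further DK()/n-fold step,
    # skipped here to keep the sketch short.
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"),
                               salt.encode("utf-8"), iterations, dklen=32)

# No salt needed for the RC4 key...
print(rc4_kerberos_key("hunter2").hex())

# ...but the AES key changes if the salt guess is wrong (names are illustrative).
assert (aes_key_material("hunter2", "EXAMPLE.COMalice")
        != aes_key_material("hunter2", "OTHER.COMalice"))
```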
This poses a small challenge. Sometimes you don't want the system to know the password itself, so you provide these things called keytabs. They're files with some structured data that map users to derived keys. You derive the keys upfront once so you only need the password once.
Sometimes these files are generated without having line of sight to a KDC too so the tool generating the file has to guess which salt to use. Sometimes the tool guesses the wrong salt. Oops.
So you generate these files and deploy the thing and the system chugs along just fine until you disable RC4 and the whole thing comes crashing down. The system was using RC4, and RC4 worked because there wasn't a salt. When it switched to AES it required the salt and...oops.
Well that's a pretty lame experience for anyone, let me tell you. Why did the tool guess the wrong salt? I've already told you: because the name changed.
Active Directory lets you change your username. You actually have two different usernames: your sAMAccountName and your UPN. You can use either to log in with Kerberos, but Active Directory will only derive your salt from a single value: the Realm + sAMAccountName.
Well it happens that their UPN is in the form samaccountname@customrealm.com, but their realm is anotherrealmentirely[.]com. As such their salt was ANOTHERREALMENTIRELY.COMsamaccountname. This happens when you add custom UPN suffixes.
The password never changed so the salt never had a chance to change (not that it would). And the tool to generate the keytab just derived the salt from the user principal name, which in most cases is fine, but in this case wasn't.
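Putting it in code (reusing the simplified PBKDF2 stage from the sketch above, with the realms from this story): same password, two different salt guesses, two different keys. The password string here is hypothetical.

```python
import hashlib

def aes_key_material(password: str, salt: str) -> bytes:
    # Simplified PBKDF2 stage of the Kerberos AES string-to-key (see the sketch above).
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"),
                               salt.encode("utf-8"), 4096, dklen=32)

# The salt AD actually derives: REALM + sAMAccountName
ad_salt = "ANOTHERREALMENTIRELY.COM" + "samaccountname"
# The salt the keytab tool guessed from the UPN suffix
tool_salt = "CUSTOMREALM.COM" + "samaccountname"

password = "never-changed-since-setup"  # hypothetical; the point is it never changed
assert aes_key_material(password, ad_salt) != aes_key_material(password, tool_salt)
# Keys derived from the wrong salt can't decrypt anything once AES is the only option.
```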
And so you might think to yourself that this isn't that big a deal you haven't ever changed the usernames or the suffixes or anything like that. It turns out there's exactly one scenario that every environment hits: when you promote the first DC in a forest.
When you create a forest the domain admin is the machine's local administrator account. I'm not talking implicitly, I'm talking the local admin is copied into the AD database, including its password. The password that was set when you built the machine, before it had a realm.
And the domain admin is a member of the Protected Users group, so RC4 is already disabled for them. That means they MUST use AES, which means they MUST use a salt, which means the salt had to exist when the machine was first built. Oy.
So the realm is the computer name.
Anyway. Here's some useful links as you consider getting rid of RC4.
A brand-new Hell Gate Podcast will be dropping later today! You won't want to miss it. Listen and subscribe
here
, or wherever you get your podcasts.
For months, the fate of Aaron Alexis's Bed-Stuy cannabis store
has been in limbo
, after the state's Office of Cannabis Management (OCM)
abruptly changed its interpretation
of a rule stating how far dispensaries must be from nearby schools.
Previously, the OCM had measured the distance of a dispensary from a school's front door; the tweak now measured the distance from the school's entire grounds.
That bureaucratic (and somewhat mystifying) rule change from OCM, which occurred at the end of July, upended Alexis's plans. He had spent hundreds of thousands of dollars of his own money to start his business, signed a lease and built out his store, and was just days away from getting his Conditional Adult Use Retail Dispensary (CAURD) license. Now he was being told that the store was just a little too close to schools in the neighborhood.
Alexis wasn't alone—OCM told more than 40 pending licensees and more than 100 existing dispensary operators across the state, most of whom are, like Alexis, people of color with prior weed convictions who were prioritized for licenses, that they were in violation of the newly tweaked regulations.
But shop operators like Alexis can now breathe a sigh of relief: Earlier this week, as part of a lawsuit filed by several impacted dispensary owners challenging the changes to the buffer zone rule, the state
agreed to a deal
—until February 15 of next year, dispensary owners can remain, or in Alexis's case, open, in their current spaces. (The news was
first reported
by the New York Times.)
‘It used to weigh me down’: UK readers on why they do or don’t carry a wallet
Guardian
www.theguardian.com
2025-09-12 14:19:42
With research suggesting fewer than half of adults carry a wallet, four people reveal if they still do and what’s inside Fewer than half of British adults now carry a physical wallet, according to recent research, with many carrying payment cards on their phones or smartwatches instead. But while di...
But while digital wallets such as Apple Pay or Google Pay are the default payment method among generation Z and millennials, many people over the age of 44 still rely on physical debit and credit cards.
Four readers told us about their wallets.
‘Unnecessary’
Alosh K Jose says the move to online and contactless payments after the Covid pandemic means it is unnecessary to carry a wallet.
Photograph: Alosh K Jose/Guardian Community
“It used to weigh me down,” says Alosh K Jose, from Newcastle upon Tyne, adding that he now rarely uses a physical wallet. “It became an additional, needless thing to carry in my pockets.”
Jose says the move to online and contactless payments after the Covid pandemic means it is unnecessary to carry a wallet. “All my bank cards are on my phone,” said the 31-year-old, who runs a company delivering cricket sessions in the local community.
Despite getting stuck on a train in Spain during the
huge power outage
that hit parts of Europe in April, Jose does not feel the need to carry cash.
“My fiancee and I were travelling from Barcelona to Madrid and had to wait five hours on the train before we got off. We only had €10 [£8.70] in cash but some people gave us a bit of money so we could get on a bus,” he says.
“If the same were to happen in Newcastle, even without physical cash there’s no language barrier so I think it’d be fine. Maybe I’d think differently if I was travelling further or on holiday abroad.”
‘I don’t want to leave the window wide open for misuse of my sensitive information’
Roger, who still uses a physical wallet, says he feels vulnerable taking his phone out of his pocket.
Photograph: Roger/Guardian Community
In Buckinghamshire, Roger, a retired IT worker, still carries both a wallet and a separate coin purse. “Putting my cards on my smartphone would mean having all my eggs in one basket and becomes a single point of failure,” he says.
Apart from having some cards that have no electronic equivalent and are necessary for him to carry, the 69-year-old says he feels vulnerable taking his phone out of his pocket. “Flashing it to pay for something in a shop strikes me as a dangerous thing to do and I risk dropping it too.
“I worked in IT and security and I recognise that there are windows of opportunity for misuse when it comes to sensitive information. I just don’t want to leave that window wide open [using a digital wallet].”
Among the cash, payment and loyalty cards in his wallet, Roger carries a snippet from the letters page of the Times from the 1980s: “I’ve been a morris dancer since I was 20 and the letter says something about me, I suppose.”
A snippet from the letters page of the Times from the 1980s.
Photograph: Roger/Guardian Community
‘I just like using a physical card – it’s about control’
Gen Zer Georgina finds it shocking that so few people carry a wallet.
Photograph: Georgina/Guardian Community
Georgina, 26, finds it shocking that so few people carry a wallet. “I carry a purse on me at all times as I prefer to own physical items over digital copies,” she says.
In her purse she carries debit cards; a driver’s licence; railcard, supermarket loyalty cards; £20 in emergency cash along with loose change; and a “battered business card for a taxi company”.
Georgina goes against the grain by not using a digital wallet.
Photograph: Guardian Community
As a gen Zer, Georgina, who lives in Leeds and helps to develop and deploy online tech training courses, goes against the grain by not using a digital wallet.
“Call me old-fashioned but I hate the idea of it,” she says. “All my friends use their phones to pay for things and I can see it’s convenient – I think they just think it’s a bit quirky that I don’t.
“I like physical things like using a card and miss things like paper concert tickets. It’s about control as I don’t want to be too reliant on my phone. I remember when you used to have to ask people if they take cards, but now you need to ask if they take cash. It’s wild.”
‘My wallet is a generous phone case’
Before she received her first smartphone during the pandemic, Sara Hayward used to carry a wallet ‘twice the size’ of her phone case.
Photograph: Sara Hayward/Guardian Community
Sara Hayward, a 61-year-old artist from Worcester, says her wallet “has morphed into a generous phone case”.
Before she received her first smartphone during the pandemic, Hayward used to carry a wallet “twice the size” of her case and, as an artist, often had a digital camera with her. Now her phone case is a combination of all of these – and more.
Sara Hayward still carries physical cards but tucked into the case of her phone.
Photograph: Sara Hayward/Guardian Community
“I keep my bank card, airport taxi card, supermarket loyalty card, local stately home garden season ticket, note to self stating annual multi trip travel insurance information, GHIC card, Polaroid snapshot of me, my daughter and my son’s girlfriend at a recent Mallorcan wedding, receipts as I’m self employed, and emergency cash.
“My phone has short videos of my mum before she passed away four years ago. It’s like a living wallet having her on there.”
Hayward does not use any digital payment methods as physical cards feel more “secure”. The perfect compromise has one drawback though: “There’s no room for my lipstick and tissue.”
UK defence secretary John Healey has outlined new plans to send thousands of interceptor drones to Ukraine every month, with the Ukrainian-developed UAV technology to be shared with the UK to help in the fight against Russia.
Speaking at DSEI, Healey outlined ‘Project Octopus’, a new partnership between the UK and Ukraine. Under the project, Ukraine would share technology developed for a new interceptor drone that had proved highly effective against Iranian-made, Russian-deployed Shahed one-way attack drones and cost less than 10% of the Russian systems destroyed.
According to Healey, the UK would in turn “rapidly develop” this Ukrainian interceptor drone – with the IP and technology shared with the UK – to mass produce it. Thousands of small interceptor drones are planned to be sent to Ukraine every month.
“It demonstrates that wartime necessity really is the mother of constant invention,” he said. “It [Project Octopus] means we have access to the best and developing battlefield technology for our own forces”.
The agreement followed investment from Ukraine’s largest drone manufacturer, UKRSPECSYSTEMS, which announced that it would invest £200 million (US$271.2 million) into two new UK facilities – the first major investment by a Ukrainian defence company in the UK, according to Healey.
At DSEI, other UK-Ukraine drone partnerships reared their heads, as the UK sought to boost drone output for its armed forces. This includes a
joint venture
between the UK firm Prevail Partners and Ukrainian manufacturer Skyeton for its Raybird UAV. The drone is, as
Shephard
reported, to be submitted for Project Corvus as a potential bid to replace the Watchkeeper drone.
“We know that whenever equipment is in the hands of the war fighter, whoever can get that new technology into their hands fastest has the edge. We’ve proved we can do it with Ukraine through the excellent work of Task Forces Kindred. We now must do it for ourselves in Britain,” he emphasised.
Nepal's "Gen Z Protests" Topple Government Amid Anger over Corruption & Inequality
Democracy Now!
www.democracynow.org
2025-09-12 13:51:04
Following massive, youth-led anti-corruption demonstrations in Nepal, the country’s former Chief Justice Sushila Karki looks set to become interim prime minister. This week, protesters set fire to the Parliament and other government buildings, and at least 21 people were killed in a police cra...
Following massive, youth-led anti-corruption demonstrations in Nepal, the country’s former Chief Justice Sushila Karki looks set to become interim prime minister. This week, protesters set fire to the Parliament and other government buildings, and at least 21 people were killed in a police crackdown. The protests continued even after the government lifted its ban on social media platforms and Prime Minister K.P. Sharma Oli resigned.
“We don’t really know what is happening at the moment … since most of our state institutions have been either destroyed or are nonfunctional,” says Pranaya Rana, a writer and journalist based in Kathmandu. “We really are counting on the new generation, the Gen Z, who led the protests, to take us forward.”
Please check back later for full transcript.
The original content of this program is licensed under a
Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License
. Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.
"A Historic Moment in Brazil": Jair Bolsonaro Gets 27 Years for 2022 Coup Plot
Democracy Now!
www.democracynow.org
2025-09-12 13:37:42
Brazil’s Supreme Court has sentenced former President Jair Bolsonaro to more than 27 years in prison for plotting a military coup and seeking to “annihilate” democracy in Brazil following his election defeat in 2022. The sentencing marks the first time a former Brazilian head of st...
Brazil’s Supreme Court has sentenced former President Jair Bolsonaro to more than 27 years in prison for plotting a military coup and seeking to “annihilate” democracy in Brazil following his election defeat in 2022. The sentencing marks the first time a former Brazilian head of state has been brought to trial and convicted of attempting to overthrow the government. Bolsonaro and his co-conspirators, who were also sentenced to prison, hatched a plan that involved using the armed forces to assassinate President-elect Luiz Inácio Lula da Silva and Supreme Court Justice Alexandre de Moraes.
The decision was made amid political pressure from the Trump administration to drop the case against Bolsonaro. Secretary of State Marco Rubio pledged that the U.S. would “respond accordingly,” calling the ruling a witch hunt. “Latin American countries need to be united and have a very strong position to defend democracy and to defend our sovereignty and independence,” says Maria Luísa Mendonça, director of the Network for Social Justice and Human Rights in Brazil.
Please check back later for full transcript.
The original content of this program is licensed under a
Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License
. Please attribute legal copies of this work to democracynow.org. Some of the work(s) that this program incorporates, however, may be separately licensed. For further information or additional permissions, contact us.
A public list of bookmarks.
Pages that I want to find again easily and share with people.
Description
Inspired by
matklad
and
Drew DeVault
, I wanted a permanent place to
collect links that have helped me in order to build knowledge and signal boost
them.
Graphics
Rendering Crispy Text on The GPU
https://osor.io/text
How to render text on the GPU, focused on the parts that I find interesting.
With a number of surprisingly elegant tricks.
https://pluralistic.net/2025/07/31/unsatisfying-answers/
I've noticed a lot of people in discussions miss the key meaning behind the term
Enshittification
: Enshittification is deliberate and systemic. We can't stop
it by just writing better software!
https://rice-boy.com/3rdvoice/
Interesting fantasy post apocalypse.
More than 1 great webcomic on this site! Also, peak webcomic site design.
Everyone do it like this please.
Mehdi Hasan on Death of Two-State Solution, Possible U.S. War with Venezuela & More
Democracy Now!
www.democracynow.org
2025-09-12 13:28:00
Democracy Now! speaks with Mehdi Hasan, editor-in-chief and CEO of Zeteo, about Israel’s recent move to expand settlements in the West Bank in an effort to erase the possibility of a Palestinian state. “They are doing everything in their power to make sure that a two-state solution can n...
This is a rush transcript. Copy may not be in its final form.
AMY GOODMAN:
I wanted to switch gears a bit, although, I mean, there are, of course, connections. You have the latest news, where the Israeli prime minister also talked about Charlie Kirk as — I think he called him “a lionhearted friend of Israel” who “fought the lies and stood tall for Judeo-Christian civilization,” he said. In the occupied West Bank, Israeli forces rounded up over 1,500 Palestinians in Tulkarem Thursday, ordering a curfew for the city’s residents. The crackdown came as Israeli Prime Minister Netanyahu approved a plan to dramatically expand illegal West Bank settlements, greenlighting the construction of 3,400 new homes on land Palestinians want for a future state. Netanyahu spoke at a signing ceremony Thursday.
PRIME MINISTER BENJAMIN NETANYAHU:
[translated] We said there would not be a Palestinian state, and we say again there won’t be a Palestinian state. … This place is ours. We will take care of our country and our security and our heritage.
AMY GOODMAN:
So, you have Netanyahu at the signing ceremony saying that he would — there is no place for a Palestinian state. Your response, Mehdi Hasan?
MEHDI HASAN:
One of the only good things, Amy, about this Israeli government, which is the most far-right, racist, genocidal government in Israel’s history, which is saying a lot — one of the only good things about this Israeli government is that they just — you know, they say the quiet part out loud. They do say what they’re thinking. They do say what they mean and mean what they say. So, when they say there will be no Palestinian state, that’s the truth. They don’t plan to have a Palestinian state. They are opposed to a Palestinian state, which is much more refreshingly honest than some of those Israelis in the past who have said, “Oh, yeah, we support a two-state solution,” while carrying on building the same settlements that Netanyahu and Smotrich and co. are doing.
It just reminds us of how much gaslighting, Amy, there is in this country, in our media, in our Congress, where Democrats, in particular, hide behind the two-state solution crutch. “Oh, yes, my solution, a two-state solution.” Well, there is no two-state solution. The Israeli government says there is no two-state solution. They are doing everything in their power to make sure that a two-state solution can never happen, practically, by, you know, cutting up the land, building Israeli settlements that will prevent the establishment of any future Palestinian state. So, in that sense, you know, thank you, Netanyahu, for saying what we already knew, that you are someone who’s opposed to a Palestinian state, who has blocked a Palestinian state multiple times, including right now.
By the way, Amy, I should also point out, Bezalel Smotrich is the one who announced those settlements in the West Bank, and Netanyahu is now echoing what Smotrich says. Again, there’s a faction in this country that wants to say, “Oh, ignore Smotrich. He’s just a fringe figure.” No, Bezalel Smotrich is the finance minister of Israel. He’s in control of the West Bank. And what he’s doing, Netanyahu is following. So this is the entire Israeli government. This is their worldview.
By the way, you mentioned 1,500 people taken in the West Bank. And there’s a word for that. It’s called “hostages.” There are Palestinian hostages being held by Israel without charge, disappeared. No one knows where they are. They’ve committed no crimes. They are hostages in the same way that the people Hamas took were hostages.
AMY GOODMAN:
I wanted to also switch to another issue. Last week, the U.S. attacked a boat in the southern Caribbean, killing at least 11 people. President Trump claimed the boat was carrying drugs from Venezuela, but offered no proof. The Pentagon recently sent warships to the region, after Trump secretly authorized use of military force in Latin America under the guise of the “war on drugs.” In response, Congressmember Ilhan Omar has introduced a new War Powers Resolution seeking to block the Trump administration from conducting future military strikes in the Caribbean. This is Congressmember Omar speaking to you, Mehdi, Thursday in a
Zeteo
town hall.
REP. ILHAN OMAR:
It is Congress that declares war, and we have not been given that, that authority, by this president. And it’s, I think, really important for us to insert our authority in declaring war. What we are seeing with multiple strikes throughout the world that the president has authorized is that he does not have the authority to be able to do so, and specifically the strike that was carried out in the Caribbean against the Venezuelan vessel. I think it’s important for the people to recognize that we cannot just go out and terminate people. You know, this is — this is not something that is allowed under international law, and it’s certainly not allowed under U.S. law.
AMY GOODMAN:
So, that’s Ilhan Omar speaking to you, Mehdi. AP is
reporting
the Venezuelan boat was heading back to shore when the Trump administration bombed it, and that it might have been bombed twice. And the whole question being raised: If it was a drug boat, why were there so many people on it? Were these, in fact, migrants? Mehdi Hasan, the significance of what’s taken place, and also the latest news that the House passed legislation to repeal the 1991 and 2002 Iraq authorizations for use of military force, the
AUMF
, in a bipartisan —
MEHDI HASAN:
Yes.
AMY GOODMAN:
— vote Wednesday, moving against two pieces of legislation that have vastly expanded the president’s ability to use military force in the U.S.’s forever wars in the Middle East?
MEHDI HASAN:
Well, let’s start with the AUMFs that you mentioned. Yes, they have been used to expand power and force of the U.S. state. They have been abused by presidents of both parties, stretched beyond all imaginable use based on the original intent of those AUMFs. I’m glad they were repealed, but look at how long it took, you know, over 20 years for the “war on terror” one, for the 2002 one for Iraq, over 30 years for the original Gulf War one.
This is why I’m skeptical that Ilhan Omar’s War Powers Resolution will go anywhere. She’s much more optimistic than I am. I interviewed her yesterday, as you showed. She thinks she’ll get a lot of votes for it. Unfortunately, our Congress loves endless war. Our Congress loves to hand over its war-making power to the president, both parties. And therefore, it is very worrying when you have a president like Donald Trump in the White House and the power he has.
And, you know, Barack Obama, with his drone strike policy, laid the groundwork for Donald Trump’s drone strike policy. Donald Trump, in his first term, actually carried out more drone strikes than Barack Obama did in his two terms. People don’t know that. But, of course, all of that was set up for Trump.
And you look at that boat attack. It does look more and more like an act of mass murder, 11 people killed. We don’t know their names. We don’t know who they were. We don’t know what they are supposedly accused of, just a generic “They are narcoterrorists. They are drug traffickers.” Based on what? The whole point of the United States is that the president doesn’t just get to kill people and we believe him on his say-so, especially this president, who lies about everything.
We just
learned
recently, Amy, from
The New York Times
that in his first term, he sent the SEALs into North Korea to plant a listening device, and those Navy SEALs ended up killing a boat full of unarmed Koreans, North Koreans, and then coming out and not telling the world, not telling the United States, not telling Congress. So, why would we believe anything Donald Trump says on the national security front? You trust Marco Rubio? You trust Peter Hegseth? You trust Donald Trump? No.
In fact,
The New York Times
is
reporting
this week that the boat was turning around, and they still attacked it. This was a boat that was 2,000 miles away from the U.S. coastline. There’s no scenario in which you can say, “It was an imminent threat to the U.S. That’s why we attacked it.” It was thousands of miles away, and it was heading in the other direction and, as you say, had 11 people on board, which is very strange. Most drug boats don’t have that many people on board. So, I hope there is some kind of investigation, an international one, if not a U.S. one, because it looks like Donald Trump may have just murdered 11 innocent people.
AMY GOODMAN:
Mehdi Hasan, I want to thank you for being with us, award-winning journalist, editor-in-chief and
CEO
of
Zeteo
. We’ll link to your new
piece
, “Hypocritical Conservatives Are Using Charlie Kirk’s Horrific Murder to Cynically Smear the Left.”
Again, this breaking news: President Trump says that the suspected shooter is in custody. And also this breaking news: Trump says he’ll send the National Guard to Memphis, Tennessee.
Coming up, Brazil’s Supreme Court has sentenced former President Jair Bolsonaro to 27 years in prison after being convicted of plotting a coup to remain in power after losing the 2022 election. Stay with us.
[break]
AMY GOODMAN:
The late, great Odetta, performing in our firehouse studio September 11th, 2002, a year after the September 11th attacks. Yesterday was the 24th anniversary of those attacks.
I've been working on a
toy compiler
lately so
I've been thinking about
ASTs
! It's
a new thing for me and I've gotten a bit obsessed with the idea of
simplifying both the representation of the tree itself as well as the
code to interpret it.
ASTs are (typically)
recursive
data-types; this
means that within the data-type they have an embedded instance of the
same type! The simplest version of a recursive tree we can look at is
actually a simple list! A list is a recursive (degenerate) tree where
every node has 0 or 1 branches. Here's how the definition of a simple
List AST might look in Haskell:
data List a = Cons a (List a) | End
Contrary to what your kindergarten teacher taught you, this is one
case where it's okay to use the term in its own definition!
A slightly more complex AST for a toy calculator program might look
like this:
data Op = Add | Mult

data AST = BinOp Op AST AST | Num Int
In this case we've defined a recursive tree of math operations where
you can add or multiply numbers together. Here's how we'd represent this
simple math expression
(1 + 2) * 3
:
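Something like this should do the trick (a sketch; the name simpleExpr is just a label I'm assuming here):

-- (1 + 2) * 3 as a plain AST value (name "simpleExpr" assumed)
simpleExpr :: AST
simpleExpr = BinOp Mult (BinOp Add (Num 1) (Num 2)) (Num 3)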
Maybe not the easiest for a human to read, but it's easy for the
computer to figure out! We won't bother writing a parser in this post,
instead we'll look at other possible ways we can represent these ASTs
with data structures that give us tools to work with them.
Recursion Schemes
Recursion schemes are a pretty complex subject peppered with
zygohistomorphic
prepromorphisms
and things; but don't fret, we won't go too deep
into the topic, instead we'll just touch on how we can use the general
recursion folding function
cata
to interpret generic ASTs
in a really clean fashion!
The core notion of the
recursion-schemes
library
is to factor out the recursion from data-types so that the
library can handle any complicated recursive cases and make it easy for
you to express how the recursion should behave.
There's a bit of a catch though, we don't get all that for free, we
first need to refactor our data-type to
factor out the
recursion
. What's that mean? Well basically we need to make our
concrete
data-type into a
Functor
over
its recursive bits. It's easier to understand with a concrete example;
let's start with our
List
example from earlier:
- data List a = Cons a (List a) | End
+ data ListF a r = Cons a r | End
See the difference? We've replaced any of the slots where the type
recursed
with a new type parameter
r
(for
*r*ecursion
). We've also renamed our new type to
ListF
as is the convention with recursion schemes. The
F
stands for
Functor
, representing that this
is the version of our data-type with a Functor over the recursive
bits.
How's our AST look if we do the same thing? Let's take a look:
  data Op = Add | Mult

- data AST =
-     BinOp Op AST AST
-   | Num Int

+ data ASTF r =
+     BinOpF Op r r
+   | NumF Int deriving (Show, Functor)
Pretty similar overall! Let's move on to representing some
calculations with our new type!
Avoiding Infinity using Fix
If you're a bit of a keener you may have already tried re-writing our
previous math formula using our new AST type, and if so probably ran
into a bit of a problem! Let's give it a try together using the same
math problem
(1 + 2) * 3
:
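One plausible way to write it with the new constructors (a sketch; the name is assumed, and the missing type signature is deliberate — it's exactly the trouble we're about to run into):

-- (1 + 2) * 3 using ASTF (name assumed); but what is its type?
simpleExprF = BinOpF Mult (BinOpF Add (NumF 1) (NumF 2)) (NumF 3)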
We can write the expression out without too much trouble, but what
type is it?
The type of the outer layer is ASTF r, where r represents the recursive portion of the AST; if we fill it in we get ASTF (ASTF r), but that r ALSO represents ASTF r; if we keep filling it in we end up with:
ASTF (ASTF (ASTF (ASTF (ASTF (ASTF ...)))))
which repeats ad nauseam.
We really need some way to tell GHC that the type parameter
represents infinite recursion! Luckily we have that available to us in
the form of the
Fix
newtype!
We'll start out with the short but confusing definition of
Fix
lifted straight from the
recursion-schemes
library
newtype Fix f = Fix (f (Fix f))
Short and sweet, but confusing as all hell. What's going on? Well
basically we're just 'cheating' the type system by deferring the
definition of our type signature into a lazily evaluated recursive type.
We do this by inserting a new layer of the
Fix
data-type in
between each layer of recursion, this satisfies the typechecker and
saves us from manually writing out an infinite type. There are
better
explanations
of
Fix
out there, so if you're really set
on understanding it I encourage you to go dig in! That said, we really
don't need to fully understand how it works in order to use it here, so
we're going to move on to the fun part.
Here's our expression written out using the
Fix
type,
notice how we have a
Fix
wrapper in between each layer of
our recursive type:
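A sketch of what that looks like, using the Fix newtype from above (the name simpleExpr is assumed):

-- (1 + 2) * 3 with a Fix wrapper around every layer (name assumed)
simpleExpr :: Fix ASTF
simpleExpr =
  Fix (BinOpF Mult
        (Fix (BinOpF Add (Fix (NumF 1)) (Fix (NumF 2))))
        (Fix (NumF 3)))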
At this point it probably just seems like we've made this whole thing
a lot more complicated, but hold in there! Now that we've factored out
the recursion and are able to represent our trees using
Fix
we can finally reap the benefits that
recursion-schemes
can
provide!
Using cata
The recursion-schemes library provides combinators and tools for
working with recursive datatypes like the
ASTF
type we've
just defined. Usually we need to tell the library about how to convert
between our original recursive type (
AST
) and the version
with recursion factored out (
ASTF
) by implementing a few
typeclasses, namely the
Recursive
class and the
Base
type family; but as it turns out any
Functor
wrapped in
Fix
gets an implementation
of these typeclasses for free! That means we can go ahead and use the
recursion-schemes
tools right away!
There are all sorts of functions in
recursion-schemes
,
but the one we'll be primarily looking at is the
cata
combinator (short for
catamorphism
). It's a cryptic name,
but basically it's a fold function which lets us collapse our recursive
data-types down to a single value using simple functions.
Here's how we can use it:
interpret :: Fix ASTF -> Int
interpret = cata algebra
  where
    algebra :: ASTF Int -> Int
    algebra (NumF n) = n
    algebra (BinOpF Add a b) = a + b
    algebra (BinOpF Mult a b) = a * b
Okay so what's this magic? Basically
cata
knows how to
traverse through a datatype wrapped in
Fix
and "unfix" it
by running a function on each level of the recursive structure! All we
need to do is give it an
algebra
(a function matching the
general type
Functor f => f a -> a
).
Notice how we never need to worry about evaluating the subtrees in
our AST?
cata
will automatically dive down to the bottom of
the tree and evaluate it from the bottom up, replacing the recursive
portions of each level with the
result
of evaluating
each subtree. It was a lot of setup to get here, but the simplicity of
our algebra makes it worth it!
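Assuming the simpleExpr value sketched earlier, a quick check in GHCi looks something like:

>>> interpret simpleExpr
9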
Using Free in place of Fix
Using
Fix
and
recursion-schemes
is one way
to represent our AST, but there's another that I'd like to dig into:
Free Monads!
Free monads are often used to represent DSLs or to represent a set of
commands which we plan to interpret or run
later on
. I
see a few parallels to an AST in there! While not inherently related to
recursion we can pretty easily leverage Free to represent recursion in
our AST. I won't be going into much detail about how Free works, so you
may want to read up on that first before proceeding if it's new to
you.
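For a quick reference, the shape of Free is essentially Fix with an extra Pure branch bolted on for termination:

data Free f a = Pure a | Free (f (Free f a))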
Let's start by defining a new version of our AST type:
data Op = Add | Mult
  deriving Show

data ASTFree a = BinOpFree Op a a
  deriving (Show, Functor)
Notice that in this case we've removed our
Num Int
branch, that means that the base
ASTFree
type would recurse
forever if we wrapped it in
Fix
, but as it happens
Free
provides a termination branch via
Pure
that we can use as a replacement for
Num Int
as our
Functor's fixed point (i.e. termination point).
Here's our original expression written using Free:
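A sketch of how that could look (the name simpleExprFree matches the GHCi example further down; Free and Pure come from Control.Monad.Free):

-- (1 + 2) * 3 with Pure as the leaves
simpleExprFree :: Free ASTFree Int
simpleExprFree =
  Free (BinOpFree Mult
         (Free (BinOpFree Add (Pure 1) (Pure 2)))
         (Pure 3))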
Notice how in this case we've also extracted the type of our terminal
expression (
Int
) into the outer type rather than embedding
it in the
AST
type. This means we can now easily write
expressions over Strings, or Floats or whatever you like, we'll just
have to make sure that our interpreter can handle it.
Speaking of the interpreter, we can leverage
iter
from
Control.Monad.Free
to fill the role that
cata
did with our
Fix
datatype:
interpFree :: Free ASTFree Int -> Int
interpFree = iter alg
  where
    alg (BinOpFree Add a b) = a + b
    alg (BinOpFree Mult a b) = a * b
Not so tough! This may be a bit of an abuse of the Free Monad, but it
works pretty well! Try it out:
>>> interpFree simpleExprFree
9
You can of course employ these techniques with more complex ASTs and
transformations!
Hopefully you learned something 🤞! If you did, please consider
checking out my book: It teaches the principles of using optics in
Haskell and other functional programming languages and takes you all
the way from a beginner to wizard in all types of optics! You can get
it
here
. Every
sale helps me justify more time writing blog posts like this one and
helps me to continue writing educational functional programming
content. Cheers!
Over the past couple of months, Björn and I have been working on
improving state machine code generation
in the rust compiler, a rust project goal for 2025H1. In late June,
PR 138780
was merged, which adds
#![feature(loop_match)]
.
This post shows what
loop_match
is and why we're extremely excited about it.
Motivation
We first ran into rust's sub-par code generation for state machines when we were porting
zlib-rs
from C to rust. When looking at state machines written in C, it is quite common to see a
switch
with cases where one will implicitly fall through into the next.
switch (a) {
    case 1:
        // do work
        a += 1;
        /* implicit fallthrough */
    case 2:
        // do more work
        a += 1;
        /* implicit fallthrough */
    case 3:
        break;
    default:
        break;
}
When
a
is
1
, the code for the
case 1
branch is executed, but then C will automatically continue executing the code for
case 2
: the default behavior is to fall through into the next branch. The
break
keyword can be used to prevent this implicit fallthrough.
This behavior is so subtle that when it is used deliberately, it is often explicitly documented in mature C code bases. Coming from Rust (and Haskell before that) I never really understood why C had this fallthrough behavior. And even if maybe there was a good reason for falling through, surely it should not be the default (e.g.
continue
could be used to signify fallthrough and the default could be to
break
).
But now, having read more C state machine code, my theory is that a lot of early C was basically building state machines. The fallthrough is an unconditional jump to the next switch branch, and hence very efficient. If anyone actually knows the original motivation, please reach out!
In any case, rust does not (and should not) have implicit fallthrough from one case into another. Instead, we'd normally write state machines like so:
enum State { A, B, C }
fn run(mut state: State) {
    loop {
        match state {
            State::A => {
                // do work
                state = State::B;
            }
            State::B => {
                // do more work
                state = State::C;
            }
            State::C => break,
        }
    }
}
To jump from the
A
branch to the
B
branch, we update the
state
variable, jump back to the top of the loop, and then match on
state
to reach the
B
branch. The observable behavior is the same as with implicit fallthroughs, but in terms of compiler optimizations, this rust code is much worse.
To move from one state to another, we must first update the state, then jump back to the top of the loop (an unconditional jump) and then branch on the state (a conditional jump).
On older CPUs this pattern destroys the branch predictor, because the branch on the state is hard to predict. Modern CPUs are a lot more capable though, and the branch predictor seems to be able to handle this OK these days.
Rather, the issue we've found is that the indirection of jumping to the top and then branching introduces additional code paths that are never used in practice. We can see that once we reach state
B
we will never go back to state
A
, but on an assembly level that code path is present. For small examples the compiler can figure this out, but for real-world state machines the additional code paths appear to inhibit other optimizations.
Our solution: #[loop_match]
So, what should jumping directly from one match branch to another look like in rust? Inspired by
labeled switch
in zig, our
RFC
initially proposed new syntax, but we eventually settled on attributes as a lower-friction approach that allowed us to experiment with the actual implementation.
Aside
: my lesson from writing that RFC is to just never propose syntax. The problem with language syntax is that everyone can (indeed, will) have an opinion on it. Only a few people meaningfully contribute to the actual feature design.
Hence,
#[loop_match]
:
enum State { A, B, C }
fn run(mut state: State) {
    #[loop_match]
    loop {
        state = 'blk: {
            match state {
                State::A => {
                    // do work
                    #[const_continue]
                    break 'blk State::B; // direct jump to the B branch
                }
                State::B => {
                    // do more work
                    #[const_continue]
                    break 'blk State::C; // direct jump to the C branch
                }
                State::C => break,
            }
        };
    }
}
The
#[loop_match]
annotation is applied to the
loop
. The body of that loop must be an assignment to the state variable, where the right-hand side of the assignment is a labeled block containing a match on the state variable.
To jump from one state to another,
#[const_continue]
can be used on a
break
from the labeled block with the updated state value. The value that is given to the break
must
be a constant. In the compiler we perform compile-time pattern matching on this value to determine which branch of the
match
it will end up in, and insert a direct jump to that branch.
The syntax is certainly not the prettiest, but:
we emit better code that uses direct jumps and does not introduce unused code paths.
when
#[loop_match]
and
#[const_continue]
are configured out (e.g. with
#[cfg_attr(feature = "loop_match", loop_match)]
), the code behaves like before.
Benchmarks
So, how much does this help? As always, it depends.
Your algorithm must actually look like a
loop
with a
match
to benefit at all. Even then the performance gain depends on for instance how many states there are, and how often you switch between states. Let's look at two practical examples.
An email address parser
We benchmarked this
generated email address parser
with 133 states. This code is a perfect fit for
#[loop_match]
. The results are impressive:
nightly is a baseline that uses the nightly compiler, but no unstable features.
enable-dfa-jump-thread uses the -Cllvm-args=-enable-dfa-jump-thread flag to optimize state machines.
loop-match enables and uses #[loop_match].
We see that
loop-match
is by far the fastest. You can reproduce this benchmark by cloning the repository linked above and running
./run.sh
.
Why is
loop_match
faster than the LLVM flag? We believe that the underlying reason is that the LLVM flag runs on the whole program, which has some downsides versus our per-loop approach:
The analysis is more computationally expensive, because it must go over the whole program. This is why the analysis is not enabled by default.
The analysis must be more conservative, or it might make the final program slower.
This analysis runs at the very end of the compilation pipeline, which means that earlier passes can't use its results.
The large increase in
branch_misses
for both optimized programs appears to be highly CPU-specific. On other machines (where the speedup is actually larger) we don't observe such a large increase. We do have some ideas for more optimal code generation, even when the state is updated with a value that is not compile-time known, which may help here.
Decompression in zlib-rs
For
zlib-rs
, the state machine optimizations matter mostly when input arrives in (very) small chunks. For larger amounts of input the implementation drops into a heavily optimized loop, and the state machine logic isn't used that much.
Here we take the most extreme case, where input arrives in chunks of 16 bytes. Admittedly that somewhat exaggerates the results, but on the other hand, anything that makes a library like zlib faster is a significant achievement.
Interestingly, in this case combining
#[loop_match]
and
-enable-dfa-jump-thread
is actually slightly slower than using just
#[loop_match]
. So far we've never seen any significant benefit to combining the two flags.
git clone git@github.com:trifectatechfoundation/zlib-rs.git
cd zlib-rs
git checkout 012cc2a7fa7dfde450f74df12a333c465d877e1c
sh loop-match.sh
Next steps
The
loop_match
feature is still experimental. Since the initial PR was merged, fuzzing has found some issues that we're working through. Issues and PRs related to
#[loop_match]
get the
F-loop_match
label.
The type of
state
is currently restricted to basic types: we support integers, floats,
bool
,
char
and enum types that are represented as an integer value. Adding support for more types is complex and requires more substantial changes to the compiler (e.g. a refactoring of how nested or-patterns are handled).
And finally, we have to show that this attribute actually pulls its weight. Therefore, if you have a project that looks like it would benefit from this feature, please let us know and/or give
#[loop_match]
a try and report how it went.
Future work and thank yous
Our goal for
loop-match
and other performance work is to unblock the use of Rust for performance-critical applications, in support of Trifecta Tech Foundation's mission to make critical infrastructure software safer.
We thank
Tweede golf
for getting this project started and
NLnet Foundation
and
AWS
for their continued support by funding
milestones 1 and 2
, respectively. The work on these milestones continues, notably: improving the
loop-match
mechanism and adding experimental support to c2rust.
Mehdi Hasan: Trump Is Weaponizing the Murder of Charlie Kirk to Go After the Left
Democracy Now!
www.democracynow.org
2025-09-12 13:15:11
President Trump announced on Friday that a suspect was in custody for the killing of far-right activist Charlie Kirk. Although the motive has not yet been established, Trump has escalated his attacks on the political left, saying, “We just have to beat the hell out of them.” Democracy No...
This is a rush transcript. Copy may not be in its final form.
AMY GOODMAN:
President Trump has just announced on Fox News that the suspected gunman who shot the conservative activist Charlie Kirk, killing him, has been caught. Trump said, quote, “I think with a high degree of certainty we have him in custody,” unquote.
Trump’s comments came a day after the
FBI
announced a $100,000 reward for information leading to an arrest. Officials also released photos and video of the suspected gunman who shot Kirk during an outdoor event at Utah Valley University. A bolt-action rifle was also recovered in a wooded area near the campus. In one video, the suspected gunman is seen jumping from a roof on campus and running away.
On Thursday, President Trump said he’ll honor Charlie Kirk with a posthumous Presidential Medal of Freedom, the nation’s highest civilian honor. Trump also escalated his attacks on the political left, saying, quote, “We just have to beat the hell out of them.”
PRESIDENT DONALD TRUMP:
We have a great country. We have radical left lunatics out there, and we just have to beat the hell out of them.
AMY GOODMAN:
On the floor of the House, Republican Representative Bob Onder of Missouri described the political left as, quote, “pure evil.”
REP. BOB ONDER:
Well, everything has changed. If we didn’t know it already, there is no longer any middle ground. Some on the American left are undoubtedly well-meaning people. But their ideology is pure evil. They hate the good, the truth and the beautiful, and embrace the evil, the false and the ugly.
AMY GOODMAN:
This call comes as some lawmakers, including Congressmember Alexandria Ocasio-Cortez, are canceling or postponing public events out of safety concerns.
To talk about all of this and more, we’re joined by Mehdi Hasan, editor-in-chief and
CEO
of
Zeteo
, where his new
piece
is headlined “Hypocritical Conservatives Are Using Charlie Kirk’s Horrific Murder to Cynically Smear the Left.”
Mehdi, welcome back to
Democracy Now!
Why don’t you lay out what you’re seeing in these last few days, as we talk about this breaking news that the suspected gunman has been caught?
MEHDI HASAN:
Thanks, Amy, for having me.
The problem with this administration, of course, is you can’t trust anything they say. Kash Patel put out multiple statements over the last 48 hours suggesting that somebody’s been caught, somebody’s in custody. They leaked to
The Wall Street Journal
that there was trans ideology on the weapon, and then walked it back. We have an administration of gaslighters and serial liars. So, unfortunately, in the old days, even if a president lied, you could try and take the bureaucracy or the law enforcement people, maybe — maybe sometimes — at their word. Now you have to start from a position of pure skepticism. So I don’t believe anything Trump says until I see more verification. I do hope they’ve caught the person.
The problem is, Amy, that since the — from the moment Charlie Kirk was horrifically murdered, on camera, a horrific act, inexcusable act, on Wednesday in Utah — from the moment that happened, Republicans, conservatives, prominent figures in this country on the right went to work to blame this on the left, even though the killer was not in custody — apparently is now. Let’s see the alleged killer. They — no killer in custody, no motive, and yet for the last 36, 48 hours, we’ve been told again and again that the left did this, the left killed Kirk, the left has blood on its hands.
And I wrote that
piece
for
Zeteo
because I was deeply frustrated at what I was seeing. It’s not just frustrating. It’s dangerous, right? Your response to a political assassination, to political violence, cannot be to ratchet up more political violence, more dehumanization and demonization.
And the reality is, of course, as I say, we don’t know the motive of the killer. Let’s say the killer turns out to be someone on the left. Even then, that doesn’t mean the right is somehow scot-free here. And that’s why I wrote my piece, pointing out that the vast majority of right-wing political violence in this country comes from Trump supporters, comes from people on the far right, comes from all sorts of people who have horrific views about minorities and white supremacists. And I laid down the evidence in my piece.
For example, this summer, just a few weeks ago, I know the right wing has been erasing her killing, but Melissa Hortman, the speaker emerita of the Minnesota House, was murdered in her home with her husband. Another lawmaker was shot and almost killed with his partner. That was done by a Trump supporter this summer. Trump didn’t even bother to show up at the funeral. No one mentions Melissa Hortman’s death on the right when they’re talking about political violence. We’ve erased January 6th. We’ve erased the attack on Josh Shapiro’s home earlier this year. We’ve erased multiple attacks over the years that have been attributed to or that the suspect turned out to be some kind of Trump supporter.
And I think that is why I wrote that piece, because there’s a real rewriting of history going on. It’s what far-right regimes do after, you know, tragedies like this: They try and weaponize them to go after their enemies. And Trump’s made that very clear — in all his statements, “the radical left.” This is a guy who has incited violence himself, including on January the 6th.
AMY GOODMAN:
I wanted to turn to Hunter Kozak. He’s the Utah Valley University student who posed a question to Charlie Kirk about gun violence just before Kirk was shot and killed.
HUNTER KOZAK:
Five is a lot, right? I’m going to give you — I’m going to give you some credit. Do you know how many mass shooters there have been in America over the last 10 years?
CHARLIE KIRK:
Counting or not counting gang violence?
HUNTER KOZAK:
Great.
AMY GOODMAN:
On Thursday, that young man — he was 29 years old — Hunter Kozak, the Utah Valley University — I think he was a student — posted his message response to what happened after he asked the question.
HUNTER KOZAK:
And people have obviously pointed to the irony that I was — the point that I was trying to make is how peaceful the left was, right before he got shot. And that — that only makes sense if we stay peaceful. And as much as I disagree with Charlie Kirk — I’m on the record for how much I disagree with Charlie Kirk — but, like, man, dude, he is still a human being. Have we forgotten that?
AMY GOODMAN:
That’s Hunter Kozak, who posed the question. He started by asking about how many trans mass shooters Charlie Kirk thought there were, and then talked about that percentage as the number of mass shooters in this country. But, as he says, he was horrified, as here is Charlie Kirk answering a question about gun violence, then is shot dead. Your response to this young man, who’s in a lot of pain? He said, in fact, though, he disagrees with almost everything, is known for opposing —
MEHDI HASAN:
Yeah.
AMY GOODMAN:
— Charlie Kirk, himself a TikToker. His wife just gave birth to their second child. He sees their families, you know, both of them having two children. And he said, “But I’m absolutely against violence and for his freedom of expression.”
MEHDI HASAN:
Amy, we all are. I mean, 99% of the people in this country, I hope, are against politically motivated murders. I mean, it’s horrific. What happened to Charlie Kirk is horrific on a human level, on a political level, on multiple levels. And, you know, people are going around saying, “Well, you know, he didn’t believe in empathy, so I don’t care.” Well, just the fact that he didn’t believe in empathy is irrelevant. I believe in empathy. Most of us should believe — have empathy. And I do have empathy for his wife and kids. Two kids are going to grow up without their father. The fact that their father had vile political views that I disagree with, the fact that their father said I should be deported from the U.S., is irrelevant. All right? You don’t kill people for their speech, ever. And that young man gave a very eloquent statement there.
The irony of him being killed after taking a question on gun violence and trying to make it about gangs, I mean, Amy, right now everything in American politics just feels bizarre and ironic and unprecedented. You know, if you sat in a Netflix TV writers’ room and said, “Hey, this is a script for a political drama about politics in the United States,” and it was the script of the last five or 10 years, the TV writers would throw you out of the room and say, “This is ridiculous. We can’t make this TV show. This is so unrealistic — the plot twists, the turns.” But that’s our daily life right now. I mean, we’re all going crazy seeing, you know, what happens on a daily basis. You know, it’s beyond anything we see on TV or in the movies these days.
And I worry that everything’s going to get worse. I was on the
BBC
just a couple nights back, and, you know, the question they asked was: Is America going to come together after this? That’s what other countries are wondering. That’s what would happen in most normal countries after a tragedy like this. Unfortunately, the U.S. is not a normal country right now. And I suspect not only are we not going to come together, we’re going to go further apart, because the president is someone who takes this opportunity to incite more. I mean, everything Donald Trump has said since this murder has been unhelpful at best, dangerous and destructive at worst. He’s not the right leader whenever there’s a tragedy, whenever there is a murder or a terrorist act. That’s always been one of my great criticisms of Trump — I have many. But he’s not the right person to lead a nation when there is a tragedy or a crisis.
AMY GOODMAN:
Mehdi, in 2023, Charlie Kirk called for you to be deported over your views on the
COVID
-19 pandemic while you were working at —
MEHDI HASAN:
Yeah.
AMY GOODMAN:
— MSNBC. I just wanted to play a clip from
The Charlie Kirk Show
.
MEHDI HASAN:
So we need to reassert what the actual truth of the matter is, especially if we are to be prepared for the next pandemic when it inevitably comes.
CHARLIE KIRK:
Wow, who is that neurotic lunatic? Who is that guy? Send him back to the country he came from? Holy cow! Get him off TV. Revoke his visa.
AMY GOODMAN:
So, that was Charlie Kirk. And again, the horror of his murder right now. Your response then, Mehdi, and as you reflect on this now?
MEHDI HASAN:
Yeah, I responded at the time pointing out how racist that statement was. Charlie Kirk was very anti-immigrant. He was very anti-Muslim. People forget this stuff. But again, you know, I’ve spent the last 48 hours condemning his killing. I have been — I’ve found the posts celebrating his death — very few of them; I know the Republicans are trying to exaggerate. There are deaths. There are obviously posts online celebrating his death. I found them distasteful, inappropriate. It’s not something I would do. And yet, I think to myself, had I been the one shot in the neck and passing away, I wonder whether — what Kirk would have said about me. This is the reality of where we live.
I mean, we’re in this weird situation, Amy, now where some liberals are going to another extreme, which is we should all condemn the killing of Charlie Kirk, but we don’t need to participate in the whitewashing of his record or the kind of — this suggestion that he’s some kind of free speech martyr. He was not a supporter of free speech. You just saw that clip. I said something on
MSNBC
he did not like — I, an American citizen. He said I should be deported from the United States. Is that someone who sounds like they support free speech? He was super anti-Muslim. Just a couple of days ago, he was posting about Islam being the sword with which the left slits the throat of America. He called Muslims conquerors, invaders. His rhetoric was horrific. He put targets on people’s backs.
But again, I don’t measure my own views or my own responses to tragedies by the standard set by Charlie Kirk or Donald Trump or anyone else. The fact that he may have had a more gleeful response to my death than I do to his is irrelevant. As I say, none of us should celebrate the death of a human being. None of us should celebrate political violence, because it’s a threat to all of us and to this country.
And I think it’s interesting that so many people are now trying to suggest that this guy — I’ve seen people saying, “Oh, he never did anything. He just went and had good-faith debates with college students.” Just not true. He supported the — you know, he supported the deportation of Mahmoud Khalil, a green card legal resident who was punished for his speech, nothing else, by the Trump administration.
So, look, even us having this conversation, Amy, will be clipped somewhere by a Republican and say, “Look! Look! They’re celebrating his death. They’re criticizing him.” No, criticizing someone’s views is not celebrating the death. We can do two things at once. We can walk and chew gum. We can say it’s absolutely outrageous that Charlie Kirk was murdered for his views, and we have absolute empathy for his wife and kids and friends and family. But we can also say those views were horrific. We’re not going to suddenly say, because he was murdered, his views are somehow good. No, bad people can be unjustly murdered. Bad people can be innocent when it comes to being killed, because even bad people shouldn’t be killed for their views.
The Treasury Is Expanding the Patriot Act to Attack Bitcoin Self Custody
We warned a couple of months ago, when the Trump administration's "Crypto Brief" was released, that it contained language advising the government to expand the Patriot Act to account for digital assets. Well, it looks like FinCEN and the Treasury have been working on guidelines, a rough outline of which has been shared courtesy of The Rage, and they are absolutely horrid.
It seems that FinCEN and the Treasury are preparing to outlaw the use of CoinJoin, atomic swaps, single address use, and transaction broadcast timing delays, all of which are common best practices that I would recommend to any bitcoiner practicing self-custody. This is an all-out attack on financial privacy within bitcoin. If enacted, any user who leverages these tools will be flagged as suspicious, any attempt to send a UTXO that has touched any of these tools will be rejected by regulated services, and the user could potentially be sent to prison.
This is an absurd affront to common sensibilities and freedom in the digital age. The fact that they want to prevent people from using a fresh address for each individual UTXO is particularly egregious. Not only is it a massive infringement on privacy, it also makes bitcoin usage less economically efficient and degrades the security of every bitcoiner: reusing a single address across many UTXOs links all of those funds together and exposes the address's public key as soon as any of them is spent, giving an attacker far more to work with.
Instead of expanding the Patriot Act, it should be abolished. Instead of trying to eliminate financial privacy for the 99.9% of law abiding citizens in this country, the government should be actively trying to foster an environment in which it can be improved. The proposed solutions will do nothing but put good Americans in harm's way and degrade the security of their savings.
We shouldn't have to live in a world where standards cater to the lowest common denominator, in this case criminals, and make things worse off for the overwhelming majority of the population. It's crazy that this even has to be said. The onus is on law enforcement to be so good at their jobs that they are able to prevent crimes from happening before they occur and effectively bring criminals to heel after they commit crimes. It shouldn't be on a neutral protocol and the industry being built on top of it that, when used effectively, provides people with a stable monetary system that respects user privacy and equips them with the tools to receive and spend in a way that provides them with peace of mind.
Why should everyone have to suffer because of a few bad apples? Isn't that letting the terrorist win?
Bitcoin Is Becoming Less Volatile as It Integrates Into Traditional Finance Infrastructure
Mel Mattison revealed a fascinating shift in Bitcoin's market dynamics that challenges conventional crypto wisdom. He pointed out that Bitcoin futures now exhibit lower volatility than platinum futures - a remarkable transformation for an asset once synonymous with wild price swings. The proliferation of ETFs, options, futures, and other traditional financial instruments has fundamentally altered Bitcoin's behavior, creating what Mel calls "volatility suppression." This institutionalization comes with trade-offs: while reducing dramatic downswings, it also caps explosive upside potential.
"Bitcoin is becoming a TradFi security instrument and it's getting TradFi vol." -
Mel Mattison
Mel argued that the relationship between volatility and returns means investors must recalibrate expectations. Where 100% annual gains once seemed routine, he now considers 50% returns "massive" for this new era of Bitcoin. This maturation reflects Bitcoin's evolution from speculative experiment to financial infrastructure - less exciting perhaps, but ultimately more sustainable for long-term adoption.
Check out the
full podcast here
for more on China's gold strategy, Fed independence battles, and housing market manipulation plans.
Research Proposes Bitcoin for Mars Trade Standard - via
X
Secure Your Bitcoin The Hard Way
Tom Honzik has helped 1,000+ people secure more than 5,000 BTC. Now, TFTC and Unchained are teaming up for a live online session on bitcoin custody. What you’ll learn:
Biggest mistakes that cause lost coins
Tradeoffs of exchanges, ETFs, singlesig, and multisig
How to get optimal security without blindly trusting custodians or taking on DIY risk
Stick around for the AMA to ask Tom Honzik and Marty Bent anything—from privacy considerations to the tradeoffs of different multisig quorums.
Obscura – The World’s Best VPN Built by Bitcoiners
Created by Carl Dong (former Bitcoin Core contributor), Obscura, unlike other VPNs, can’t log your activity by design, delivering verifiable privacy you can trust.
Outsmarts internet censorship
: works even on the most restrictive Wi-Fi networks where other VPNs fail.
Pay with bitcoin over Lightning:
better privacy and low fees.
No email required:
accounts are generated like bitcoin wallets.
No trade-offs
: browse freely with fast, reliable speeds.
Exclusive Deal for TFTC Listeners:
Sign up at
obscura.net
and use code
TFTC25
for
25% off your first 12 months
.
Now available on
macOS, iOS, and WireGuard
, with more platforms coming soon — so your privacy travels with you wherever you go.
Ten31, the largest bitcoin-focused investor, has deployed $200M across 30+ companies through three funds. I am a Managing Partner at Ten31 and am very proud of the work we are doing. Learn more at
ten31.vc/invest
.
Final thought...
Rest in peace, Charlie Kirk. Pray for humanity and for peace.