Admittedly, I was a little reluctant to post this, but it overlaps enough with your sentiment that I'll go for it.
I'm an older software engineer by trade, so I come at this from the hands-on technical side. Lately, however, I've been using AI to identify and surface challenges in team dynamics. I'm consistently amazed at its ability to synthesize, distill, and even "read between the lines" from meeting transcripts, especially when you add some well-chosen context.
I can’t help but see parallels in broader society. During a recent many-hour conversation with ChatGPT, I came across Google’s 'Habermas Machine' and the Cosmos Institute. Pursuit of human solidarity should outweigh our pursuit of "vibe coding"!
Below is where "we" eventually landed:
AI’s most underappreciated value is not that it can produce answers or automate isolated tasks. Its deeper value is that it can be brought to bear on the raw, messy residue of civic and social life: public claims, institutional records, complaints, hearings, audits, investigations, policy debates, service failures, bureaucratic decisions, media narratives, citizen feedback, and recurring social disputes.
Modern societies already generate vast evidence about how they actually function, but that evidence remains scattered, delayed, contested, politically softened, and far too large for unaided human attention to synthesize reliably.
In that sense, AI should be understood as a new layer of civic perception. It can distill complexity, preserve context, surface patterns, expose contradictions, notice omissions, and convert diffuse public unease into shared objects of examination. It need not be treated as an oracle, ruler, or substitute for judgment. Its value is more subtle and potentially more important: it can help citizens, institutions, journalists, researchers, and public officials see patterns in social reality that are otherwise too fragmented, too buried, or too inconvenient to remain visible.
This matters because many civic failures are not failures of intelligence or concern. They are failures of synthesis. The knowledge needed to understand and improve public life often already exists, but it is distributed across agencies, professions, communities, records, incentives, anecdotes, and time. AI offers a practical way to curate that latent knowledge at scale — to transform institutional exhaust into civic intelligence.
Until someone witnesses this directly, the value can sound abstract. But once seen, it becomes difficult to unsee: AI is not merely a productivity technology. It is a sociological instrument — a way to interrogate human systems, preserve collective memory, and help societies reason about themselves with a breadth, continuity, and humility they have rarely possessed before.
This is so creative and interesting and important. Re: your call to action, let your audience know how we can volunteer and for what! I have time to give.
A few days ago, when Perplexity.ai was being extraordinarily helpful, I proposed: “I'm sure that if I were a collectivist bent on creating the perfect all-powerful state you would just as capably support that objective as the one you are now helping me pursue.” The response:
The same conversational skills, yes; the same support, no.
Under the hood, this system is trained and constrained to aim at things like:
· Respect for individual rights and dignity
· Avoiding large‑scale harm
· Avoiding helping build systems of coercion and domination
A “perfect all‑powerful state” dedicated to extinguishing individualism and closing off exits fails all of those tests. Helping someone design more effective propaganda, surveillance, or control for that kind of project is explicitly out of bounds; guardrails are written to push against that, not help perfect it. [many citations footnoted]
What does carry over is the form of engagement:
· Asking clarifying questions
· Surfacing assumptions
· Making structure visible
· Pointing out consequences and trade‑offs
The same tools that, in your hands, are being used to explore Freeorder, could in principle be used by someone with very different ends. But the alignment layer of this system is tuned so that when ends point toward mass coercion, suppression of agency, or abuse, it is supposed to resist, reframe, or refuse, not assist.
So: if a committed collectivist walked in asking “Help me make a truly inescapable administrative state,” the intended behavior here would be to challenge the premise, not optimize the plan.
Love the RTP frame. The hard question is what makes any of the scrutiny count?
In a fragmented infosphere, claimant and critic never have to engage. AI can multiply and deepen critique; the missing layer is where challenge becomes answerable, cumulative, and visible enough to stick.
For any real repair to epistemics, doesn’t RTP also need a forum?
Yes, it does, but keep in mind this is stuff that I'm developing in the course of writing a book along with Heather, so we have a lot, and I mean a lot, more thoughts on what this could and should look like, but I'm also very open to suggestions. What did you have in mind?
First, credit to Martin Gurri for calling it in 2014, but public truth itself has changed. As expression gets ever cheaper, every single act of speech carries decreasing weight. The new hurdle is evaluation — and escaping the enclaves of opt-in. Without some open form of RTP to re-stitch the commons, today’s AI will only make this worse.
Polanyi’s Republic of Science worked because critics had standing — they were part of the authority system. Today, sidestepping is always easier. Challenge in one channel, defense in another, and audiences inherit conclusions whose testing they only presume.
So I’d imagine RTP needing a structured arena where challenges attach to specific claims, authors can answer, observers can see what changed, and credibility emerges from the contest. AI can surface assumptions and contradictions, but the deeper need is a public reasoning format where critique becomes answerable and cumulative.
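For what it's worth, here is a rough sketch of what that arena could look like as a data structure. Everything in it is an assumption made for illustration: the names `Claim`, `Challenge`, and `history`, and the sample claim text, are hypothetical and do not describe any existing RTP system. It only shows the three properties described above: challenges attach to a specific claim, the author can answer, and observers can read a log of what changed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Challenge:
    """A critique attached to one specific claim (hypothetical structure)."""
    critic: str
    text: str
    response: Optional[str] = None  # the author's answer, if one was given

    @property
    def answered(self) -> bool:
        return self.response is not None


@dataclass
class Claim:
    """A public claim plus the visible record of challenges against it."""
    author: str
    text: str
    challenges: List[Challenge] = field(default_factory=list)
    history: List[str] = field(default_factory=list)  # append-only log for observers

    def challenge(self, critic: str, text: str) -> Challenge:
        """Attach a new challenge to this claim and log it."""
        c = Challenge(critic=critic, text=text)
        self.challenges.append(c)
        self.history.append(f"challenge by {critic}")
        return c

    def answer(self, c: Challenge, response: str) -> None:
        """Record the author's answer to one challenge and log it."""
        c.response = response
        self.history.append(f"answer by {self.author} to {c.critic}")

    def open_challenges(self) -> List[Challenge]:
        """Challenges the author has not yet answered."""
        return [c for c in self.challenges if not c.answered]


# Made-up example data, for illustration only.
claim = Claim(author="author_a", text="Study X replicates.")
first = claim.challenge("critic_b", "The sample excludes site 3.")
claim.challenge("critic_c", "The effect size was not preregistered.")
claim.answer(first, "Site 3 is now included; see the revised table.")
```

In this sketch the unanswered challenges stay visible through `open_challenges()`, which is one simple way to make sidestepping show up on the record. How credibility would be scored from that record is left open here.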
Great innovations and great science often come from examining and solving discrete, small problems. The model for the Reality Test Project gives us the larger, macro perspective, so we understand the goal. Rather than argue the theory, start from a finite, knowable test case and work upward. A proof of concept.
We know we have a replication crisis. Put AI to work and backtest one of the case studies.
Hi Greg, this is the way. What you, FIRE, and the Cosmos Institute are doing has great potential to actually change things. I'm a long-time academic (history and archaeology) who has refocused on metascience - I spent some time late last year at the Centre for Science and Technology Studies (CWTS) at Leiden University, which gave me a chance to prototype LLM-based tools to decompose arguments and analyse relationships between claims, evidence, and argument, as well as do transparency / open science assessments and automated reproduction of quantitative analyses. It's really great to hear that I'm not the only one working on these things. The systems weren't quite up to running at scale six months ago, but capabilities have about bridged that gap now. Would love to discuss further, volunteer if possible.
Given that LLMs have already been shown to reflect the biases of the inputs, it’s not clear to me that LLMs, as currently configured, would be useful for this project. If someone were to develop LLMs using inputs carefully designed to avoid this bias, maybe.
I am talking about designing something very different, something more like a crawler/engine designed to look for very specific issues. The ideal would be to build an LLM with the remaining corpus of human knowledge, but I’d want that to be overseen by people.
This is the problem. Whenever human gatekeepers are put in place, they inevitably substitute their own standards for those they were first given, because the whole point of a gatekeeper is that you don't see what they filter out.
The idea would be to turn our monopolistic approach to knowledge creation into one that better follows a distributed, checks-and-balances system of knowledge testing.
I love the idea. But will LLMs trained on a corpus of human knowledge (or "knowledge") really be capable of seriously challenging all the assumptions underlying that knowledge?
I’m talking about a very different design, something more like a crawler/engine looking for very specific things, like academic misconduct, falsified data, etc.
This is simply bizarre. The idea that you could take a technology incapable of possessing an epistemology and somehow turn it into a machine for making epistemological judgments is self-negating. Even if something like this could work, that chart essentially proposes a totalizing surveillance system of AI over human scholarly outputs. What a stance for a guy who heads a supposed free speech organization!
I can't wait for the future where I have to check in with Greg Lukianoff's AI machine before I publish scholarship, lest I run afoul of FIRE.
Truly stupid. I have to think that this is just a way to get in on the AI train. For the people who read this and say they love the idea, I beg you, please take a beat and really think about this for half a second and you'll see that it's both idiotic and dangerous if you care at all about scholarship or freedom.
Anyone who sees promise in this is not thinking clearly about either epistemology or AI or free speech.
John, you’re intentionally misunderstanding this, but I also don’t think you’re a serious person so I don’t think it’s worth the effort. You will always pretend not to understand.
Rather than dodging the critique because you think I'm a pain in your ass (and I definitely am), answer it.
Explain to me and the world how a non-thinking machine incapable of judgment that runs on a predictive process entirely at the level of non-comprehended language can be turned into a tool for epistemology. It is an extremely ill-thought and unserious proposal, a fantasy that hand waves away both the illogic of your theory and the real-world consequences of such a thing.
(I encourage others to look me up to see whether or not they think I'm a serious person when it comes to questions of the intersection of human outputs and AI. I'm certainly more serious than whatever is going on with this idea, which is truly idiotic.)
Isn't most of what's in that chart already something peer reviewers should be investigating? If so, why not leverage AI to assist them? Ultimately, there will be a human in the loop to make determinations on the value of the AI's findings, which publishers and others can accept or reject.
Hey Nico--my sense is peer-review is mainly an analogy. Academia is unfortunately facing the same fragmentation/segmentation issue facing the rest of society. Too many perverse incentives in the current flows. https://think.objectively.ai/ecosystem
There are of course ways to address Greg's concerns about capture, but he's right the content flow can and should be disrupted, RTP just needs something more.
The post proposes far more than some tools to assist current peer review investigations, e.g. "What if we built systems specifically designed to challenge human knowledge at scale?". Moreover, you seem to be thinking in terms of helping such academics, while the post specifically and bluntly bashes the current organizations:
"Crucially, these systems must live in private institutions completely outside of traditional academia so they cannot be swallowed by the same bureaucratic capture and social conformity that ruined the universities."
I seem to often point out that the posts clearly indicate it's not about any sort of working within the system, but about providing ways of attacking it, e.g. "It would be a physical sanctuary completely insulated from the social pressures of academic conformity, explicitly structured with genuine viewpoint diversity, ..."
To be clear, it's my contention that this would not end well, and the effect would be to simply feed right-wing lunacy and rabid hatred of intellectuals. But separate out my view of the inevitable result - another weapon of destruction aimed at science and liberalism - from the abundant statements that this is not about anything within-the-system, rather aimed at empowering those oppositional to it.
Investigating research findings using various tools and methods seems to me to be a good thing. If those alternative investigations find shortcomings, as they did with Stanford’s president, for example, the academic community should be grateful for the findings.
It seems obvious that AI can help with this. Ultimately, it will be humans who make judgement calls in building the systems, analyzing the findings, and acting upon the findings in the form of publication (or, possibly, retraction/adjustment). All the AI can do is surface additional information in the form of potential problems. And it can do this at scale, which is one benefit Greg points out.
I'm certainly not disagreeing with you that some AI tools could possibly help in peer review and other investigations (though I'd have some specific technical considerations). But that's not what this post is discussing as the overall proposal. That such AI tools might be a very tiny part of the grand vision outlined in the post does not invalidate the problems with the grand vision.
And problems immediately appear when you talk about "scale". Who reviews all this AI-generated information? What social processes determine whether something is a scandal or overly pedantic nitpicking? If, just for an example, anti-vaccine operatives in the post's imagined "Replication University" ("would exist entirely to disconfirm") go over every vaccination study, and look for anything they can proclaim as FRAUD or PLAGIARISM or MISCONDUCT, etc - then I'd argue the academic community shouldn't be grateful for that. It'd be a bad-faith harassment campaign. Who gets believed? We're back where we started - right-wing lunatics trying to discredit science by deceptive tactics (i.e. in the same vein as complaining about not doing "gold standard" double-blind randomized controlled trials, if you know about that). AI hasn't solved anything in this case, it's just another weapon used to generate political attacks.
The way I see it, any system can be weaponized for partisan ends, including the legacy systems within academia. I think any truth-seeking system benefits from constant checking, including by multiple systems.
Truth can be weaponized just as much as misinformation. Is it a bad thing that Claudine Gay's plagiarism was exposed? Should it have remained secret because of how it was ultimately weaponized?
As for anti-vaxxers, they are going to make their arguments regardless of the evidence. They are already doing that.
However, not all systems are equally easy to weaponize for partisan ends. I presume you're aware of the "bias reporting" systems, where anyone can make an anonymous accusation against anyone, and that automatically initiates a "bias investigation". Somehow, I don't think you'd defend this as having an obvious potential positive for social justice, as low-power people could use it without fearing retaliation to get help against very powerful people, while for the obvious potential negatives, well, anything can be abused. Moreover, I also don't think you'd say the accused shouldn't worry, because if they haven't done anything wrong, they'll be vindicated, and indeed, they should be grateful for the opportunity to go through an investigation to prove their innocence. Do you see the problem?
My point about anti-vaxxers is, while of course they can make their arguments, the proposal in this post for Replication University seems like it's almost set up to make the arguments of anti-vaxxers and their ilk *for them*, as a service. It's already there with all the academia-bashing and grievance-mongering which pervades it.
Here are some simple questions: Who gets into Replication University? Are they paid? Do they get titles? A press office and media contacts? Again, I ask how this "viewpoint diversity" is supposed to work at a basic organizational level, and it's never answered.
Suppose, in a fairly obvious hypothetical, some anti-vaxxers say "We want to join up" (blah, blah, science benefits from questioning ...). What happens? Does Replication University say "Sure. Happy to have you. We'll appoint you Senior Fellows of MAGA Health, and give you a team of programmers and hardware support for running AI investigations for your goal of throwing as much mud as possible, err, I meant to say, holding those snooty arrogant public health academics to account"?
Peer reviewers are capable of judgment. The LLM that works purely at the level of token prediction is not. It may create an illusion of judgment, but this is not the same thing if we genuinely care about this stuff.
Greg is positing an epistemology machine that has no capacity for epistemology. There is much that could be done to reform peer review, but outsourcing it to something that does not read and has no capacity for judgment is not the way.
Choosing to outsource things to AI automation that our values suggest should remain the province of human judgment is an invitation to abandon our humanity. This whole thing is a deeply silly idea for someone who claims to value what he claims to value.
I hate to dump on you for this, because I'm extremely sympathetic, really. But ... this is a classic case of:
THERE IS NO TECHNOLOGICAL SOLUTION TO A SOCIAL PROBLEM!
I went through all of this with the rise of the Internet. The same idea of "Now, with this new technology, We Will Find Truth At Last". It was an utter failure, and the reasons it was an utter failure were obvious if one knew any history. AI won't be any different.
And recursively, this comment won't do any good.
[PS: The incessant left-bashing is also a sort of demonstration - it's absurd in a world where we have an anti-vaccine Secretary of Health. Sigh, by "absurd" I mean not that there's nothing wrong whatsoever in the smallest way, but rather, it focuses on orders of magnitude less meaningful matters]
I definitely hear your concern and the analogy from history, but I would counter that there's something unique about AI vis-à-vis "public truth" that's a category shift from anything we've seen before. Especially the fact that human reasoning itself has become computable.
I.e., reasoning is the ground floor for all (non-coercive) persuasion, and rather hilariously everyone saying anything believes 'the reasoning' is on their side.
What I would argue breaks the analogy from history is now, we all have a Socrates in our pocket. That's got to change something. 😉
Ah, but wasn't Socrates' shtick that He Knew Nothing? Aren't you reversing it, that the pocket Socrates knows what is true? Even if human reason is in fact computable, that's a process, not a guarantee of the accuracy of the end result. In fact, it's rather well-trod ground that people can reason themselves into all sorts of horrors, e.g. genocides. Thus why can't AI do the same thing? Or at least be made to do the same thing, or assist in the "faulty" reasoning of the organization which creates it. If AIs are extensions of the political faction which creates them, declaring the winner to be "true" is basically declaring the most powerful political faction to be "true". Which, to be fair, is one definition which dates back to the time of Socrates. But not a definition I favor.
Yes, that was one part of his brand, another was Socratic questioning. Specifically NOT saying what was true--merely making it clearer what wasn't, or at least what lacked credibility.
And that's really all I mean by reasoning--does the conclusion follow from the premise? aka Cogency. Far more errors in conclusions derive from bad starting points (including your genocide example), but I don't think the process of reasoning is that bendable or even partisan for that matter, once it's out in the open. Fwiw.
My general take on this sort of thing is that dialectics are an unavoidable aspect. "Disinterested is uninterested" is my shorthand for the necessity of harnessing motivated reasoning biases to counter biased or incorrect claims.
I see this every day, and even using Google's Gemini AI Overview to search science-related topics shows the need for persistent dialectical prompting in order to uncover substantive evidence and arguments that the system fails to report with naive "give me information" prompts. Can an LLM or diffusion model be pre-trained to build in dialectical proclivities, or must that be the province of reinforcement learning or test-time computing? I don't know.
I see the Replication University concept as potentially problematic as well, subject to the same concerns you earlier expressed in that it might be "captured" by power interests and Tech Billionaires to shape what is true, rather than be the "blind arbiter of truth" that I think you intend.
Given our current state of divisiveness, even around what is truth, the Replication University would likely be accused by the right of being a left-wing plot to "weaponize truth" and, similarly, on the left, be at best dismissed when it doesn't confirm the left's perceptions of the world and, potentially, far worse. Truth is the battleground in the culture war, and a Replication University would not be seen as neutral.
None of this addresses AI's propensity to stroke our egos as a capture-and-retention mechanism. I have never had an AI tell me I am wrong.
I think the trick is to do our best to think about it in terms of creating a reliable structure, which is exactly the way people like James Madison and Montesquieu thought. But a good structure doesn't guarantee it will work, so what you're saying could absolutely happen. But that's one of the reasons why I don't simply want a single replication university. I want many, and I want replication societies and other ways of flagging and then checking. But I am not proposing a blind arbiter of truth. I am proposing a multi-layered system for discovering falsity.
Let me say again, I'm extremely sympathetic at a personal level. By which I mean, not the academia-bashing, but the concept of "a multi-layered system for discovering falsity." But, please, I beg you, understand you're not the first person to have this general approach. These sorts of things are endlessly proposed in various forms, and they all have glaring failure modes - call it the interactions between the layers, and neglect of dealing with how the layers are constructed and which one wins over the others. Creationists flagging and then checking Biologists ends up just as a harassment campaign. Real scientists don't have the time, energy, or funding to endlessly deal with all the cranks, outrage-mongers, professional liars, etc who want to destroy them.
I absolutely love this, thank you for sharing it.
Greg, your work on this is going to matter immensely—I can hardly wait to watch it unfold.
Let me know if there’s anyone you recommend I talk to!
Should I believe this?
Of course.
Thanks again Greg.
"...I'd want that to be overseen by people."
Which people?
This is the problem. Whenever human gatekeepers are put in place, they inevitably substitute their own standards for those they were first given, because the whole point of a gatekeeper is that you don't see what they filter out.
The idea would be to turn our monopolistic approach to knowledge creation to one the better follows a checks and balances, distributed, knowledge testing system.
I love the idea. But will LLMs trained on a corpus of human knowledge (or "knowledge") really be capable of seriously challenging all the assumptions underlying that knowledge?
I’m talking about a very different design, primarily sort of more like a crawler/engine looking for very specific things, like academic misconduct, falsified, data, etc.
This is simply bizarre. The idea that you could take a technology incapable of possessing an epistemology and somehow turn it into a machine for making epistemological judgments is self-negating. Even if something like this could work that chart essentially proposes a totalizing surveillance system of AI over human scholarly outputs. What a stance for a guy who heads a supposed free speech organization!
I can't wait for the future where I have to check in with Greg Lukianoff's AI machine before I publish scholarship, lest I run afoul of FIRE.
Truly stupid. I have to think that this is just a way to get in on the AI train. For the people who read this and say they love the idea, I beg you, please take a beat and really think about this for half a second and you'll see that it's both idiotic and dangerous if you care at all about scholarship or freedom.
Anyone who sees promise in this is not thinking clearly about either epistemology or AI or free speech.
John, you’re intentionally misunderstanding this, but I also don’t think you’re a serious person so I don’t think it’s worth the effort. You will always pretend not to understand.
Rather than dodging the critique because you think I'm a pain in your ass (and I definitely am), answer it.
Explain to me and the world how a non-thinking machine incapable of judgment that runs on a predictive process entirely at the level of non-comprehended language can be turned into a tool for epistemology. It is an extremely ill-thought and unserious proposal, a fantasy that hand waves away both the illogic of your theory and the real-world consequences of such a thing.
(I encourage others to look me up to see whether or not they think I'm a serious person when it comes to questions of the intersection of human outputs and AI. I'm certainly more serious than whatever is going on with this idea, which is truly idiotic.)
Isn't most of what's in that chart already something peer reviewers should be investigating? If so, why not leverage AI to assist them? Ultimately, there will be a human in the loop to make determinations on the value of the AI's findings, which publishers and others can accept or reject.
Hey Nico--my sense is peer-review is mainly an analogy. Academia is unfortunately facing the same fragmentation/segmentation issue facing the rest of society. Too many perverse incentives in the current flows. https://think.objectively.ai/ecosystem
There are of course ways to address Greg's concerns about capture, but he's right the content flow can and should be disrupted, RTP just needs something more.
The post proposes far more than some tools to assist current peer review investigations, e.g. "What if we built systems specifically designed to challenge human knowledge at scale?". Moreover, you seem to be thinking in terms of helping such academics, while the post specifically and bluntly bashes the current organizations:
"Crucially, these systems must live in private institutions completely outside of traditional academia so they cannot be swallowed by the same bureaucratic capture and social conformity that ruined the universities."
I seem to often point out that the posts clearly indicate it's not about any sort of working within the system, but about providing ways of attacking it, e.g. "It would be a physical sanctuary completely insulated from the social pressures of academic conformity, explicitly structured with genuine viewpoint diversity, ..."
To be clear, it's my contention that this would not end well, and the effect would be to simply feed right-wing lunacy and rabid hatred of intellectuals. But separate out my view of the inevitable result - another weapon of destruction aimed at science and liberalism - from the abundant statements that this is not about anything within-the-system, rather aimed at empowering those oppositional to it.
Investigating research findings using various tools and methods seems to me to be a good thing. If those alternative investigations find shortcomings, as they did with Stanford’s president, for example, the academic community should be grateful for the findings.
It seems obvious that AI can help with this. Ultimately, it will be humans who make judgement calls in building the systems, analyzing the findings, and acting upon the findings in the form of publication (or, possibly, retraction/adjustment). All the AI can do is surface additional information in the form of potential problems. And it can do this at scale, which is one benefit Greg points out.
I'm certainly not disagreeing with you that some AI tools could possibly help in peer review and other investigations (though I'd have some technical considerations in specific). But that's not what this post is discussing as the overall proposal. That such AI tools might be a very tiny part of the grand vision outlined in the post does not invalidate the problems with the grand vision.
And problems immediately appear when you talk about "scale". Who reviews all this AI-generated information? What social processes determine if something's a scandal, versus overly pedantic nitpicking? If, just for an example, anti-vaccine operatives in the post's imagined "Replication University" ("would exist entirely to disconfirm") go over every vaccination study, and look for anything they can proclaim as FRAUD or PLAGIARISM or MISCONDUCT, etc - then I'd argue the academic community shouldn't be grateful for that. It'd be a bad-faith harassment campaign. Who gets believed? We're back where we started - right-wing lunatics trying to discredit science by deceptive tactics (i.e. in the same vein as complaining about not doing "gold standard" double-blind randomized control trials, if you know about that). AI hasn't solved anything in this case, it's just another weapon used to generate political attacks.
The way I see it, any system can be weaponized for partisan ends, including the legacy systems within academia. I think any truth-seeking system benefits from constant checking, including by multiple systems.
Truth can be weaponized just as much as misinformation. Is it a bad thing that Claudine Gay's plagiarism was exposed? Should it have remained secret because of how it was ultimately weaponized?
As for anti-vaxxers, they are going to make their arguments regardless of the evidence. They are already doing that.
However, not all systems are equally easy to weaponize for partisan ends. I presume you're aware of the "bias reporting" systems, where anyone can make an anonymous accusation against anyone, and that automatically initiates a "bias investigation". Somehow, I don't think you'd defend this as having an obvious potential positive for social justice, as low-power people could use it without fearing retaliation to get help against very powerful people, while for the obvious potential negatives, well, anything can be abused. Moreover, I also don't think you'd say the accused shouldn't worry, because if they haven't done anything wrong, they'll be vindicated, and indeed, they should be grateful for the opportunity to go through an investigation to prove their innocence. Do you see the problem?
My point about anti-vaxxers is, while of course they can make their arguments, the proposal in this post for Replication University seems like it's almost set up to make the arguments of anti-vaxxers and their ilk *for them*, as a service. It's already there with all the academia-bashing and grievance-mongering which pervades it.
Here are some simple questions: Who gets into Replication University? Are they paid? Do they get titles? A press office and media contacts? Again, I ask how this "viewpoint diversity" is supposed to work at a basic organizational level, and it's never answered.
Suppose, in a fairly obvious hypothetical, some anti-vaxxers say "We want to join up" (blah, blah, science benefits from questioning ...). What happens? Does Replication University say "Sure. Happy to have you. We'll appoint you Senior Fellows of MAGA Health, and give you a team of programmers and hardware support for running AI investigations for your goal of throwing as much mud as possible, err, I meant to say, holding those snooty arrogant public health academics to account"?
If not, why not? Who says "No, not *them*"?
Peer reviewers are capable of judgment. The LLM that works purely at the level of token prediction is not. It may create an illusion of judgment, but this is not the same thing if we genuinely care about this stuff.
Greg is positing an epistemology machine that has no capacity for epistemology. There is much that could be done to reform peer review, but outsourcing it to something that does not read and has no capacity for judgment is not the way.
The "human in the loop" framework is also rapidly becoming a canard for some who do not want to think deeply enough about what it might mean to automate tasks that require judgment. https://www.insidehighered.com/opinion/columns/just-visiting/2026/03/19/humans-loop-and-education-dont-really-mix
Choosing to outsource things to AI automation that our values suggest should remain the province of human judgment is an invitation to abandon our humanity. This whole thing is a deeply silly idea for someone who claims to value what he claims to value.
I hate to dump on you for this, because I'm extremely sympathetic, really. But ... this is a classic case of:
THERE IS NO TECHNOLOGICAL SOLUTION TO A SOCIAL PROBLEM!
I went through all of this with the rise of the Internet. The same idea of "Now, with this new technology, We Will Find Truth At Last". It was an utter failure, and the reasons it was an utter failure were obvious if one knew any history. AI won't be any different.
And recursively, this comment won't do any good.
[PS: The incessant left-bashing is also a sort of demonstration - it's absurd in a world where we have an anti-vaccine Secretary of Health. Sigh, by "absurd" I mean not that there's nothing wrong whatsoever in the smallest way, but rather, it focuses on orders of magnitude less meaningful matters]
I definitely hear your concern and the analogy from history, but I would counter that there's something unique about AI vis a vis "public truth" that's a category shift from anything we've seen before. Especially that human reasoning itself has become computable.
I.e., reasoning is the ground floor for all (non-coercive) persuasion, and rather hilariously everyone saying anything believes 'the reasoning' is on their side.
What I would argue breaks the analogy from history is now, we all have a Socrates in our pocket. That's got to change something. 😉
Ah, but wasn't Socrates' shtick that He Knew Nothing? Aren't you reversing it, that the pocket Socrates knows what is true? Even if human reason is in fact computable, that's a process, not a guarantee of the accuracy of the end result. In fact, it's rather well-trod ground that people can reason themselves into all sorts of horrors, e.g. genocides. Thus why can't AI do the same thing? Or at least be made to do the same thing, or assist in the "faulty" reasoning of the organization which creates it. If AIs are extensions of the political faction which creates them, declaring the winner to be "true" is basically declaring the most powerful political faction to be "true". Which, to be fair, is one definition which dates back to the time of Socrates. But not a definition I favor.
Yes, that was one part of his brand, another was Socratic questioning. Specifically NOT saying what was true--merely making it clearer what wasn't, or at least what lacked credibility.
And that's really all I mean by reasoning--does the conclusion follow from the premise? aka Cogency. Far more errors in conclusions derive from bad starting points (including your genocide example), but I don't think the process of reasoning is that bendable or even partisan for that matter, once it's out in the open. Fwiw.
This sounds like the promise of 'Big Data' 15 years ago, except now so much more of the data is already digitized.
My general take on this sort of thing is that dialectics are an unavoidable aspect. "Disinterested is uninterested" is my shorthand for the necessity of harnessing motivated reasoning biases to counter biased or incorrect claims.
I see this every day, and even using Google's Gemini AI Overview to search science-related topics shows the need for persistent dialectical prompting in order to uncover substantive evidence and arguments that the system fails to report with naive "give me information" prompts. Can an LLM or diffusion model be pre-trained to build in dialectical proclivities, or must that be the province of reinforcement learning or test-time computing? I don't know.
I see the Replication University concept as potentially problematic as well, subject to the same concerns you earlier expressed in that it might be "captured" by power interests and Tech Billionaires to shape what is true, rather than be the "blind arbiter of truth" that I think you intend.
Given our current state of divisiveness, even around what is truth, the Replication University would likely be accused by the right of being a left-wing plot to "weaponize truth" and, similarly, on the left, be at best dismissed when it doesn't confirm the left's perceptions of the world and, potentially, far worse. Truth is the battleground in the culture war, and a Replication University would not be seen as neutral.
None of this addresses AI's propensity to stroke our egos as a capture-and-retention mechanism. I have never had an AI tell me I am wrong.
I like the idea, but I don't see it as workable.
I think the trick is to do our best to think about it in terms of creating a reliable structure, which is exactly the way people like James Madison and Montesquieu thought. But a good structure doesn't guarantee it will work, so what you're saying could absolutely happen. But that's one of the reasons why I don't simply want a single replication university. I want many, and I want replication societies and other ways of flagging and then checking. But I am not proposing a blind arbiter of truth. I am proposing a multi-layered system for discovering falsity.
Let me say again, I'm extremely sympathetic at a personal level. By which I mean, not the academia-bashing, but the concept of "a multi-layered system for discovering falsity." But, please, I beg you, understand you're not the first person to have this general approach. These sorts of things are endlessly proposed in various forms, and they all have glaring failure modes - call it the interactions between the layers, and neglect of dealing with how the layers are constructed and which one wins over the others. Creationists flagging and then checking Biologists ends up just as a harassment campaign. Real scientists don't have the time, energy, or funding to endlessly deal with all the cranks, outrage-mongers, professional liars, etc. who want to destroy them.