Hacker News

by Ryan Harman

ArXiv's Next Chapter (blog.arxiv.org)

251 points by subset 15 hours ago

m-hodges 9 hours ago

I always struggle to figure out what role arXiv should play in my information diet. On the one hand I support Open Access research. On the other hand, peer review is vital, and a substantial quantity of “papers” on arXiv are just blog posts in a LaTeX trench coat.

tim-kt 9 hours ago

If you know the authors of your specific area of research, arXiv is a nice way to read their new papers when they are (mostly) done but the submission to a journal is not finished yet.

_alternator_ 3 hours ago

This. In my experience, you have to replace peer review with reputation for preprints. That's highly imperfect, and it tends to lead to good work by less well-known researchers being dismissed as "not peer reviewed", while well-known researchers (or researchers at well-known institutions) basically get a fast track to citations.

Despite the imperfections, I found arXiv indispensable for my research. In particular, mathematics has a slow peer review cycle (it's hard to read and understand, and many referees require that they fully understand a paper to accept it, which imo is a little flawed, but that's the culture). I had several papers that were under review for more than a year (single journal, only one round of revisions), and arXiv was my only showcase. Both works ended up very highly cited, but publication delays would have been an even bigger problem if arXiv wasn't there.

yawnxyz an hour ago

they also keep the papers as a pre-edited, free version of the peer reviewed equivalent

modeless 9 hours ago

Do people browse arxiv or monitor new posts like reddit or something? I only visit when I encounter a link to it or when I search for a specific paper.

kmaitreys 7 hours ago

It depends on the kind of people. Most normal people don't do that, it's not a reddit-like platform after all.

But most researchers and grad students (like me) often subscribe to a daily mailing list of the papers dropping that day in their particular field. Having a cursory read of the paper titles and then opening the papers most relevant to you is a morning ritual for many.

PaulHoule an hour ago

cschmidt 2 hours ago

I suggest Scholar Inbox.

https://www.scholar-inbox.com/landing

It is a recommendation system for new papers that come out each day. If you train it a bit by specifying what you like and don't like you'll get a pretty reliable feed.

embedding-shape 8 hours ago

I use the RSS feeds to watch for papers mentioning terms I'm curious about, do a casual skim for anything interesting and maybe end up finding a paper per month or two that are useful to read more carefully. Lots of chaff for sure, but if you have some core interests it's quite useful.
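A minimal sketch of the keyword-watching workflow described above, assuming a generic RSS 2.0 document. The sample feed, its URLs, and the function name `matching_items` are hypothetical and only for illustration; a real feed may use different fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real feed would be downloaded first.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed (hypothetical)</title>
    <item>
      <title>Sparse attention for long documents</title>
      <link>https://example.org/paper/1</link>
      <description>We study attention mechanisms with sparse patterns.</description>
    </item>
    <item>
      <title>A note on Banach spaces</title>
      <link>https://example.org/paper/2</link>
      <description>Functional analysis results.</description>
    </item>
  </channel>
</rss>"""


def matching_items(feed_xml, terms):
    """Return (title, link) pairs whose title or description mentions any term."""
    root = ET.fromstring(feed_xml)
    lowered = [t.lower() for t in terms]
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        description = item.findtext("description", default="")
        text = (title + " " + description).lower()
        # Case-insensitive substring match against every watched term.
        if any(t in text for t in lowered):
            hits.append((title, link))
    return hits


if __name__ == "__main__":
    for title, link in matching_items(SAMPLE_FEED, ["attention"]):
        print(title, link)
```

Substring matching is crude and will produce chaff, which matches the experience described: the filter only narrows the list for a casual skim.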

abdullahkhalids 3 hours ago

Not all the time, but I certainly do to keep up with the latest results. These days I usually go through SciRate, where the quantum computing community is very active in voting up good papers [1].

[1] https://scirate.com/arxiv/quant-ph

emadb 6 hours ago

I built a bluesky bot if someone is interested in having a live feed of the articles.

You can find it here: https://bsky.app/profile/arxiv-daily-bot.bsky.social

Ariarule 6 hours ago

Yes, people do that. Karpathy made a utility to monitor it better years ago: https://github.com/karpathy/arxiv-sanity-preserver

jjgreen 9 hours ago

A bit too big and varied to browse, but you can get emails of all recent papers in your field(s) of interest with something like Scholars: https://app.scholars.io/newsletter I subscribe to "Functional Analysis" and get a weekly email listing 30-40 papers.

SiempreViernes 9 hours ago

Yeah, it is not too uncommon that people visit the new listings (or subscribe to the email version) to (try to) keep track of what is going on in your field.

Supposing of course your field roughly matches one of the categories.

pks016 2 hours ago

I get google scholar alerts according to authors.

rubidium 6 hours ago

I did when I was in academia. Would open each day and check what new papers were in my field. It was fun, and I learned a ton.

I kept it up out of habit for a year after grad school. Then moved on.

evanb 7 hours ago

I’m RSS-subscribed to a few sections relevant to my research.

alphabeta3r56 6 hours ago

RSSFeed yes

bonoboTP 33 minutes ago

Have you personally reviewed for big conferences or submitted and received reviews? It's a very noisy process that does toss out the lowest effort clueless stuff, but doesn't discriminate all that well between "meh" and "interesting", junior reviewers (the bulk) want proof of blood, sweat and tears. They want novel model modules and algo tweaks and complain about novelty that it's just A plus B, missing the point... They surely don't catch wrong results or incorrect claims because the catastrophic problems that invalidate papers are often in the implementation, not the nice math equations that motivate it.

In other words, Arxiv is what you use when you want to inform yourself on new research; conferences are for furthering your career by getting closer to your PhD graduation, expanding your CV, etc., and then for networking and mingling with researchers in person and trying to get hired.

gspr 9 hours ago

One growing role, especially in mathematics, is that of a host for "overlay journals": https://www.insmi.cnrs.fr/en/cnrsinfo/epijournaux-en-mathema...

I really like the idea. In short: arXiv, HAL and similar sites host the papers without any peer review (short of perhaps stopping crank spam) or access control. They're freely available to anyone. Authors then submit arXiv IDs (or similar) to the reviewers of "overlay journals", which then review and accept or not. The overlay journal accepts a paper by just adding it to its list of accepted arXiv identifiers, and that's that.

This ensures accessibility for all, keeps peer review, yet takes a lot of the practical hurdles away from actually running a journal. A journal can now just be a group of people who give thumbs up or down to arXiv identifiers, and if that group's conclusions start having weight in the community then it has become an important journal. Maybe they give away their listings for free, maybe they charge to read the reviews – it's really up to them what the business model (if any) will be.

It's really nice.
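A rough sketch of the overlay-journal model described above: the journal stores no papers, only a set of accepted identifiers. The class name, the majority-vote rule, and the placeholder identifiers are all assumptions for illustration, not how any existing overlay journal works.

```python
from dataclasses import dataclass, field


@dataclass
class OverlayJournal:
    """A journal that is just a list of accepted preprint identifiers."""
    name: str
    accepted: set = field(default_factory=set)

    def review(self, arxiv_id: str, votes: list) -> bool:
        """Record a decision; accept on a strict majority of thumbs up.

        The majority rule is a hypothetical stand-in for real peer review.
        """
        ok = sum(votes) * 2 > len(votes)
        if ok:
            # Acceptance is nothing more than adding the identifier to the list.
            self.accepted.add(arxiv_id)
        return ok

    def is_accepted(self, arxiv_id: str) -> bool:
        return arxiv_id in self.accepted


if __name__ == "__main__":
    journal = OverlayJournal("Hypothetical Overlay Journal")
    journal.review("0000.00001", [True, True, False])  # placeholder identifier
    print(journal.is_accepted("0000.00001"))
```

The paper itself stays on the preprint server, so the journal carries no hosting or access-control burden.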

IanCal 3 hours ago

I’ve been arguing for this for a long time, glad to see this sort of thing start.

Papers “being in” a journal hasn’t made sense for a long time, but curation is valuable as is staking reputation on something.

People I was with called some of this “badges”, there is no reason why a paper cannot be reviewed by a set of people who say “this is new and innovative stuff in the field and highly important if true, but we’re not making claims about the stats” and a different set able to say “the stats here is spot on but we don’t know how relevant it is in biology” and another to say “we can rerun the code and get the same analysis results out, but we don’t know if the analysis is doing anything useful”. Right now we have journals making some combination of claims, and authors have to pick a single journal.

Once you view journals as a list of papers, the exclusivity seems weird. Once you see that journals are then a set of identifiers added to a paper, or rather statements about a paper, there’s lots of interesting ways you can imagine more useful things than current publishing.

Borealid 2 hours ago

I think the DOI system provides a stable identifier for a paper that is not specific to arXiv?

prepend 6 hours ago

It’s a useful tool. But its “value” is about the same as a github repo with your pdf.

It doesn’t need much funding or staff, and I'm not quite sure why they’re going through all this rigmarole and independence. I almost think they’d be better off like Apache, where there are very few employees.

montebicyclelo 9 hours ago

Well, some blog posts are worth citing.

m-hodges 9 hours ago

Of course some blog posts are worth citing. Then cite them as blog posts.

My point is that a LaTeX PDF can launder epistemic status. An unreviewed argument starts to look like established research merely because it adopts the visual grammar of a paper.

bonoboTP 23 minutes ago

dooglius 13 minutes ago

montebicyclelo 9 hours ago

zzleeper 5 hours ago

poslathian 6 hours ago

The bibliography is more important, imo, than the peer review. I get the most use of arxiv surfing references and citations.

esafak 3 hours ago

Unless you are in research I would not bother; you are trying to drink from a firehose. Let other people do the curating for you.

gowld an hour ago

arXiv enables peer review!

arXiv users are the peers doing the review.

"Peer review" has existed for centuries before journals created their own bad for-profit version.

colechristensen 4 hours ago

"peer review is vital"

I suggest knowing some people who have written works for peer review and done peer review themselves.

Some people outside academia give peer review quite the undeserved aura.

There's a lot of trash on ArXiv, how much of it is in your diet should depend on your ability to evaluate the quality of research.

tokai 9 hours ago

Actually arXiv is frustrating from an open access angle. It is very much possible to put up documents without open licensing, so the content does not always fulfill the open access definition.

augment_me 9 hours ago

Peer review WAS vital for a long time. Maybe the world looks different now, maybe LLMs can find value in things better than humans. When you make an assumption it's good to think about why you do so, in this case it seems to be for historical reasons.

replygirl 8 hours ago

likewise, taking a wrecking ball to systems refined over centuries should come with some burden of proof for the positive claim that a tool can replace an institution. most times this has happened before, we've had to strengthen credentialing requirements to stop people from dying

vlovich123 8 hours ago

jdw64 12 hours ago

I'm always grateful to arXiv. It allows non-scientists like me to access high-quality papers anytime. Thank you, always

kergonath an hour ago

There’s a lot of stuff on Researchgate. And with the evolution of European grants, there are a few publicly-available repositories, like hal.science (funded by the French government and the default repository for public research in France, I think you have to be with some kind of research institution so it’s not quite as open as arxiv but there are plenty of good articles there).

xdertz 10 hours ago

It is also valuable for scientists as it is often a 'director's cut' version of the paper. Journal submissions are heavily edited and shortened to fit into the page limits.

kergonath an hour ago

When that’s the case, the preprints would be just as short. We don’t really like unnecessary pain so we write short manuscripts from the beginning, if we plan to submit in such a journal. Usually, the longer versions get published somewhere else anyway.

emil-lp 10 hours ago

I don't know which field you're talking about, but in general, math and cs journals do not have page limits.

By the way, one of my favorite pastimes is to download the latex source for papers on arxiv and read all the commented-out stuff.

% we should make sure this theorem is actually true

BeetleB 3 hours ago

honzaik 10 hours ago

infinet 3 hours ago

I am thankful that arXiv has only made minor adjustments to its UI over the years, and I hope arXiv keeps it that way.

estebarb 2 hours ago

I really miss the crimson red. New one makes me think they are mourning someone.

WalterGR 11 hours ago

“ArXiv declares independence from Cornell” (science.org)

811 points | 3 months ago | 291 comments

rw2 11 hours ago

Should charge AI for training on top of it or get them to donate. A small amount can fund them easily.

jltsiren 11 hours ago

That would be a trap. It's healthier for a non-profit to have many small funders than a few large ones.

nok22kon 11 hours ago

exactly, the only reason Mozilla exists today is as a legal shield against an anti-browser monopoly suit against Google. that's the product they sell, and Google is paying hundreds of millions per year for this valuable service

khurs 8 hours ago

charcircuit 10 hours ago

prepend 6 hours ago

Part of the promise of open access and open science is that the information is free and open to all. Including robots.

I submit to open things because I want my material to be openly available. If I wanted restrictions, I would submit to gated journals.

hodgehog11 10 hours ago

Papers submitted to arXiv under its most permissive license should always be free, as in beer, speech, freedom. For researchers that contribute to it, that is the intention for a reason. It is to serve public and corporate good without restriction.

This isn't me siding with AI companies by the way; it's a slippery slope argument.

i_cannot_hack 3 hours ago

> It is to serve public and corporate good without restriction.

Sometimes those two are in conflict, such that it will not be possible to satisfy both simultaneously.

nok22kon 11 hours ago

as if they would pay.... they would pirate the contents as they already did

brookst 10 hours ago

They’ve never paid for any content?

NishanStepak 3 hours ago

I have always liked arXiv's articles on information science and library science. I hope they continue publishing quality research.

themikejr 3 hours ago

Any examples or greatest hits you would care to share?

pbronez 40 minutes ago

Thought-provoking closing paragraph from the linked Cornell Chronicle article about the transition:

“It’s now difficult to prepare for the world three months from now if the median LLM-produced computer science paper is better than that produced by the median grad student.”

https://news.cornell.edu/stories/2026/06/digital-research-re...

latentframe 9 hours ago

The big challenge will probably be governance more than infrastructure: staying community-driven while becoming an independent nonprofit is not trivial.

TomasBM 10 hours ago

ArXiv is a good complement to the modern peer review, IMO. As long as someone "vouches" for you, and you adhere to its minimal standards, you're able to post a paper. Other readers can decide whether the paper is worth their attention, and whether the presented ideas or results are valuable.

It's also good that it doesn't gatekeep with the paywalls that you can pretty much only afford by affiliating yourself with a toll-paying institution.

Obviously, there are plenty of flaws with this system:

1. If you're associated with a brand (e.g., Google, MIT) or have a recognizable co-author (e.g., Yann LeCun), you'll get attention and citations no matter what.

2. "Vouching" can also just mean accepting someone's email request without ever having met or known them.

3. It puts the effort on the readers to decide whether each paper is valuable, and particularly scientifically valuable, for which most readers will be unequipped.

4. "Minimal standards" can be gamed by AI-generated submissions.

I'd love to see a synthesis of arXiv, open-access publishing and artifact reviews, like the following:

- Have a number of reviewers on retainer, or design a reward system similar to bug bounties. The reward mechanism probably shouldn't be based on money or allow a winner-takes-all strategy.

- Have a number of badges with respect to the quality and value of the paper. For example: validated by peers (i.e., reviewed by at least 3 peers with minimum borderline accept consensus), valuable (i.e., reviewed by at least 5 peers with a valuable indicator), etc.

- Allow vouched comments on the platform, and moderate for self-promotion, toxicity, etc. Obviously a big ask.

- Improve the "vouching" system, or add badges like "vouched by X people" or "vouched by established scientist".

Hope their new organization will implement some of these improvements.

Vinnl 10 hours ago

I volunteered for a project [1] with roughly this philosophy. Traditional publishing currently serves three purposes:

- Organise peer feedback

- Publish the work

- Recognise good work, helping with both discovery and credit

That latter part especially is what allows publishers to charge the ridiculous markup that they do.

But with "modern" technology, feedback and publishing really don't require all that infrastructure - email and arXiv can easily be used to self-organise that. So we built a system of recognition that does not block publication, and can be used as a layer on top of arXiv and any other venue, allowing peers to vouch for ("endorse") a work.

I had even proposed and implemented an integration for arXiv Labs that got accepted, but then never merged. I should follow up on that...

[1] https://plaudit.pub/

TomasBM 10 hours ago

> I had even proposed and implemented an integration for arXiv Labs that got accepted, but then never merged. I should follow up on that...

You definitely should - looks like what I roughly had in mind.

Thanks for sharing!

GoblinSlayer 10 hours ago

>3. It puts the effort on the readers to decide whether each paper is valuable, and particularly scientifically valuable, for which most readers will be unequipped.

You say it as if the replication crisis doesn't exist and publish or perish is not a thing.

TomasBM 10 hours ago

Actually, the replication crisis shows how difficult (or underinvested) the process of reviewing is.

Removing this (often very basic) peer review doesn't somehow fix the problem. The solution lies in more thorough reviews and replication studies, not in everyone deciding for themselves.

gspr 9 hours ago

You can even combine arXiv and peer review very neatly: https://news.ycombinator.com/item?id=48744030

TomasBM 3 hours ago

I like this - thanks for sharing!

piokoch 11 hours ago

That worries me a bit. ArXiv was and is great and so useful to humanity, giving access to otherwise closed knowledge held by a publishers' cartel, that I would not like to see it turning into a "non-profit" of the OpenAI kind...

tancop 10 hours ago

openai had billionaire "donors" who understood the company was going to operate as a PBC with a positive return for them instead of a true nonprofit.

the heel turn to unlimited for profit was only possible because of their unique structure and the fact they were already selling commercial products. arxiv is not selling anything so theres no financial incentive to take over.

kaizenite an hour ago

Yeah yeah yeah, are you buying arXiv at IPO?

tokai 9 hours ago

This is exactly the playbook that messed up scientific communication last time. Journals and research societies run by researchers and their institutions were spun off, sold, and made independent, which in turn made it possible for a few publishers to gobble up everything.
