AI uBlock Blacklist (github.com)

190 points by rdmuser 14 hours ago

quiet35 5 hours ago

I like the idea and even considered contributing to the list, but this stopped me:

> NAQ (Never Asked Questions)

> My website is on your list!

> Cry about it.

That's quite a suspicious attitude. Clearly the maintainer believes he is infallible. I understand the emotions behind this, but this is not how a public blacklist should be maintained.

well_ackshually 2 hours ago

> but this is not how a public blacklist should be maintained.

Cry about it.

There's nothing in that repo that even pretends to be flawless, impartial or anything else. The sheer amount of mental denial of service involved in dealing with SEO slopshitters opening issues to promise that their Substack is totally written by hand makes this an impossible task.

Ban first, ask questions later. If you find that some rules are unfair, edit them yourself, for your personal usage.

TonyTrapp 5 hours ago

Yuuup. My personal website has been inaccessible to a few friends, they thought my server was down. It turned out they had some blocklist (not related to AI) installed on their PiHole, and for whatever reason my website was on that list. It is, in fact, to this day, because my request to unblock it went completely unanswered. I still don't know why the website is on the list.

jorvi 4 hours ago

Go to the Adguard GitHub (or use the extension) and report it. And get all your friends to switch to Adguard extension and Adguard Home (Pi Hole alternative) as blockers.

Easylist and its sublists are notorious for being poorly maintained and ignoring issues opened against them. Adguard is much more active in maintaining its lists. Adguard's language blocklists especially have much, much less breakage and far fewer missed ads than Easylist.

VladVladikoff 5 hours ago

Perhaps it got hacked and was hosting malware without you being aware? They are pretty good at hiding it from the site owner (showing the original website to you, but not to others).

the_biot 4 hours ago

I would add that with this attitude and how new this initiative is, there's very little chance it will still be updated 5 years from now. Really this sort of thing needs to come from Easylist or similar, who have a track record of maintaining these for years.

Larrikin 2 hours ago

I don't understand the need for the author to commit the rest of his life to this or start a foundation. It is a good list for now and if it's never updated again, that seems fine.

DrammBA 2 hours ago

You forgot:

> A personal list for uBlock Origin

Drupon 3 hours ago

Probably because there's about the same chance of them being innocent as the "Help I was wrongfully banned by VAC :(((" posts in the Counterstrike community.

matheusmoreira 2 hours ago

Reminder that false positives are not only possible but likely. I remember one instance where you could get people banned by sending them a specific string of characters over chat. Anticheat was scanning the entire contents of RAM looking for it.

These days anticheat software is likely to snap at anything. Who knows what they think of the development tools Hacker News users are likely to have on their computers? They really hate virtual machines for example. There's no telling how they'd react to a debugger or profiler.
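To illustrate the false-positive mechanism described above, here is a minimal sketch of naive signature scanning. The signature bytes and the buffer layout are made up for illustration; real anticheat signatures and scanning logic are not public, so treat this only as a toy model of "scan all of RAM for a string".

```python
# Hypothetical signature list; not taken from any real anticheat product.
SIGNATURES = [b"cheat_marker_v1"]


def scan_memory(memory: bytes, signatures=SIGNATURES) -> list[bytes]:
    """Return every signature that appears anywhere in the memory snapshot."""
    return [sig for sig in signatures if sig in memory]


# An innocent player receives a chat message that happens to contain the string.
chat_buffer = b"friend says: lol look at this string cheat_marker_v1"

# The chat text lives in the game's process memory like everything else.
process_memory = b"\x00" * 16 + chat_buffer + b"\x00" * 16

hits = scan_memory(process_memory)
print(hits)  # the scanner cannot tell chat text from an actual cheat
```

The scanner has no notion of where the bytes came from, which is why a string sent over chat can look identical to a loaded cheat.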

ycombinatrix 40 minutes ago

If the website is not AI slop, presumably they would remove it from the list.

NeutralCrane 4 hours ago

Also seems a bit hypocritical given the screed about how such a list is necessary because the AI content might output hallucinations or damaging content without review.

But if it’s the author’s blocklist that is wrong, unverified, and causing harm to others? Cry about it.

dhayabaran 5 hours ago

The false positive problem gets worse over time too. Domains get sold, sites pivot, old content gets removed. A blocklist with no removal process and a "cry about it" attitude in the FAQ is basically a one-way reputational blackhole. At minimum it needs an expiry or re-review mechanism. Even browser safe browsing lists re-check URLs periodically.
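A minimal sketch of what an expiry or re-review mechanism could look like, assuming each entry records when it was last checked. The `BlockEntry` type, the 180-day window, and the example domains are all hypothetical; nothing like this exists in the list being discussed.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical review window; the thread does not propose a specific value.
REVIEW_INTERVAL = timedelta(days=180)


@dataclass
class BlockEntry:
    domain: str
    last_reviewed: date  # when someone last confirmed the site belongs on the list


def partition_entries(entries, today, interval=REVIEW_INTERVAL):
    """Split entries into (active, stale), where stale entries are due for re-review."""
    active, stale = [], []
    for entry in entries:
        if today - entry.last_reviewed > interval:
            stale.append(entry)
        else:
            active.append(entry)
    return active, stale


entries = [
    BlockEntry("content-farm.example", date(2025, 1, 10)),
    BlockEntry("recently-checked.example", date(2025, 11, 1)),
]
active, stale = partition_entries(entries, today=date(2025, 12, 1))
print([e.domain for e in stale])  # entries that should be re-checked or dropped
```

Whether stale entries are dropped automatically or only queued for a human to re-check is a policy choice; the sketch just surfaces them.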

throwatdem12311 6 hours ago

uBlock Origin also already has an “AI widget” blocklist you can enable. It's literally the only extension that keeps me on Firefox, because of how useless it is on Chromium.

rdmuser 14 hours ago

A new, more grounded list focused specifically on blocking content farms and similar low-quality sites.

A nice alternative to this very broad anti-AI list: https://github.com/laylavish/uBlockOrigin-HUGE-AI-Blocklist

Edit: Oh I should mention I found it through reddit and there is some good discussion there where they describe how they find stuff etc: https://www.reddit.com/r/uBlockOrigin/comments/1r9uo3j/autom...

Dwedit 7 hours ago

The broad list seems to just be a hater list. It's not trying to cover cases of deception (passing off AI material as if it's something else), as it includes sites which are very open about what kind of content is on there.

malfist 4 hours ago

Would you say the same about a block list that blocks anything else? I don't care how obvious an ad is, I don't want to see it. Same with social widgets or cookie consent banners, or newsletter sign-ups.

But I wouldn't call the person who maintains the newsletter popup blocklist a "newsletter hater".

hogwasher 4 hours ago

The purpose of the broad list is removing AI-generated content from search results, so that the user doesn't have to wade through (as much) slop to find the human-created content they're looking for.

While I applaud the honesty of sites that are open about their content being AI generated, that type of content is never what I'm looking for when I search, so if they're in my search results it's just more distraction/clutter drowning out whatever I'm actually looking for. Blocking them improves my search experience slightly, even though there are of course still lots of other unwanted results remaining.

Granted, I definitely count as an AI hater (speaking of LLMs specifically). But even if I weren't, I don't think I'd be seeking it out specifically using a search engine; why would I do that when I could just go straight to chatgpt or whatever myself? Search is usually where people go to find real human answers (which is why appending "reddit" to one's searches became so common). So I see this as a utility thing, more than an "I am blocking all this just because I hate it" thing. Although it can be both, certainly.

Edit: removed an off-topic tangent

smusamashah 6 hours ago

So there is a spreadsheet of websites. That is very interesting. There was an article here some time ago about a media group that has a huge number of heavily SEO'd websites. They all have common footer text. I searched for that text and added as many of the sites as I could find to uBlacklist. I have a gist listing them and how I searched for them. You might find that useful.

Edit: https://gist.github.com/SMUsamaShah/6573b27441d99a0a0c792431...

xnx 9 hours ago

Hasn't been updated in 5 months

rdmuser 9 hours ago

Oh, good point. I also overlooked that with the anti-AI list.

The big anti-AI list also seems to be focused on hiding links from ddg/bing/google, whereas this new, more focused list just blocks sites. I tend to prefer blocking over hiding because blocking pops up a nice warning no matter where I came from, and I can still decide to ignore it if I want, so there is more user agency instead of just quietly hiding an unclear chunk of the net from search engines.

amelius 6 hours ago

At least we're not yet in the phase where we have a whitelist for the internet.

papichulo2023 6 hours ago

We were close but the app dominance declined.

lifthrasiir 13 hours ago

Not necessarily disagreeing with the whole principle...

> All I hear is skill issue. Imagine needing an AI to write stuff.

Grammarly users (and underrepresented non-English speakers) would complain.

QuadmasterXLII 8 hours ago

There’s not a single group who’s ever been told skill issue that didn’t complain

tclancy 5 hours ago

Sure, but there are also plenty of times “get gud!” is used for gatekeeping. Life is on a continuum, man.

rdmuser 13 hours ago

Personally, I find that I prefer badly written English, or auto-translated text from languages foreign to me, over the AI-generated or even just AI-polished works I've seen. There is just so much more character, depth and variance there vs ultra-generic AI or slop text.

That being said, this project seems focused on content farms, not people who just need a little help writing, so this whole conversation is a bit of a side tangent.

flkiwi 8 hours ago

One of my coworkers is EXTREMELY capable but functionally almost illiterate. He’s recently discovered that he can put an idea in Copilot and have it generate an email. So now instead of brief, correct, but difficult to parse emails we receive 20-paragraph, bulleted, formatted OpenAI slop. It’s been a very strange thing to see, like someone getting extraordinarily bad cosmetic surgery.

lifthrasiir 13 hours ago

I mean, I know it is probably tongue in cheek, but that never-asked question was particularly out of place. Mass-generated AI content is usually not THAT thoughtful anyway.

dangus 7 hours ago

This specific list from this specific author isn’t worth using since they refuse to remove items from the list if domain ownership changes.

E.g., bought a domain that previously hosted AI content.

E.g., Whitehouse.com used to be a porn site, now it’s not.

duskdozer 7 hours ago

If you don't know English and you want to write English anyway, please just use a machine translator.

mrweasel 5 hours ago

From experience: If you don't know Danish, please don't ever use machine translators to translate from English. Regardless of what some people may think, they make mistakes, so many mistakes.

I get why it's tempting: good translators are expensive, and few and far between. A friend of mine is a professional translator and she's not exactly in need of work, but a lot of customers look at her prices and opt for machine translations instead, and the result is not always impressive. Errors range from wrong words and bad sentence structure to an inability to correctly translate cultural references.

victorbjorklund 6 hours ago

And the machine translator is using AI to translate the text

GaggiX 7 hours ago

Why? A model correcting your errors is a powerful tool to learn the language, much better than just writing the phrase in your native language.

UqWBcuFx6NV4r 7 hours ago

…what? no? why?

jofzar 9 hours ago

I use Grammarly at work (it's mostly to make sure our brand guidelines are kept) and I don't find that, by default, it corrects too far into AI slop territory. It's mostly just making sure your sentence is correct.

OP is going after AI slop bot farms like Android Authority

rererereferred 9 hours ago

I mean, the reason we use Grammarly is because we recognize we have a skill issue.

notepad0x90 3 hours ago

Love this, I wish there were more and broader categories of sites one could block. You can always temporarily allow sites.

In the enterprise space, there are URL reputation providers. They categorize sites based on different criteria, and network administrators block or warn users based on that information.

In my humble opinion, there needs to be a crowdsourced fund (or ideally governments would take this seriously and fund it on behalf of people) for enabling technologies that allow user-friendly internet experiences. Browsers, frameworks, VPN providers, site reputation, deceptive content, DNS providers, email providers, trusted certificate authorities (no, Google and Microsoft shouldn't get to police that), nation-state or corporate affiliations, etc... You shouldn't need to set up a Pi-hole.

Imagine a $1B/yr non-profit fund for this stuff. If 10M people paid $10/mo, that's $1.2B/yr. Proton had $97M revenue in 2024 and 100M total accounts (I don't know how many pay, but the spread is roughly $1/user). I really think now is the time to talk about this when so many are wary of US tech giants and looking for other opportunities.
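The arithmetic in the comment above can be checked with a short sketch. The figures are the commenter's own (10M subscribers at $10/month; $97M revenue over 100M accounts); the function and variable names are just illustrative.

```python
def annual_revenue(subscribers: int, monthly_fee: float) -> float:
    """Yearly income from a flat monthly subscription."""
    return subscribers * monthly_fee * 12


# Figures quoted in the comment, not independently verified here.
fund_per_year = annual_revenue(10_000_000, 10)        # 10M people at $10/mo
proton_per_account = 97_000_000 / 100_000_000         # revenue spread over all accounts

print(f"fund: ${fund_per_year / 1e9:.1f}B/yr")
print(f"Proton: ${proton_per_account:.2f} per account per year")
```

Note that the "$1/user" figure is per account per year, averaged over paying and free accounts alike, so it is not directly comparable to the $10/month subscription assumed for the fund.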

meindnoch 5 hours ago

Also need a rule that filters out HN submissions from that Simon Willison guy.

nicbou 5 hours ago

Why? He posts high-quality content that's interesting if you care about that field. It's not my cup of tea, but it's pretty far from what this list tries to block.

selridge 4 hours ago

Because they are scared of the future. That’s why.

eclipticplane 5 hours ago

His articles are _about_ AI though, not AI slop?

greyman 3 hours ago

Meta question: do you guys feel that adblockers will maybe not be that important in the future? As for myself, I ended up using just a few websites, but those are reputable and I don't mind the few ads they show. The only adblocker that is still very much needed is one for YouTube.

diath 3 hours ago

According to uBlock Origin it blocked 9.5 million requests to ads/third party trackers since I installed it. So yes, it's very much needed.

Grom_PE 2 hours ago

I feel that blocking, substituting, and even inserting user-defined resources for a website must be a native browser feature.

xboxnolifes 3 hours ago

I don't think this is a sign of the times or the future. I think it's just your own personal browsing habits.

ramon156 5 hours ago

I would rather have a whitelist that adds a nice tag at the end of the link, indicating that overall it has high quality content. This also forces you to periodically check the sites you've whitelisted

dimava 5 hours ago

Also check out https://botblock.ai/, an AI extension to detect AI replies on Twitter

add-sub-mul-div 5 hours ago

That's a curious one, Twitter is worthless anyway. Before AI bots proliferated, the change to rank paid accounts high in replies turned it into a de facto entry level $8/month advertising tier.

ossa-ma 5 hours ago

Glad we're moving in this direction, I've also got a tool that I use to determine if writing is AI using common tropes and reconstruct the OG prompt from it: https://tropes.fyi/aidr

mh- 5 hours ago

Haha, that's a neat idea. Thanks for sharing.

https://tropes.fyi/aidr/b184cf3a

https://tropes.fyi/aidr/9b132f92

jadar an hour ago

I feel like this is a bit of a sinking ship. I suppose if you want to avoid known sources of slop then this works … but beyond that it’s a bit of a lost cause. It’s like sports betting — once it’s there then there’s no saying who is (ab)using it.

semiinfinitely 5 hours ago

Tragic twist: repo was entirely AI generated

mixtureoftakes 4 hours ago

media.tenor.com/oW5zO_6gu5gAAAAi/theomegaoof-emoji.gif

afcool83 8 hours ago

Admirable idea and execution…but it does apply opposing evolutionary/economic pressure for AI-slop to become less detectable over time. AI will learn and adapt.

Metaphorically speaking, it’s the Borg we’re dealing with, not the Klingons. All Janeway did was slow the Borg’s progress.

mapontosevenths 7 hours ago

Cory Doctorow wrote a story ~20 years ago about how the first sentient machines would be spam bots because their job is to pass as human, and anti-spam systems provide competitive evolutionary pressure.

He may not be too far off.

alansaber 4 hours ago

It's actually rather difficult for SoTA models to shift tone without losing performance on various datasets, so not such a one-sided arms race.

Dwedit 7 hours ago

What happens if a legitimate site (forums, wiki, etc) gets mass-spammed with slop?

harladsinsteden 7 hours ago

It ceases to be legitimate.

metalman 7 hours ago

Flip it, and build green (organic) lists. Perhaps work towards having sites that don't just not use AI, but never talk about it. It's not just AI: search is a scam. No mojo in the world can extract the contact info for the business next door, and the mountains of porncoin, scamulous garbage and hate news taking up a full 50% of what's left do, in fact, make a determined effort to greenwall a section of the web something to consider.

firebot 7 hours ago

Firefox already feeling more responsive.

filldorns 4 hours ago

Come on guys, it's 2026 and you're still using "blacklist". Why not BlockList?

charonn0 3 hours ago

Because changing blacklist to blocklist, master to main, etc. is a meaningless act of virtue signalling.

Thanemate 2 hours ago

I'd argue it's not meaningless because the point wasn't to show inclusion but power. Nobody went for master's degrees, "master" as a rank in video games, or anything else.

Reminds me of [1]twitch.tv trying to remove "blind playthrough" as a tag to encourage inclusive language.

1. https://www.reddit.com/r/Twitch/comments/k7dvgw/twitch_remov...