Hacker News

by Ryan Harman

Apertus – Open Foundation Model for Sovereign AI (apertvs.ai)

497 points by T-A 20 hours ago

maxloh 19 hours ago

Other fully open LLMs include Allen AI's OLMo 3.1 and MBZUAI's K2 Think V2, both of which have released their full training pipelines and datasets.

Nvidia Nemotron is also an open training source model, though a portion of its dataset remains proprietary.

Quoting lambda's comment:

> Note that the Nemotron models are generally stronger than Olmo and K2 Think V2 (according to Artificial Analysis benchmarks), and there is a lot of overlap in their datasets (lots of datasets are based on the same sources with different filtering, Olmo and K2 Think V2 both have used some Nemotron datasets).

> But yeah, Nemotron is a modern and fairly capable LLM, even the 122b is more capable than Deepseek R1 (a 671b model) on most benchmarks, and there's also the recently released 550b Ultra now.

https://news.ycombinator.com/item?id=48492439

soundworlds 16 hours ago

Allen AI do not get enough love. They are doing GenAI how it should have always been done.

In fact, if the frontier companies had taken their approach, it would have started much slower, but I think we would be far more advanced by 2035. Instead we have a majority of society that wants to see AI fail.

dvt 13 hours ago

> Instead we have a majority of society that wants to see AI fail.

Do you talk to regular people? I work out of coffee shops routinely and literally like 90% of laptops have ChatGPT or Claude open. I was shocked at how many of my friends love the silliest of AI features (like Slack bot summarizing your day or your upcoming meetings), and a lot of decks, proposals, SOW's, etc. are (at least in part) generated with AI these days.

dofm 3 hours ago

johngossman 4 hours ago

jstummbillig 12 hours ago

alfiedotwtf 10 hours ago

scjody 4 hours ago

apercu 3 hours ago

a_136_chiffa 4 hours ago

LLMs were invented by AI2, before Transformers were a thing - with RNN-based ELMO.

polytely 7 hours ago

I don't want AI to fail but I would like to see Altman anf Musk fail for example. I'm very uneasy with the power hungry silicon valley freaks that are running the show at these labs/companies. Hassabis seems like the only one that is not actually Evil.

waffletower an hour ago

sawjet 13 hours ago

Is there any evidence that "a majority of society wants AI to fail"

Or is it just vibes?

markerz 13 hours ago

intended 9 hours ago

hit8run 14 hours ago

Care to elaborate on this?

AndrewKemendo 16 hours ago

Fully agree with this and they were leading robotic learning as well even back to 2019.

IsaacSim was (and might still be) the best robotic learning sim and I ran MLAgents.

typ 9 hours ago

> an open training source model

It's always funny to see people tempted to call open-blobs/open-weights, which are literally shareware like WinRAR or Adobe PDF Viewer, open source, and then need to invent a new term for what is actually open source.

maxloh 4 hours ago

Nemotron is vastly different from standard open-weight models. Its entire training pipeline is open-sourced, while other vendors typically only release the model weights.

wrs an hour ago

vcryan 17 hours ago

Maybe I'll give Nemotron another try. Yesterday I used the latest one on OpenRouter and it was bad - worse than StepFun

SwellJoe 19 hours ago

I like the idea, and it has become more pressing that everyone outside the US think about tech sovereignty because the US has become an unsafe place to keep your data, but the impression I get from Apertus is that it moves at the speed of a committee. I have no expectation they'll deliver a competitive model. At least, not competitive with current models. Maybe competitive with models a year ago (though they haven't even done that yet, right?).

nezuzen 19 hours ago

"the US has become an unsafe place to keep your data"

I empathize with this but curious what would make any other country a better safehaven for your data? I personally like the EU's approach to data safeguards, but are there other locales/data protections you have in mind that would keep your data "safe".

mark_l_watson 4 hours ago

I live in the USA and I use a European LLM as a daily driver: Proton’s lumo+ that does a good job packaging a Mistral model for general chat, with good searchable chat history — all with adequate privacy guarantees. Well worth the money.

I purchase open model tokens for agent programming assistance, and I like lumo+ for everything else.

Another option is DuckDuckGo’s Duck.ai subscription, but I slightly prefer ProtonMail’s lumo+ packaging as a product.

eric_cc 4 hours ago

kitd 10 hours ago

The law varies from country to country, but at least I vote for the legislators creating the laws governing my local sovereign AI.

tensor 13 hours ago

Putting aside reliable rule of law, as others have pointed out, it seems unwise to keep your data in a country that has repeated threatened to annex or invade yours.

digitaltrees 18 hours ago

The rule of law exists in other countries in a way it does not in the US right now.

SubiculumCode 17 hours ago

MrDrMcCoy 17 hours ago

Iceland and Switzerland are probably the best places to keep your data safe. I'd put Norway, Sweden, Germany, and the Netherlands after that, though I don't have much specifics on how good they are at privacy these days.

SilverSlash 15 hours ago

OkWing99 14 hours ago

I think US is the only country that's asked to limit their frontier model access based on the Citizenship of the user.

Let's say Gemini gets to AGI by tomorrow, will my Google account access, or Gemini apps access and data be blocked if I'm not a US citizen? (Anthropic did it with a 5% better model).

If US is classifying the model access based on citizenship, that's similar to treating it as a Defense capability.

sawjet 13 hours ago

PeterStuer 11 hours ago

Most people have had to reluctantly accept their own totalitarian state will control them. They do not want another state to have the same or even more power over them.

AndrewKemendo 16 hours ago

No country is safe. You need to host your own end to end on your own infrastructure if you want to be free.

Stallman was correct in the 80s and is correct now about libre software

jhancock 16 hours ago

From a legal perspective the US may be safer than other places if the US is the one seeking your data. The US doesn't need legal process to authorize digging into your foreign server.

From a practical perspective, I'm not sure any servers are safe anywhere...depending on who may want your data.

markhahn 13 hours ago

mrshu 18 hours ago

By far the most impactful product of the Apretus project are the people. To quote a memorable line from Dominique Paul (https://www.thisiscrispin.com/):

> What most people miss IMO is that this is not a team who is doing this for the fourth time like virtually any other LLM provider and who could learn from its own past experiences. I bet if the team would do another model training they could get way better results at one fourth of the costs.

pferde 19 hours ago

For a model that claims to focus on many languages, it's quite unreliable when it comes to simple questions like "how to say X in language Y" or "how to conjugate verb X in language Y". It keeps hallucinating words that do not exist, and when corrected, it only hallucinates a new lie.

8note 18 hours ago

it probably doesnt know what language each set of words is referencing.

i doubt they are including a lot of training data labeled with the language.

"how to say X in language Y" is a different task from saying X in language Y

einpoklum 9 hours ago

Actually, it isn't all that different. There are only two words separating "how to say X in language Y" from "say X in language Y". And this "vulgar" metric is actually quite relevant for an LLM, which answers based on conversational context.

throwaw12 19 hours ago

Looks like their instruct models are Llama3.1 fine tune from last year. Is there any progress on new models?

My last hope for soverign AI is from Chinese open models

kordlessagain 19 hours ago

Sovereign AI is not about using just one model. It's about using the right model for the right job, and getting them to talk through the solution TOGETHER before presenting the answer.

If you want to mix models like this, check out https://github.com/deepbluedynamics/nemesis8

wg0 13 hours ago

You might dismiss it as nothing but the Linux analogy does not work here either. It is more than that and direct threat to commercial AI labs and their business model. These labs are milking bunch of foundational papers for years now and the end is near.

Going forward would be such open source, open data and open recipe models possibly someday even with the training being crowd sourced if not inference like the BitTorrent model.

Lastly, even Chinese models (GLM, Deepseek, MiMax) work really really good and any user would testify that they do not miss OpenAI/Anthropic/Gemini at all if they're using those Chinese models which is argument enough that with such models, no one is going to miss Chinese models as well.

zitterbewegung 16 hours ago

Sort of interesting license not sure if anyone will do it long term.

The training data and the Apertus LLM may contain or generate information that directly or indirectly refers to an identifiable individual (Personal Data). You process Personal Data as independent controller in accordance with applicable data protection law. SNAI will regularly provide a file with hash values for download which you can apply as an output filter to your use of our Apertus LLM. The file reflects data protection deletion requests which have been addressed to SNAI as the developer of the Apertus LLM. It allows you to remove Personal Data contained in the model output. We strongly advise downloading and applying this output filter from SNAI every six months following the release of the model.

reconnecting 18 hours ago

A chat interface where you can try Apertus:

https://chat.publicai.co

einpoklum 9 hours ago

You will need to register with an email and password though, i.e. your sessions will be recorded and identified.

Also even after you do that, and start a chat, you currently get:

  "JSON.parse: unexpected character at line 1 column 1 of the JSON data"

so it's not quite there yet.

Bobaso 5 hours ago

Apertus V1 performance were sub-par. The Team is working on v2 ATM. Looking forward to testing it.

khalic 5 hours ago

I don't know, I'm implementing a translation system right now, and Apertus is very good for the model size. I wished they added some chain of thought training to increase precision and context understanding.

yreg 20 hours ago

previous thread: https://news.ycombinator.com/item?id=45108401

jawns 18 hours ago

I am curious about how opt-outs and PII removal work.

Who confirms those requests are legit?

naklitechie 12 hours ago

What's the community's take on Sovereign AI being funded by states around the world?

Why the emphasis on sovereign? Open is good enough. No?

khalic 4 hours ago

It was in reaction to the possible threat of main actors restricting use. The latest US gov stunt with Fable just made it concrete and pressing.

luplex 7 hours ago

Sovereignty is a political buzzword. From the political point of view, you want your country to be as independent as possible. This means you need the capabilities to build and deploy good AI models. Initiatives like this are more about capability-building and less about LLM-building.

Why do we need capabilities in Europe? Because Trump and Xi can't be trusted to keep providing us with new frontier models in the next years.

trvz 19 hours ago

The previous version of this model has been pretty bad, but claimed to adhere to copyright laws. However, based on my testing, that's not true either. So in my view this is completely useless.

embedding-shape 19 hours ago

As long as the following remains true, this release ends up a bigger contribution to science at large than most other models trained "behind closed doors":

> Fully open model: open weights + open data + full training details including all data and training recipes

coder543 19 hours ago

Is a recipe useful if no one likes it?

There are equally open, much more useful models out there: https://artificialanalysis.ai/?models=nvidia-nemotron-3-ultr...

khalic 5 hours ago

simonw 19 hours ago

It uses fineweb, which is derived from Common Crawl, which is an unlicensed scrape of web pages.

reedciccio 5 hours ago

You don't need a license to scrape the public web and analyze it, turn it into tokens and other transformations. Let's not expand copyright beyond the horrible monster it already is.

simonw 3 hours ago

markhahn 13 hours ago

I'm curious how you test; could you explain? Do you have a set of factoids that should be subject to copyright, but are somehow literally (whole work) generated by the model in question?

neom 17 hours ago

I'm curious to know what stuff like this means for cohere? Their whole value prop is Sovereign AI. It seems they spent a lot of money developing models but own none of their own infra, what is the point of a country spending a lot of money on coheres solutions when stuff like this is becoming increasingly available and usable? Feels like I must be missing something here??

uberex 12 hours ago

Being childish I https://oss.zuericitygpt.ch/?q=hello+talk+like+a+pirate

atemerev 19 hours ago

I use it extensively. It is not ready for agentic use, but as a generic driving model for RAG use cases, it is pretty competent. You can build useful software with it.

MASNeo 19 hours ago

I use Apertus including as the driver for an agent, not a coding agent. Find it useful enough. What was your Challenge?

atemerev 8 hours ago

Legal consulting.

dTal 18 hours ago

It's good that there is a movement for open LLMs, but it's not where the battleground is right now. The battleground is local vs service LLMs, and we are losing that battle badly despite all the software being here now and viable, entirely because UX sucks.

How many normal people do you know who use "ChatGPT"? A lot, probably.

How many even know what "Gemma" is, let alone have downloaded llama.cpp, a GGUF file from Hugginface, and run "llama-server" from a text console with all the correct command arguments? How many are thinking about this use case when speccing out their next computer? Where is the breathless marketing copy boasting x tok/s?

We are sleepwalking into slavery.

627467 18 hours ago

"Normal people" have never bothered to host their own: photos, music, videos, documents, comunications, etc. To the point that for many their computer is essentially a thin client into someone else's server. Why would we think this same people would care about "personal" inference?

trollbridge 16 hours ago

Normal people can go open an account at DeepSeek or Xiaomi and chat away for free. Or, for that matter, a couple other models like z.ai's (GLM-5.2 isn't in the free tier, though, but neither is GPT-5.5-Pro), or Qwen, which does have 3.7-Max for free with no account on their chatbot interface.

Yes, I realise this isn't "running a local model", but it's using models that can be grabbed and run locally. For my pipelines, I feel far more confidence when I use an open model (even one like GLM-5.2 that would be expensive for me to run) since I have a backup plan if the hosted/cloud option becomes unworkable for me. If that happens to me with Opus, I have zero options.

cdata 17 hours ago

If our strategy to avoid "slavery" involves "normal people" taking the local-vs-managed choice seriously, we have already lost.

This choice is made for us. The deciding factors will be convenience and economics.

My sense is that just like Web 2.0 SaaS we are destined for servitude.

A better strategy is to play an assymetrical game IMO. Don't let your would-be master write the rules by which you play.

yeeeloit 16 hours ago

> A better strategy is to play an assymetrical game IMO. Don't let your would-be master write the rules by which you play.

What do you mean by this? Do you have an example in the given context?

8note 18 hours ago

normal people dont really have the hardware to run local models

dTal 6 hours ago

Anyone with an M-series Apple computer can run something very competently. Mac Pro users can run 30B class models which is good enough for the vast majority of practical everyday purposes, far better than the original ChatGPT was. Anyone with a gaming computer is in a similar situation. The rest of us can still run stuff, just not as big or as fast.

sosodev 17 hours ago

They have it, we just haven’t enabled them. The smart model with a chat box is the wrong abstraction for local. Ideally we would have it built into applications as a clear and easy to use opt-in feature. Like allowing a user to index a folder on their hard drive and then search it semantically via embeddings. You could do that on fairly low end hardware these days. Like 2GB of RAM with any processor made within the last 10 years.

manithree 17 hours ago

They may not right now, but the whole point of Microsoft's Copilot+ PC standard (even though it's somewhat anemic) is to run models locally. Apple Silicon with enough unified memory is capable. Not to mention modern iPhones and Pixels have fairly capable NPUs and routinely run local models. So, we may not be to the point where most normal people have the hardware to run local models, but it is rapidly approaching.

Danox 14 hours ago

As time goes on, they’re almost certainly will be very capable local models in the long run we (general computer users) aren’t going back to the era of mainframe computing no matter how much OpenAI, Meta or Google would like us to.

dTal 6 hours ago

trollbridge 16 hours ago

Gamers can run Qwen 3.6 quantised models now.

You would also be shocked what's possible on a 64GB Mac Studio, which isn't that unattainable.

conception 17 hours ago

Google Edge Gallery is turn key for people and on the device most people chatgpt on. Just like with most Google Stuff “edge gallery” is maybe the worst name possible for “run AI on your phone”!

theptip 18 hours ago

Why do you feel the important part _now_ is where the weights get run?

I can see this as a future battleground but access to frontier models (which you cannot run locally) seems a lot more relevant today.

dTal 6 hours ago

Because the local LLMs available today are already fantastic, and the difference between no LLM and an open weights LLM is much smaller than the gap between an open LLM and a so-called "frontier" model.

It's important that people get used to the idea that your interactions with a language model are a highly personal thing. LLMs can perceive and categorize us in ways we can't even imagine, far more violently than the simple algorithmic feeds which have already corroded public discourse so much. LLMs can control us. LLMs warp the information landscape more radically than even the internet did. Even now you are likely underestimating their role in future society.

The principles of software freedom are becoming existentially important.

itkovian_ 17 hours ago

You can’t run a closed llm locally. Strange to frame the dichotomy as between local and open. One begets the other.

idiotsecant 18 hours ago

Better UX does not buy you a datacenter farm to train state of the art cutting edge models. Right now the only people who can do that are the technobility class.

dTal 18 hours ago

It does not, but it might encourage more people to care. Worrying about training is a luxury when you are starting from a baseline of "OpenAI spies upon me and controls my access". Let's focus on getting every Tom, Dick and Harry 1) on board with LLMs, because they're happening, 2) habitually using local software.

trollbridge 16 hours ago

The same used to be true of being able to program computers and compile software.

Of course the frontier will always be unattainable, but that's like pointing out that I couldn't buy my own Cray supercomputer.

azinman2 18 hours ago

> We are sleepwalking into slavery.

That’s a bit hyperbolic…

MrDrMcCoy 16 hours ago

Some hyperbole is useful. The problem is real and serious, though short of the specific verbiage.

0gs 18 hours ago

it's funny because i made this thing (called enough) that aims to make it easy for non-technical people to get up and running with local models quickly, but it is impossible to figure out how to break through the noise. every thread and comment like this breaks my heart a lil bit

dTal 6 hours ago

Link? You have to tell us if you want to break through the noise!

0gs 3 hours ago

double0jimb0 18 hours ago

Yea, anyone who understands what makes products actually usable is opting to get paid for said skill.

bsder 16 hours ago

> we are losing that battle badly despite all the software being here now and viable, entirely because UX sucks.

Yep. I'm an old time Linux sysadmin, but I am COMPLETELY baffled as to what I can or cannot run on my 32GB R9700 with 128GB main CPU memory.

If I want something Claude or Codex like what do I use that would be useful? If I want a chat system, what do I use? Images--apparently ComfyUI for setup but after that what do I do?

I don't even mind spinning up something in the cloud for a bit, but I need to know how I'm going to get data up and down without racking up massive bandwidth charges.

I'd love to do some tinkering, but the field is moving so fast and so full of charlatans that cleaning the dross out is almost impossible.

entrope 5 hours ago

For coding, Qwen3.6-27B with MTP should fit in 32GB with almost full context length for Unsloth's 5-bit quantization. That's my preferred choice for a local coding agent on similar hardware: the quality delta compared to a MoE model is IMO worth the extra wait. (And I haven't found a model with 70B-120B parameters that works better for coding.) For general chat, maybe gpt-oss-120b? It should have more general knowledge than a 30B-class model; I've used it to suggest itineraries for trips and to review the completeness of small requests for proposals.

I don't have recommendations for images because I haven't played with those.

markhahn 12 hours ago

these days, even completely mainstream distros (Fedora here) include ollama, which leverages a wide range of hardware and range of models. (it's generally useful to install a more recent ollama, though.) there are free coding harnesses too.

dTal 6 hours ago

wmf 18 hours ago

LM Studio

JSR_FDED 15 hours ago

From a sovereign AI perspective, how does this compare to Mistral?

luplex 7 hours ago

It's a different country, and Switzerland is not even in the EU.

JSR_FDED 7 hours ago

True. What France (as an EU member) and Switzerland (not as an EU member) share is a desire for sovereign AI. I am interested how their efforts compare, and how their LLMs thus far compare.

holistio 17 hours ago

Knowledge cutoff is March 2024. Incredible.

uberex 11 hours ago

Does anyone care about this anymore with context windows and tool harnesses.

pizlonator 14 hours ago

> compliant at scale

The jokes write themselves.

_pdp_ 19 hours ago

I want to believe.

david_shi 16 hours ago

These models don't seem very competitive, who's their target audience?

poplarsol 16 hours ago

Europeans who fetishize "compliance".

markhahn 12 hours ago

residents of the universe who recognize the US as a supply-chain risk.

no, actually, from the docs it sounds mainly motivated by the country's unique linguistic requirements.

dangoodmanUT 17 hours ago

How are they going to be competitive with top models at 70B size?

kennywinker 13 hours ago

Qwen et al shows size isn’t actually the only useful metric for an llm.

nisten 17 hours ago

As an opesource AI researcher with a lot of models and datasets on huggingface I am very appreciative of these types of project but we are ignoring the elephant in the room here ( or lack of )

the swiss have no gpus

T-A 10 hours ago

the Apertus model was trained on the Alps supercomputer, operational at CSCS since September 2024, a data center of over 10'000 top-of-the-line NVIDIA Grace-Hopper chips

https://log.alets.ch/110/

kennywinker 16 hours ago

How is this a real problem? Genuine question, because i don’t really understand the urgency of everyone buying up ram and gpus as prices for those skyrocket.

I can run the 8B version of this swiss-ai model on a ten year old GPU. For the larger one, $2000 consumer hardware can run it fine. Beyond that, there are plenty of places where time on a GPU can be rented, and if the model is good, there will be hardware to run it.

pu_pe 10 hours ago

You can run it, but you can't train it. While this type of toy model could actually be trained in Swiss equipment, a state-of-the-art LLM probably could not.

My charitable reading of GP's point is that the bottleneck for true compute sovereignty is the chips, not the models.

khalic 5 hours ago

Do some research before posting that kind of stuff

markhahn 12 hours ago

why do you say the Swiss have no gpus?

markab21 17 hours ago

I'm mildly surprised that more people aren't using Nemo models for this reason. We've moved most of our processing to a combination of Nemo Ultra and Super, with some support for multi-model-specific tasks on Omni. The setup is working REALLY well for us, and I'm comfortable with the more measured pace of improvements. We work with many long-context problems, and the ecosystem is great.

There were a number of use cases where we needed to use Gemini (audio modality), and Ultra has been a VERY cost-effective alternative once we got through the nuances.

firstrowraver 9 hours ago

apertvs.ai? seriously?

andrewshadura 12 hours ago

Not to be confused with Apertium and Apertis.

iamyemeth 8 hours ago

> Conclusion There are 2 r's in the word "strawberry".

Not looking good so far

sigmoid10 7 hours ago

I guess they still use a tokenizer? Why would this kind of issue be solved? The model fundamentally can't see the word character by character like you do. For o200k tokenizers for example, what the model sees are 3 tokens: [302, 1618, 19772]. These are shown to you as ["st", "raw", "berry"] in the UI. The only way any model can infer individual characters is by using external tools or implicit knowledge picked up during training or (what many of the big labs apparently do) special training for these edge cases that fail once the next special case comes along.

maxloh 19 hours ago

Great to see more fully open LLMs.

I think a problem with open-weight models is that while you can improve them, you are not going to create the next generation of LLMs by fine-tuning. We are at the mercy of frontier labs for access to SOTA LLMs. For example, Anthropic recently started requiring identity verification for Claude [0], same for OpenAI [1].

If one day China's distillation labs stop releasing their LLMs as open-weight, I doubt American labs will continue to release free LLM weights without that competition.

That's where fully open pipelines shine: they enable the community to create the next generation of SOTA LLMs. That is the only way LLMs truly become sovereign.

[0]: https://news.ycombinator.com/item?id=48618455

[1]: https://news.ycombinator.com/item?id=48618606

anon373839 18 hours ago

> China's distillation labs

This notion that Chinese labs are merely distilling frontier models is quite an unwarranted slur. Those labs have published WAY more useful research than US labs on RL techniques, novel model architectures, training pipelines, etc. They have also hit intelligence-per-parameter densities that US labs have yet to attain.

Apart from that, merely training a model on outputs from another model, off policy and without the logits, doesn’t really work that well.

The Chinese labs know how to build frontier level models. GLM-5.2 shows that they no longer even need Nvidia chips to do it.

trollbridge 16 hours ago

It's one of those lies people tell themselves to make themselves feel better. "Oh, they're just copying my stuff."

Chinese labs are basically just telling everyone, out in the open, what they're doing and how to do it, and the answer from American frontier labs is "Well, they couldn't possibly be getting the results they're getting without just distilling our models," and the American labs aren't even trying to do some of the stuff like DS's aggressive caching to get costs down.

Vaslo 17 hours ago

I recently watched a video for one of these “Chinese Models” it kept insisting it was Claude when the user asked. Sorry, there’s no “slur” here but legit suspicion.

c0rruptbytes 17 hours ago

anon373839 16 hours ago

halJordan 17 hours ago

But have they? I understand that the Chinese side is illuminated and the American side is dark. I disagree that the Chinese labs have created anything that isn't in an American research lab or production dc. Sure the Chinese have published their findings and not for nothing. But are they novel? Unlikely imo

chriskanan 17 hours ago

dofm 19 hours ago

> We are at the mercy of frontier labs for access to SOTA LLMs

I disagree with this use of SOTA, and this topic is why.

Anthropic and OpenAI have “cutting-edge” models. These are beyond the state of the art but they are closed, secretive, hard to quantify.

The “state of the art” is open source, open weights models that can be inspected, studied, shared and critiqued, because that is what is meant by “the art” —- it is the knowledge and principles and evidence and materials available to all. The “state of the art” is the highest point of that.

I wish we could make this distinction and stop blessing two secretive, unverifiable loss-making companies with so much power.

(Putting that aside, I suspect — without evidence, mind you - that the endless march to solving models by making them bigger is not the solution anyway.)

MangoCoffee 16 hours ago

SOTA LLMs is less important than cheap token and Chinese AI labs is releasing model that is only about 6-8 months behind American AI labs.

Chinese's model like GLM is getting better for coding task and its cheaper. Microsoft Github copilot have to switch billing to token based. the cost of AI have increased since agent come into play. whoever can offer cheaper token to do task will win.

even Microsoft is looking into Deepseek for cheap token.

https://www.axios.com/2026/06/16/microsoft-copilot-cowork-to...

sockaddr 19 hours ago

Sorry but I think you’re requirement that something only be “the art” if any arbitrary person can critique it is off. The frontier labs are working on the state of the art but it’s just art that you aren’t allowed to see. Unfortunately.

Hacker News

by Ryan Harman

Apertus – Open Foundation Model for Sovereign AI (apertvs.ai)

maxloh 19 hours ago [-]

soundworlds 16 hours ago [-]

dvt 13 hours ago [-]

dofm 3 hours ago [-]

johngossman 4 hours ago [-]

jstummbillig 12 hours ago [-]

alfiedotwtf 10 hours ago [-]

scjody 4 hours ago [-]

apercu 3 hours ago [-]

a_136_chiffa 4 hours ago [-]

polytely 7 hours ago [-]

waffletower an hour ago [-]

sawjet 13 hours ago [-]

markerz 13 hours ago [-]

intended 9 hours ago [-]

hit8run 14 hours ago [-]

AndrewKemendo 16 hours ago [-]

typ 9 hours ago [-]

maxloh 4 hours ago [-]

wrs an hour ago [-]

vcryan 17 hours ago [-]

SwellJoe 19 hours ago [-]

nezuzen 19 hours ago [-]

mark_l_watson 4 hours ago [-]

eric_cc 4 hours ago [-]

kitd 10 hours ago [-]

tensor 13 hours ago [-]

digitaltrees 18 hours ago [-]

SubiculumCode 17 hours ago [-]

MrDrMcCoy 17 hours ago [-]

SilverSlash 15 hours ago [-]

OkWing99 14 hours ago [-]

sawjet 13 hours ago [-]

PeterStuer 11 hours ago [-]

AndrewKemendo 16 hours ago [-]

jhancock 16 hours ago [-]

markhahn 13 hours ago [-]

mrshu 18 hours ago [-]

pferde 19 hours ago [-]

8note 18 hours ago [-]

einpoklum 9 hours ago [-]

throwaw12 19 hours ago [-]

kordlessagain 19 hours ago [-]

wg0 13 hours ago [-]

zitterbewegung 16 hours ago [-]

reconnecting 18 hours ago [-]

einpoklum 9 hours ago [-]

Bobaso 5 hours ago [-]

khalic 5 hours ago [-]

yreg 20 hours ago [-]

jawns 18 hours ago [-]

naklitechie 12 hours ago [-]

khalic 4 hours ago [-]

luplex 7 hours ago [-]

trvz 19 hours ago [-]

embedding-shape 19 hours ago [-]

coder543 19 hours ago [-]

khalic 5 hours ago [-]

simonw 19 hours ago [-]

reedciccio 5 hours ago [-]

simonw 3 hours ago [-]

markhahn 13 hours ago [-]

neom 17 hours ago [-]

uberex 12 hours ago [-]

atemerev 19 hours ago [-]

MASNeo 19 hours ago [-]

atemerev 8 hours ago [-]

dTal 18 hours ago [-]

627467 18 hours ago [-]

trollbridge 16 hours ago [-]

cdata 17 hours ago [-]

yeeeloit 16 hours ago [-]

8note 18 hours ago [-]

dTal 6 hours ago [-]

sosodev 17 hours ago [-]

manithree 17 hours ago [-]

Danox 14 hours ago [-]

maxloh 19 hours ago

soundworlds 16 hours ago

dvt 13 hours ago

dofm 3 hours ago

johngossman 4 hours ago

jstummbillig 12 hours ago

alfiedotwtf 10 hours ago

scjody 4 hours ago

apercu 3 hours ago

a_136_chiffa 4 hours ago

polytely 7 hours ago

waffletower an hour ago

sawjet 13 hours ago

markerz 13 hours ago

intended 9 hours ago

hit8run 14 hours ago

AndrewKemendo 16 hours ago

typ 9 hours ago

maxloh 4 hours ago

wrs an hour ago

vcryan 17 hours ago

SwellJoe 19 hours ago

nezuzen 19 hours ago

mark_l_watson 4 hours ago

eric_cc 4 hours ago

kitd 10 hours ago

tensor 13 hours ago

digitaltrees 18 hours ago

SubiculumCode 17 hours ago

MrDrMcCoy 17 hours ago

SilverSlash 15 hours ago

OkWing99 14 hours ago

sawjet 13 hours ago

PeterStuer 11 hours ago

AndrewKemendo 16 hours ago

jhancock 16 hours ago

markhahn 13 hours ago

mrshu 18 hours ago

pferde 19 hours ago

8note 18 hours ago

einpoklum 9 hours ago

throwaw12 19 hours ago

kordlessagain 19 hours ago

wg0 13 hours ago

zitterbewegung 16 hours ago

reconnecting 18 hours ago

einpoklum 9 hours ago

Bobaso 5 hours ago

khalic 5 hours ago

yreg 20 hours ago

jawns 18 hours ago

naklitechie 12 hours ago

khalic 4 hours ago

luplex 7 hours ago

trvz 19 hours ago

embedding-shape 19 hours ago

coder543 19 hours ago

khalic 5 hours ago

simonw 19 hours ago

reedciccio 5 hours ago

simonw 3 hours ago

markhahn 13 hours ago

neom 17 hours ago

uberex 12 hours ago

atemerev 19 hours ago

MASNeo 19 hours ago

atemerev 8 hours ago

dTal 18 hours ago

627467 18 hours ago

trollbridge 16 hours ago

cdata 17 hours ago

yeeeloit 16 hours ago

8note 18 hours ago

dTal 6 hours ago

sosodev 17 hours ago

manithree 17 hours ago

Danox 14 hours ago

dTal 6 hours ago

trollbridge 16 hours ago

conception 17 hours ago