8086 Segmented Memory was a good idea (owl.billpg.com)

58 points by billpg 2 days ago

st_goliath 8 hours ago

> 8086 Segmented Memory Was a Good Idea.

Yet the article goes about the most ass backward way of explaining 8086 segments and constructs a convoluted mental picture of dividing memory into overlapping chunks.

It's really, really simple: segments on the 8086/88 are 64k sliding windows into a 1M address space. You can move them around at 16 byte granularity.
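To make the sliding-window picture concrete, here is a minimal C sketch of the real-mode address calculation. The function name `linear_address` is made up for illustration; the shift by 4 and the wrap at 20 bits follow how the 8086 forms physical addresses.

```c
#include <stdint.h>

/* Real-mode 8086 address formation: the 16-bit segment value is shifted
 * left by 4 (multiplied by 16) and the 16-bit offset is added. The 8086
 * has only 20 address lines, so the sum wraps at 1 MB. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

Bumping the segment value by 1 moves the whole 64k window up by 16 bytes, so `linear_address(0x0001, 0)` lands on 0x00010.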

You need more than 64k for code + data? No problem, the CPU knows when it's fetching an instruction vs when it's fetching data, you can have two sliding windows: code (CS) and data (DS). Split them apart, and it's not much different than a Harvard-style machine and gives you access to more than 64k at a time.

Still need more? No problem, the CPU has a hardware stack with dedicated push/pop/call/ret instructions and a base pointer for stack indexing. It knows when it's accessing the stack, so we can split the data window into regular data (DS) and stack data (SS). Oh, you occasionally want to copy stuff between segments or somewhere else in memory? Well, to encode 3 segments we need 2 bits anyway, let's throw in an extra data window (ES) and some DS-to-ES copy instructions.

gdwatson 7 hours ago

It was a clever hack for porting existing code. But it doesn’t scale at all – you’ve just described adding four registers to a register-starved architecture in order to solve the issue for one CPU generation or so.

peterfirefly 6 hours ago

Segment prefixes were rarely needed and you didn't need to spend any of the precious mod/rm bits on segment registers. The GPR count was limited to 8 partly because of the 3 bits allocated to specifying them and partly because of limited die space. Segment registers only added slightly to the latter cost.

juancn 4 hours ago

Yeah, but it was probably the right call at the time.

Backward compatibility was a breath of fresh air at a time when code needed constant porting and rewriting. No two machines were alike.

It's one of the reasons the PC became so popular.

rob74 7 hours ago

Plus all this pointer juggling would have been more or less ok (or not ok, but doable) when programming in assembly, but for a compiler it would have been a recipe for disaster...

bluGill 4 hours ago

AshamedCaptain 4 hours ago

I think you are missing the entire point of the article (which I kinda agree with), and just repeating the popular wisdom.

In that era, a machine with "object addressing" sounded like a perfectly valid futuristic design (what a Lisp machine strived to be; I guess today you would call it tagged memory of some kind). The 8086 is not that, but the original design would have allowed it to evolve into something like that.

The article's point is that since programmers simply treated it as a sliding window (instead of an opaque object handle), the plan could not be implemented, and the half-assed thing became stuck.

Having seen other Intel RISC designs, I fully agree with the premise.

AnimalMuppet 4 hours ago

But with only four segment registers, you couldn't treat it as that - not unless you only had four objects. So treating it as a sliding window was all you actually could do with it.

bonzini 3 hours ago

AshamedCaptain 34 minutes ago

wewewedxfgdf 7 hours ago

I seem to recall at the time that flat memory was self evidently a better idea. It's not like people were sitting around going "gee I can't think of any better way to do memory addressing than this" until some genius suggests "how about flat?!?!?" Everyone knew flat was best but were stuck with 8086 crap.

phkahler 3 hours ago

I was drooling over the MC68000. I had a pre-release reference manual from 1979 and it was the most awesome chip around. Atari and Commodore and Apple used it after their 6502 systems. Arcade games used it after their 6502 and 6809 days. The only reason x86 became popular was because IBM put it in the PC - not for any technical reasons.

trollbridge an hour ago

A 68k was roughly double the cost of an 8086, plus the additional cost for the rest of the supporting chips.

The PC was intended to be cheap and was competing with 8 bit machines. Being 16/20 bit made it already high end.

If you wanted 24 or 32 bit, IBM had many other machines to sell you. Or you could just buy a VAX.

trollbridge an hour ago

Flat is not necessarily a better idea. In 1978, a 32 bit CPU would be stupidly expensive. The use case for > 64k was to simply have code and data split apart, and also have some MMIO, so basically 192k-256k of addressing needed.

Segmentation meant programs could remain essentially 16 bit with all the benefits to that like smaller code size.

Someone 6 hours ago

Not entirely self-evidently. Position independent code was slower at the time and avoiding having to patch function addresses at load time is a net win.

More importantly, there’s backwards compatibility. By the time the 8086 came out, people had spent serious money on getting binary-only software (WordStar cost hundreds of dollars, for example). “Buy this computer, and you can keep running the software you paid for, but faster” was a good selling point.

rickas 4 hours ago

I wonder why Intel couldn't simply introduce a single 'offset' register for apps ported from 8080, and make all other registers 20-24bit. Why bother with 4 different registers and all that segmentation nightmare?

trollbridge an hour ago

ThrowawayR2 2 hours ago

Flat memory was a better idea but it wasn't a cheaper idea, and cheaper beats good. A casual Googling shows a price of $2,880 for an IBM PC in ~1982 dollars versus $8,900 for a Sun Microsystems Sun-1 with a MC68000.

amiga386 34 minutes ago

That was just Sun's market segment.

Also from a casual Google, an IBM PC launched in 1985 (picking the XT 5160-078 as an example) was around $4,995 at launch. Compare that to two 68000-based computers also launched in 1985: the Amiga 1000 launched at $1,295 and the Atari 520ST launched at $600.

These computer system prices - Sun, IBM, Commodore, Atari - came from the market segments the manufacturers aimed to sell to, rather than a cost saving enabled by cumbersome memory models.

The cumbersome memory model is just a historical accident; the 8086 designers wanted 8080 backwards compatibility so they could sell to former 8080 users, IBM did not require this. IBM would've picked the 68000 had it been "production ready" at the time, they did not reject flat memory on cost grounds!

PaulHoule 7 hours ago

Personally I enjoyed writing assembly with segments. You can have 64k of code, 64k of data and 64k of stack without trying. So long as no individual data structure is larger than 64k there is no essential difficulty working with 16 bit pointers.

When I think back I think it would be fun to have a hierarchical structure where composite data structures (think an array or hash map) are referred to with a pointer that goes into the segment register and you index inside a data structure with a regular pointer.

trollbridge an hour ago

Lots of 8086 code was written that way. You’d use the segment register on paragraph alignment and basically take advantage of the << 4 + logic.

This code was a nightmare to port to protected mode 80286 so it went away by the Windows 3.1 era.

KellyCriterion 7 hours ago

> Everyone knew flat was best but were stuck with 8086 crap.

This! That's one of the most interesting things to me: Actually very often in the IT-world, the worst competitor won the race while better solutions were known and available: Microsoft, Intel etc.

Esp. that MS won for decades while making mainly a very bad OS, though they have some good enterprise products.

How would the world look if Unix/BSD had won this race?

JdeBP 5 hours ago

There was a Microsoft operating system for protected mode 80286 in 1983. It was Xenix 286; it was basically 7th Edition Unix without the branding; and it was touted as the multiuser operating system that MS-DOS would be the gateway to. How would the world look if Microsoft had won the race? (-:

trollbridge an hour ago

A PC with DOS was cheaper than a PC with Xenix, let alone a typical Unix machine. Also much easier to learn to use.

Macs also existed but were expensive. The PC with DOS was both powerful and cheap.

Ekaros 7 hours ago

I really wonder if Unix is the best we can do. Or is it also the worst? So in the end two of the worst options won. It did make sense back in time. But could it have been replaced with something better later?

bux93 6 hours ago

amiga386 4 hours ago

wewewedxfgdf 7 hours ago

PunchyHamster 6 hours ago

wewewedxfgdf 7 hours ago

Being technically best has long been known to not be correlated to market success in the way that makes logical sense to technical people who feel confused about this.

alerighi 4 hours ago

Except that later we returned to a sort-of segmented memory. That is of course paging: our programs allocate pages of memory that are a fixed size (4096 bytes) and are arranged in memory or in swap space however the OS decides.

We just have the illusion of a "flat" memory model, but it's not really flat; the CPU and the operating system do an important job in translating our flat memory model into something that is not flat at all. All that address translation work could have been avoided if we accepted not having a flat memory model and being aware that our memory is divided into pages.

Basically we are doing in hardware the job of managing a non-flat memory space that the programmer, or well, the compiler (or these days you would say the AI agent) could probably do better, because it knows how to allocate things to avoid placing them across page boundaries, and all of this to give the programmer the illusion of working with a flat memory (except when it does something wrong and gets a segmentation fault, which, as the name suggests, is a hint that in the end the memory is not really flat).
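For anyone who wants the page arithmetic spelled out, here is a small C sketch of how an address splits into a page number and an offset with 4096-byte pages, plus the "does this object straddle a page boundary" check an allocator would care about. The names `PAGE_BYTES`, `page_number`, `page_offset` and `crosses_page` are invented for this example and are not any OS's API.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BYTES 4096u  /* the fixed page size mentioned above */

/* Which page an address falls in, and where inside that page. */
uint64_t page_number(uint64_t addr) { return addr / PAGE_BYTES; }
uint32_t page_offset(uint64_t addr) { return (uint32_t)(addr % PAGE_BYTES); }

/* True if an object of `len` bytes (len > 0) starting at `addr`
 * occupies more than one page. */
bool crosses_page(uint64_t addr, uint64_t len)
{
    return page_number(addr) != page_number(addr + len - 1);
}
```

An 8-byte value at 0xFFC straddles two pages, while a full 4096-byte buffer at 0x1000 sits in exactly one.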

bluGill 4 hours ago

In turn though, the thinking needed to handle non-flat memory is a complexity that most programmers cannot handle - and even those who could probably should spend their brain power on the complex parts of their program, not on managing memory. Best to leave that hard part to a few experts instead of making everyone understand it.

The above is very similar to the argument that you should use a garbage collected language.

trollbridge an hour ago

wang_li 3 hours ago

Paging is not part of the CPU architecture. On CPUs of the time the MMU that brought paging to the party often was a completely separate peripheral that the CPU interacted with to gate access to RAM. By contrast segments are an integral part of the CPU instruction set and your code either has to limit itself to 64KB or your application had to be aware of and include logic to manage segments.

As an aside, the memory model is flat, it's just not physically linear when implementing virtual memory addressing.

deepsummer 5 hours ago

1992-me hates the author. Coming from 68k assembly, x86 was a nightmare. And together with the ridiculous number of registers, segments made up a huge chunk of that horrible experience.

forinti 4 hours ago

I knew a bit of 6502 assembly, so I was happy just to have more RAM and more registers.

Looking back, the simplicity of the instruction set seems quaint next to the thousands of instructions we have today.

billpg 4 hours ago

1992-author (me) is wondering if he'll ever get a girlfriend.

(And I completely agree.)

projektfu 4 hours ago

What? It has 4 times as many general purpose registers as you'd ever need, right? /s

deftio 3 hours ago

Agree.. 68k assembly was dreamy compared to 80x86..

BearOso 3 hours ago

AKSF_Ackermann 7 hours ago

The segment model seems clever if you assume that you never have an object that is larger than 64kb. And once you have that you need to care about segment overflow, pointer comparisons no longer work, everything now has to carry around segment+offset instead of just offset, and so on. And if you want an example of a >=64kb object - the html alone for that page is one.
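The pointer comparison problem is easy to show in a few lines of C. This is only a sketch: `far_ptr`, `same_bits`, `same_byte` and `normalize` are names I made up, and the struct stands in for what a DOS compiler would call a far pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a real-mode far pointer: segment and offset kept separately. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

static uint32_t to_linear(far_ptr p)
{
    return (((uint32_t)p.seg << 4) + p.off) & 0xFFFFFu;
}

/* Cheap comparison: just compare the two 16-bit halves. */
bool same_bits(far_ptr a, far_ptr b)
{
    return a.seg == b.seg && a.off == b.off;
}

/* Correct comparison: do both pointers name the same byte of memory? */
bool same_byte(far_ptr a, far_ptr b)
{
    return to_linear(a) == to_linear(b);
}

/* Canonical form: push as much as possible into the segment so the
 * offset ends up in the range 0-15. */
far_ptr normalize(far_ptr p)
{
    uint32_t linear = to_linear(p);
    far_ptr n = { (uint16_t)(linear >> 4), (uint16_t)(linear & 0xFu) };
    return n;
}
```

1234:5678 and 179B:0008 both name linear address 0x179B8, yet `same_bits` says they differ. Comparing normalized pointers (or linear addresses) fixes that, at the cost of extra arithmetic on every comparison.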

rep_lodsb 7 hours ago

A lot of that is just bloat that you wouldn't have had back then. But it could still be handled by an 8086, not by storing the raw HTML in memory at all, but parsing it as it loads. Each DOM node would be its own object with child pointers, with attributes and names all converted into binary numbers of (at most) 32 bits each.

64K of actual text content in a single node could be reached in some documents, but it's not that small, more than a chapter of a typical book.

What was always a problem for segmented memory was graphics, at least if you wanted higher resolution than 320x200 at 256 colors. But you could have a segment pointer to each row of pixels instead of an entire image, as long as it would still fit within 1 MB (16 MB in the 286 protected mode).

AKSF_Ackermann 7 hours ago

True, graphics is a better example of a period-correct >=64k work, but the point is that there are multiple things where you don't expect the data to be that big until it suddenly is.

billpg 4 hours ago

Why would you ever want a single 64k object? That's like an entire machine's worth of memory!

carry_bit an hour ago

It's basically a 16-bit machine with PAE; 32-bit with PAE runs into similar issues if you want an object larger than 4GB.

hexmiles 8 hours ago

What is the difference between the segmentation model used by Intel and the banking model used by a lot of consoles? I've worked with the code of a couple of NES and GBC games, and while banking could be annoying, I never saw it as a particularly difficult model to follow and use. It did require more planning for the various functionality, but it wasn't even the most complex or difficult thing about developing for consoles.

Someone 7 hours ago

> and while banking could be annoying, I never saw it as a particularly difficult model to follow and use

Segments aren’t conceptually difficult, either, but definitely could be annoying, and certainly were, if you had to access data structures larger than 64 kB.

As to the differences:

- you had four segment registers that you could ‘point’ anywhere, allowing you to access four 64kB regions of memory without changing them (the equivalent of bank switching). One was always used for fetching the instruction to run and one for accessing the stack, but you could use those for other purposes, too (could, not should).

- segments can overlap. You could set DS and ES to the same value, for example.

Segments also can be moved at 16-byte granularity. If you wanted, you could have DS address memory range 0x00000 ≤ x ≤ 0x0FFFF and SS address memory range 0x00010 ≤ x ≤ 0x1000F.

pjc50 4 hours ago

Banking is different in that the banks swap in and out. The 8086 segments were all available at the same time once you loaded the segment register, and they overlapped.

Banking was one solution to the 1MB limit; was it extended or expanded mode? I can no longer remember, but one of those gave you a 64kb window somewhere above the 640kb limit in the address space not used by either video RAM or BIOS. That window could then be paged around the rest of memory.

trollbridge an hour ago

Expanded memory is identical to banking. It wasn’t particularly popular since it’s a pain to program and compilers never got around to automatically generating code for it.

RiverCrochet 4 hours ago

On the NES, most mappers would control which 16Kbyte blocks from the PRG ROM appeared in the upper ($C000) or lower ($8000) block of the NES ROM space. Often the upper block was fixed because of IRQ/RES/NMI vectors. I think later mappers allowed 8K blocks. So you only had those fixed windows at those fixed granularities, not the 16-byte granular sliding window 8086 offered.
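As a rough illustration of the difference, here is a C sketch of the kind of fixed-window banking described above. It models a hypothetical mapper with two 16 KB windows and is not the register layout of any particular real mapper; `mapper_state`, `prg_rom_offset` and `NOT_PRG_ROM` are invented names.

```c
#include <stdint.h>

#define BANK_BYTES  0x4000u      /* 16 KB per switchable block */
#define NOT_PRG_ROM UINT32_MAX   /* sentinel: address is below $8000 */

/* Hypothetical mapper state: which 16 KB PRG ROM bank each window shows. */
typedef struct {
    uint8_t lower_bank;  /* bank visible at $8000-$BFFF */
    uint8_t upper_bank;  /* bank visible at $C000-$FFFF (often fixed) */
} mapper_state;

/* Translate a CPU address into an offset inside the PRG ROM image. */
uint32_t prg_rom_offset(const mapper_state *m, uint16_t cpu_addr)
{
    if (cpu_addr >= 0xC000u)
        return (uint32_t)m->upper_bank * BANK_BYTES + (cpu_addr - 0xC000u);
    if (cpu_addr >= 0x8000u)
        return (uint32_t)m->lower_bank * BANK_BYTES + (cpu_addr - 0x8000u);
    return NOT_PRG_ROM;
}
```

The window positions are baked into the address decoding and the bank number only moves things in 16 KB steps, whereas an 8086 segment register can slide its window in 16-byte steps.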

I don't know about DMG/GBC/GBA games. Some very interesting stuff happened on those platforms (e.g. Game Boy Camera, and some game that lets you control a sewing machine in Japan?) and I bet a pure sliding window mapper exists.

The PC Engine/Turbografx-16 had platform support for mapping (specific CPU instructions did it), but it was 8 fixed windows in the CPU's 64K address space that pointed to 8K size offsets in the ROM, I believe. SNES had a 24-bit address space and DMA to copy things to VRAM, so I'm not sure mappers were really a thing on that platform.

flohofwoe 7 hours ago

It's pretty much the same thing, except that all the memory mapping logic has moved from 'custom memory mapping hardware' into the CPU.

rzzzt 5 hours ago

Banking also appeared on the platform in the form of EMS.

M95D 3 hours ago

They could have used 16 bit segments with no overlap. It would have a 16 bit offset register + a 16 bit segment selector register with the top 12 bits reserved (always 0). 16 bit software would run as usual in a single segment, while larger programs would use both registers for 20 bit addresses.

286 could then use the next 4 bits from the segment register to allow 16 MB address space and 386 could use all of them for 4GB. And wouldn't it be nice if 386 had 64KB pages (1 segment)?
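A quick C sketch of what that alternative would look like, as I read the proposal. This is a hypothetical design, not anything Intel shipped; `nonoverlap_address` and the `segment_bits` parameter are my own invention, and masking off the reserved bits (rather than faulting on them) is an assumption.

```c
#include <stdint.h>

/* Hypothetical non-overlapping scheme: the segment register supplies the
 * address bits above the 16-bit offset. `segment_bits` says how many of
 * its low bits the CPU generation honours: 4 for a 20-bit address space,
 * 8 for 24 bits, 16 for 32 bits. Reserved upper bits are masked off here. */
uint32_t nonoverlap_address(uint16_t segment, uint16_t offset,
                            unsigned segment_bits)
{
    uint32_t mask = (segment_bits >= 16) ? 0xFFFFu
                                         : ((1u << segment_bits) - 1u);
    return (((uint32_t)segment & mask) << 16) | offset;
}
```

With this layout every byte has exactly one segment:offset name, so pointer comparison is trivial, but a segment can only start on a 64 KB boundary.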

bonzini 3 hours ago

That wouldn't have worked, the point was to pack data in memory. Even on 64kb computers, MS-DOS 1.x loaded .COM files at the bottom of available memory and allowed using the "familiar" CALL 5 interface even if the program was not loaded at physical address 0x100 (which is part of the interrupt table on x86). MS-DOS 2.x augmented that with TSR (terminate and stay resident) programs that could relocate themselves to use the minimal amount of memory at 16-byte offsets.

The 68000 was a complete break so it opted for relocatable code (which also needed more registers, and in fact the 68k had 16 instead of 8).

billpg 3 hours ago

I kinda wish they had. 64k windows with no overlap would make segment registers a slightly inconvenient 32-bit address register.

I get why they didn't. Someone might want to run two processes each with its own segment, but the whole machine might only have 64k in total.

trollbridge 3 hours ago

Did anyone else find the AI written style of this offputting?

The original 20 bit vision of the 8086 was when memory was very expensive and they expected typical high end machines to have 128K of memory.

Intel’s assembler was designed so you could have up to 128K of code with a “shared” segment in the middle that either side could reach with near (16 bit only) pointers to call commonly shared routines, and more rarely executed code existed on either end.

In addition data could be its own segment, and/or memory mapped I/O outside of the 128K space.

But memory got so cheap that nobody bothered with this, and the performance gains of writing code that way weren't worth the effort. X86 code was compact enough most programs could cram their code into 64k anyway, or 64k per functional unit with calls between them being rare.

The real tragedy is they went for 20 bit instead of 24 bit. 8086 with 16MB of addressable space would have been a very different world and would have made little difference in their use. (Paragraphs would have been 256 bytes, the same size as a page; most data structures would have been fine with that.)

billpg 3 hours ago

Hi. I wrote it, and I'm a human. (Or at least I think I am.)

I did use an AI for spell-checking, punctuation, generally making it flow, but it's all my text.

You think a machine is going to come up with "near pointers, far pointers, wherever-you-are pointers"?

chowells 2 hours ago

"generally make it flow" is exactly the problem. It's a process of smoothing over any interesting features of the text to replace them with plastic. It's submerging the actual information you wish to convey under a layer of low-entropy noise. The whole signal may still be there, but having to find it under a uniform glossy finish is work for the reader. It's work you didn't need to delegate to the reader.

LLMs generate low-entropy text. That's their entire purpose. But good writing isn't about being as low-entropy as possible. It's about producing peaks and valleys. As a person who's been participating in human-to-human communication your entire life, you probably have a pretty well-developed sense of how to structure the flow of a piece of communication. The small arcs with their ebbs and flows of tension and density provide the reader a rough surface that gives them enough traction to easily move from point to point. Don't let an LLM smooth out all the gaps. It makes it hard for a reader to keep their footing in the text.

saulpw 2 hours ago

trollbridge an hour ago

Thanks for responding. The problem is:

- the “make it flow” made it flow in an AI generated way like short paragraphs that are one short sentence.

- I now have to decide if this is entirely AI generated and thus not worth my time reading or not.

- I would prefer to just interact with you as a real person; your writing doesn’t have to be perfect for what you write to be worth reading.

billpg an hour ago

tliltocatl 2 days ago

It might have worked better if x86 had general-purpose registers where every register could work as a segment. Or maybe just many more segment registers. But with only two data segment registers to play with and quite cumbersome (and slow!) loads, most software just chose not to bother.

senfiaj 7 hours ago

For its time it was a decent idea. Software was smaller and simpler. But today (and even before 64-bit) software is larger, more complex, we also need memory protection / isolation and more flexible memory allocation / sharing, so paging memory was not introduced for nothing.

flohofwoe 7 hours ago

> we also need memory protection / isolation

I seem to remember that memory segments came with a permission system (read-only, read/write, execute) in 'protected mode'. Probably only added in the 286 though (I was always more of an m68k guy at that time).

senfiaj 7 hours ago

Maybe (I think it's possible in protected mode), but it still has an allocation problem. Imagine there are programs A, B, C in memory. Later, A and C are unloaded, leaving 2 free holes, totaling 2MB. Now you want to load a 2MB program, but there is no unfragmented 2MB free block. The only solution I see is to shift some loaded programs, which might be slow and even risky. Paging makes this problem much easier. Also, paging makes permissions and memory sharing more granular.

mschaef 6 hours ago

pjc50 4 hours ago

I was under the impression that the permissions only kicked in once you were in 32-bit mode on the 386, what Windows called "386 Enhanced" mode.

pif 4 hours ago

> Need more than 64KB? Allocate two blocks.

How is that compatible with an array and a simple implementation of the index operator?

billpg 43 minutes ago

I once developed for PC-GEOS, which wanted all memory in exactly 8K sized blocks. I wrote a set of C macros that presented an array-of-arrays as a single collection by using mod/divide operations on the index.
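Something along these lines, reconstructed as a sketch rather than the original macros (which I am not reproducing here). The names `chunked_array`, `chunked_init`, `chunked_at` and `chunked_free` are invented for the example, and plain `calloc` stands in for whatever block allocator the platform provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_BYTES     8192u
#define ITEMS_PER_BLOCK (BLOCK_BYTES / sizeof(int32_t))

/* An array-of-arrays: every inner block is exactly 8K. */
typedef struct {
    int32_t **blocks;
    size_t    block_count;
} chunked_array;

void chunked_free(chunked_array *a)
{
    for (size_t i = 0; i < a->block_count; i++)
        free(a->blocks[i]);          /* free(NULL) is a no-op */
    free(a->blocks);
    a->blocks = NULL;
    a->block_count = 0;
}

/* Allocate zeroed blocks for `items` elements (items > 0).
 * Returns 0 on success, -1 on allocation failure. */
int chunked_init(chunked_array *a, size_t items)
{
    a->block_count = (items + ITEMS_PER_BLOCK - 1) / ITEMS_PER_BLOCK;
    a->blocks = calloc(a->block_count, sizeof *a->blocks);
    if (a->blocks == NULL)
        return -1;
    for (size_t i = 0; i < a->block_count; i++) {
        a->blocks[i] = calloc(1, BLOCK_BYTES);
        if (a->blocks[i] == NULL) {
            chunked_free(a);
            return -1;
        }
    }
    return 0;
}

/* The index operator: divide picks the block, modulo picks the slot. */
int32_t *chunked_at(chunked_array *a, size_t index)
{
    return &a->blocks[index / ITEMS_PER_BLOCK][index % ITEMS_PER_BLOCK];
}
```

Callers write `*chunked_at(&a, i) = value;` and never see the block boundaries. The price is a divide and a modulo (or a shift and a mask, since the block size is a power of two) on every access.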

pjc50 4 hours ago

It isn't.

This was a problem.

trollbridge an hour ago

Not really. Compilers had huge pointers. They just were slower since they had to do 32 bit math.
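Roughly what that extra math looks like, as a C sketch. `huge_ptr`, `far_add` and `huge_add` are illustrative names and this is not the code any specific compiler emitted; it just contrasts offset-only arithmetic with arithmetic that carries into the segment and renormalizes.

```c
#include <stdint.h>

typedef struct {
    uint16_t seg;
    uint16_t off;
} huge_ptr;

/* "Far" style arithmetic: only the 16-bit offset changes, so it silently
 * wraps around to the start of the same segment at 64 KB. */
huge_ptr far_add(huge_ptr p, uint16_t delta)
{
    p.off = (uint16_t)(p.off + delta);
    return p;
}

/* "Huge" style arithmetic: go through the 20-bit linear address and
 * renormalize so the offset ends up in 0-15. This is the slower
 * 32-bit math. */
huge_ptr huge_add(huge_ptr p, uint32_t delta)
{
    uint32_t linear = (((uint32_t)p.seg << 4) + p.off + delta) & 0xFFFFFu;
    huge_ptr r = { (uint16_t)(linear >> 4), (uint16_t)(linear & 0xFu) };
    return r;
}
```

Starting from 1000:FFFF, adding 1 with `far_add` gives 1000:0000 (back at the start of the segment), while `huge_add` gives 2000:0000, the byte that actually follows.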

RagnarD 7 hours ago

I had to use it to do image processing on a 256MB image buffer back in the 1980s in assembly language. It was absolutely hideous. Give me a flat 32 bit memory address space any day (e.g. MC68000 around the same time.)

mschaef 6 hours ago

> I had to use it to do image processing on a 256MB image buffer back in the 1980s in assembly language.... Give me a flat 32 bit memory address space any day (e.g. MC68000 around the same time.)

Huh?

There were no segmented x86 machines capable of addressing 256MB of RAM, aside from the 386 (maybe).

If you had a 386 and the $130K of memory your statement implies, you probably also could afford a Unix (or something else) license to get to that 32-bit address space. (If you weren't doing it all in memory, then you're depending on paging stuff out to disk, implying you either have a real OS or a flat memory model isn't enough to save you, since you're manually having to page stuff to disk and back anyway.)

That's a super strange scenario you're describing.

sumtechguy 3 hours ago

Probably talking about swapping it in from some external datastore. These days you would open the file and dump it into a single buffer and rip across it, and not even really stress about it. Even 256 meg of hard drive. That would have been impressively expensive in the 80s.

Back then you had to chunk it out and fiddle with the offsets. Even then you still would have had to manage loading out the next chunk.

If my memory is right 1MB of memory in the early 90s was like 200-300 per meg. Would have to dig up a computer shopper and look.

mschaef 2 hours ago

porridgeraisin 6 hours ago

Perhaps they meant KB?

mschaef 5 hours ago

justincormack 7 hours ago

Wasm with multiple linear memories is basically segmented memory. It's a great security model.

PunchyHamster 6 hours ago

It made the fundamental mistake of starting with a 32-bit memory model.

justincormack 41 minutes ago

You can use multiple segments (now, you couldn't originally)

trollbridge an hour ago

God help us all when a webpage needs > 4GB.

tlb an hour ago

actionfromafar 6 hours ago

It puts some drag on bloat, I quite like it.

b800h 7 hours ago

I quite enjoyed using the memory segments - I thought they were quite intuitive and helped in reasoning about the machine.

peterfirefly 6 hours ago

Could have been fixed with an ADC-type instruction that operated on segments.

Imagine if you could have done something like this:

   add  si, some-delta
   adsc es, 0

in order to move a seg:ofs ptr forward by 'some-delta' bytes.

ADSC (add with segment carry) would do:

   segreg := segreg + imm + 1000h (if carry)

or:

   segreg := segreg + imm  (no carry)

Maybe there should also have been an instruction to normalize a seg:ofs ptr (so the new offset was in the 0-15 range).

ADSC could have been adapted for the 286 with ease, as long as a specific layout of the segment descriptor tables was mandated (probably with 10h instead of 1000h in protected mode).

Edited slightly for clarity (ofs => imm). A normalizing instruction would be harder to do right for the 286 because you don't want to spend too many slots in the descriptor table(s) for a single memory object.
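For anyone who wants to play with the idea, here is a C emulation of the two-instruction sequence in real-mode terms. ADSC is the imaginary instruction proposed above, not a real opcode; `seg_ofs` and `advance` are names made up for this sketch.

```c
#include <stdint.h>

typedef struct {
    uint16_t es;  /* segment half of the pointer */
    uint16_t si;  /* offset half of the pointer  */
} seg_ofs;

/* Emulates:   add  si, delta
 *             adsc es, imm
 * where the imaginary ADSC adds imm, plus 1000h if the ADD carried. */
seg_ofs advance(seg_ofs p, uint16_t delta, uint16_t imm)
{
    uint32_t sum = (uint32_t)p.si + delta;   /* add si, delta          */
    int carry = sum > 0xFFFFu;               /* carry flag from the ADD */
    p.si = (uint16_t)sum;
    p.es = (uint16_t)(p.es + imm + (carry ? 0x1000u : 0u));
    return p;
}
```

Moving 2000:FFF0 forward by 20h gives 3000:0010. The linear address goes from 2FFF0h to 30010h, which is exactly 20h further on, because a carry out of the offset is worth 10000h bytes, or 1000h in segment units.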

musicale 2 days ago

Google's native client (NaCl) even used it on 32-bit x86...

Segmented memory (on hardware that supported segment permissions) was used to good effect in Multics as well.

gpderetta 7 hours ago

x86 32 bit protected mode segments were a very different beast.

PunchyHamster 6 hours ago

Author comes from some weird assumption that software is some annoying byproduct of making hardware, rather than a fact that the hardware is made to run software and making it easier is a goal.

It was just a hack. Hack to delay migration to 32 bit architecture. Effective one, but hack nonetheless

NoGravitas 4 hours ago

I think it's more like marking a transition in how we thought about software.

When I was learning C, we did things at a reasonably low level. I was learning data structures, and building things like binary trees out of things like structs, and the structs were fixed-sized memory blocks holding pointers to regions of memory which were either more structs or data fields. All reasonable stuff. But we weren't writing for a particular machine. We were writing for the idea of a machine, and part of that idea was that the machine had a flat memory model. This really struck me when I compiled my homework (parse some data into a tree) on the departmental SunOS server, and it worked fine, and then took it home and compiled it with Borland C for DOS on my 386 and it segfaulted on the same data. That was when I learned to hate segmented memory, but looking back, it seems to me that I learned the wrong lesson.

I learned to write software for a lowish-level model of an idealized computer. The generation before me was always writing software for a specific computer, consisting of a specific set of hardware. The software was always the goal, but the nature of the task was defined by the hardware. Things like memory segmentation were facts about the hardware, and the available hardware varied widely at the time in a way modern hardware doesn't, really, except maybe in the embedded space.

deftio 3 hours ago

Wow.... I remember writing 8086 assembly on MASM and another assembler I've forgotten the name of, and then also doing inline ASM in Turbo C++

The segment thing and the convoluted different pointer math caused real gymnastics if you ever had data bigger than 64k... such as images.

I always thought of the segments as windows of 64k but moving between those windows, esp with the limited register set, required some real mental gymnastics.

trollbridge 34 minutes ago

Contemporary hardware rarely had images over 64k and memory bandwidth at the time made them a laughable concept.

waynecochran 3 hours ago

I blame 8086 segmented memory and the rest of its horrid architecture for why no one liked programming in assembly language. There were other elegant RISC machines with flat memory models and large general register sets that were a complete joy to program. Memory paging allowed you to do everything you needed to do that segmented memory provided and left the programmer unbothered for the most part.

deftio 3 hours ago

Totally agree.

raverbashing 2 days ago

No

No, it wasn't

It's the "great idea" that sounds great 5 min in and horrible 10min afterwards

You know, kinda like using null as a string end character

But more importantly it kept the x86 world for too long in that dead end that was 8086 mode programming

"Oh if developers would just..." They won't. They haven't. And they will not ever.

In hindsight maybe a binary level translator from 8080 to 8086 would have worked better (and be simple enough)

mschaef 8 hours ago

The 8086 was a stopgap measure to accommodate the fact the iAPX432 was in the middle of turning into the disaster it did. Given the engineering resources and timelines involved in the 8086, it wasn't a bad compromise approach.

> But more importantly it kept the x86 world for too long in that dead end that was 8086 mode programming
>
> "Oh if developers would just..." They won't. They haven't. And they will not ever.

8086 real mode programming in the mainstream lasted from 1981 until 1991 or so. The last 35 years have 32-bit (and later 64-bit) flat model addressing with pages for the most part. Seems like a reasonable transition period, really.

> In hindsight maybe a binary level translator from 8080 to 8086 would have worked better (and be simple enough)

Part of the reason they liked the segmented model is that it was possible to set the segments to the same value and then ignore them entirely. That gave a programming model for the 8086 that was sufficiently close to the 8080 that it was possible to use a sort of cross assembler to do something like what you suggest. You could then opt into 8086 specific instructions and segmentation as you needed. (Which took a few years... the first IBM PC's shipped with as little as 16K of RAM.)

billpg 2 days ago

Indeed, I say as much at the end.

But what should Intel have done? They needed a CPU that can run 8080 code but with more memory. Also it's the year ~1980 and we're limited to the technology of the age.

A system with 64k sized windows seems unavoidable.

If you extend the size of the address registers, 8080 code will only run in the first 64k, or require some kind of current window register.

An 8080 mode might have worked but that would have been expensive.

flohofwoe 8 hours ago

> Also it's the year ~1980 and we're limited to the technology of the age.

Tbf the Motorola 68000 which was released around the same time (1979) had a proper linear address space with 32-bit address registers (of which 24 bits were wired up).

Also the 8086 was intended as a cheap and temporary stop gap until Intel's "proper" 32-bit CPU architecture was ready for prime time (the doomed iAPX 432).

smallstepforman 7 hours ago

nwallin 26 minutes ago

flohofwoe 8 hours ago

It's not all that different from the memory pages we have today, except that the 'segment' addressing has become a lot more complex under the hood (multi-level page tables) and a lot simpler at the surface (by merging the 'segment-' and page-address bits into a single virtual address).

PS: and segmented memory wasn't all that different from the memory banking used before in 8-bit home computers to address more than 64 KBytes, except that the memory mapping hardware was implemented outside the CPU.

amiga386 7 hours ago

> It's not all that different from the memory pages we have today

An MMU gives you a flat addressing model. There is no comparison. 8086 segments are rigidly locked to a 64KB window that moves forward in memory 16 bytes for every increment of the segment value (so segmented address 1234:5678 is linear address $12340 + $5678 = $179B8).

It didn't do this to offer a useful feature like an MMU. It did this to allow code that doesn't know segment registers exist to think they're still running on an 8-bit Z80. What a waste of potential. The 68000 didn't pretend to be a 6502.

The 80286 introduced protected mode with "segment descriptors", but this is well after MMUs existed on other CPUs, it didn't invent virtual memory. Only the 80386 offered a 32-bit flat memory model.

If you want to see something to make you weep, look at the MS-DOS version of unzip. It has to do all kinds of crazy, just to allocate 64KB of RAM and get all 64KB, not 8 bytes less. And it's still locked into a memory access model that will not let it ever address more than 64KB of any one object. It's why MS-DOS was viewed as a toy OS for a toy computer.

    #if defined(__TURBOC__) && !defined(OS2)
    #include <alloc.h>
    /* Turbo C malloc() does not allow dynamic allocation of 64K bytes
     * and farmalloc(64K) returns a pointer with an offset of 8, so we
     * must fix the pointer. Warning: the pointer must be put back to its
     * original form in order to free it, use zcfree().
     */

    ...

    static ptr_table table[MAX_PTR];
    /* This table is used to remember the original form of pointers
     * to large buffers (64K). Such pointers are normalized with a zero offset.
     * Since MSDOS is not a preemptive multitasking OS, this table is not
     * protected from concurrent access. This hack doesn't work anyway on
     * a protected system like OS/2. Use Microsoft C instead.
     */

rep_lodsb 7 hours ago

torusle 2 days ago

> In hindsight maybe a binary level translator from 8080 to 8086 would have worked better (and be simple enough)

Many programs written in assembly language used self modifying code back then. It saved RAM and improved performance. All programs that used such trickery would have been broken by a binary translator.

wewewedxfgdf 7 hours ago

Don't know why you're being voted down - you're correct - segmented memory was an awful nasty complex way to program and the industry was eager to see the backside of it.

Why would someone be popping up in 2026 saying it was awesome? Weird.

noitemtoshow 3 hours ago

Nope. It was bad. It left computers in the 286/386 eras with RAM above 1MB sitting there and doing nothing. It took years to transition to DOS/4G and then finally to a 32-bit OS, Windows 95.

j16sdiz 6 hours ago

> What we needed, in hindsight, was to treat segments as true selectors — opaque handles with no arithmetic meaning. If you can’t assume the next segment is 16 bytes ahead, you’re forced to use segmentation as intended.

Except we couldn't. If we made each segment isolated from the others, we would waste so much memory, because memory would be allocated in whole segments.

If we made each segment dynamic, we need something to manage them.

This "hindsight" is just a MMU in disguise.