Beyond Brown

When brown just isn't enough

Anatomy of a Compofylla

or: “new” tricks on old machines

Ika I Rutan

Ika I Troduction

We (as in: Newline) are known to release the odd compofiller. Sometimes it’s just old stuff we had lying around that we throw into some Gravedigger Compo for “historical interest”; sometimes it’s even something not too shabby. Apparently there is the rare occasion where one can be lucky enough to tick both checkboxes, even nowadays. This is the story of one of those compofillers - “Ika I Compofylla”.

Ika I Whåt?

So, what happened and why did I summon you all to read this? Well, not much, apart from the demoparty Sommarhack 2024 in Sweden, which was as awesome as ever. A small, rural, lakeside hut filled to the brim with the legends that are Omega, 2-Life-Crew, Ghost, Sync, Avena, Aggression, New Beat, Reservoir Gods, DHS, Nature, PHF, KüA, SFMX, Dekadence, Escape, T.O.Y.S. and a lot more that I forgot.

Attending such an invitation-only event is truly like walking amongst demoscene gods. The hours of my child- and nowadays even adulthood that these guys filled with awe are too many to count. But, for some odd reason people still refuse to explain to me, I got the chance to attend a Sommarhack party in 2022.

Oh boy, did I want to go there, release a big demo and… here’s where the screeching halt comes in. I did attend the party, it was even more awesome than imagined - but I showed up empty-handed. Perfectionism. Time. Reasons. I have kept up that stance for two years now, and people started to whisper behind my back and point at me, the newcomer to this party, daring to not even release something.[1]

But year after year, when entering the Sommarhack château, the hand would remain empty, only greeting.[2]

Until 2024. I wanted to do it. Release. A good thing. Artistic. Surprising. Nice and… again, the screeching halt: time flew by, and all the pieces I had lying around dared to not fit in one way or the other. So I did the next best thing. I grabbed two very fundamental things that were never done before and that I had welded together just a few years ago, both being things that I was sitting on for far too long anyway[3], and released them as one. And of course, it’s for the good, old, plain 1MB ST (which means it does not need STE, nor screen address change per line or scrolling per line) - everything else is considered cheating[4], after all.

Real hardware, real blurriness

Ika I Histöry

Those two things were a busdisplay trick and a rotator. Both were pretty old (my git says the rotator is 8 years old now - the initial busdisplay tests date back to 1991 if I remember correctly, might’ve been 1992). I sent a prototype of this amalgamation to people in 2020, and it even worked there, so I was pretty sure that I wasn’t looking at another case of “works on my machine, but only on my machine”™ [5] [6]

Ika I Tacknology

But let’s concentrate on the busdisplay here. Why should you use it? What’s the ruckus? Where’s the dough?

It’s not bus noise

The underlying (and erroneously called) “bus noise” is well known by many people, but to my knowledge nobody has tried to control it in such a way that we can do write-free effects. It was a parlor trick, up until now.

Imagine you could lavish bitmap data onto your demo-loving audience just by reading it and without ever writing to any screen RAM. We would get rid of that pesky write cycle we always had to do to update the screen memory, and we’d gain a lot of cycles that way. Does this sound too good to be true? Yes. Yes, and in a way it is.

You have to pay eventually

There is a caveat, one that Atari ST coders basically have for lunch and would not even bat an eyelid at if mentioned at a cocktail party: you have to race the beam. Synchronise. Hardsync. You know, the ugly stuff, that only real machines force you to do.

The other caveat is that you lose bitplanes: not only a few, but half of them.

And you only have 16 clockcycles to compute 16 pixels.

This amounts to a lot of restrictions, which make this screenmode a bit… unwieldy.

Greener pastures

If you haven’t run for the shore yet, then please let me congratulate you at this point - from here on out it’s only benefits and roses. At least benefits, we’re still working on the roses.

But how does one show data that the CPU[7] just innocently reads?

Kill it with hardware

It’s easy: kill the RAM. No, really. The RAM is the only stumbling block between us painstakingly reading data and the actual screen display. Basically, the CPU puts our data onto the bus, neat and tidy, and then a rude RAM read comes along (usually triggered by a LOAD signal from the shifter every 16 pixels in the display area) and just obtrudes itself onto the bus so the shifter can display it.

Since we can’t really kill the RAM (we unfortunately need it to store our program), how about we just go where there is no RAM? At, let’s say, $100000 on a 1MB machine for example? Or $400000 on a 4MB machine? That would be in an area where no RAM is installed, right?

Let’s just say “yes” for now, we will see why it’s not that easy later, but for our mental model this will suffice.

The inner works

Let’s step back a bit and have a look at how the shifter gets all those wonderful pixels we all stare at all day.

The shifter itself is basically a slave of its surroundings. It does not do much by itself other than ingest, ingest, ingest and spit out wonderful colours. When the shifter needs more data to ingest, it issues a LOAD signal, which in turn is connected to the MMU [NOTE: Ijor just pointed out that actually the MMU issues the LOAD signal, which is of course correct - fortunately this does not change our mental model, so I just keep this here as a warning for later generations and for “learning purposes” - you are welcome]. When the MMU gets the LOAD signal, its sole job is to prepare the system bus with the next word that the shifter should see.

To do that, it tracks the screen address and basically counts every LOAD pulse and increments that address. It uses that address to finally decode which RAM cell(s) are needed to fulfill the shifter’s request and connects that RAM to the main bus, so the shifter can read it. (I won’t go into how those RAM cells get activated/selected, it’s not of any concern here)

Stale is the new fresh

But if the screen address is in an uncharted area, there is no RAM the MMU knows of - so it does not connect anything to the bus.

Instead, the poor shifter only gets stale data.

And this is where we come in.

That stale data that the poor shifter has to ingest is our code, or at least it can be if we disable all interrupts and do a hardsync.

Opcode aesthetics

But unfortunately code looks ugly - at least to the shifter and our audience, who are only used to those beautiful pixels people like Jade and mOdmate used to feed it. Opcodes are bitpatterns that do not really gratify our aesthetic needs[8]. Seemingly we’re at the end of some strange journey.

But… wait, how about crafting code that does leave room for aesthetics? That is evenly sized, so we can control what we feed to the shifter?

Meet move.l -$3502(A0),D0

From an opcode perspective, this looks like the mnemonic we have all typed at some point. But from the CPU’s perspective, it looks a little different:

$2028 $CAFE

The first word is the opcode the CPU fetches to decode and finally execute (move.l n(A0),D0); the second word is the address register offset (-$3502, which is $CAFE when assembled) that it needs to read to have a fully qualified opcode to execute. The interesting thing here is that the CPU reads one word every 4 clockcycles. We’ll see in a bit why this is important.

Now let’s have a look at the bus (and ultimately the shifter’s) perspective on the above opcode when executed:

$2028 $CAFE $AAAA $BBBB

We see two more words on the bus than we see in the opcode representation in RAM; these two words at the end are the actual workload the CPU reads into D0 - we just assume that -$3502(A0) would point to the longword $AAAABBBB for argument’s sake. Again, the CPU needs what I call a “RAM cycle” to read each word from the data bus, which takes, again, 4 cycles per word.
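To make the word sequence above reproducible, here is a small Python model of how that instruction assembles and what ends up on the bus. It is only a sketch: the function names are made up for this writeup, and the payload $AAAABBBB is the same assumed longword as in the text.

```python
def encode_move_l_d16_an_dn(disp, an, dn):
    """Return (opcode, extension word) for move.l disp(An),Dn.

    Hypothetical helper, not part of the original demo source.
    """
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("displacement must fit in a signed 16-bit word")
    if not (0 <= an <= 7 and 0 <= dn <= 7):
        raise ValueError("register numbers must be 0..7")
    # MOVE.L layout: 0010 | dest reg | dest mode | src mode | src reg
    # dest mode 000 = Dn, src mode 101 = d16(An)
    opcode = (0b0010 << 12) | (dn << 9) | (0b000 << 6) | (0b101 << 3) | an
    extension = disp & 0xFFFF  # displacement as a two's complement word
    return opcode, extension


def bus_words(disp, an, dn, payload):
    """Words seen on the bus: opcode, displacement, then the longword read."""
    opcode, extension = encode_move_l_d16_an_dn(disp, an, dn)
    return [opcode, extension, (payload >> 16) & 0xFFFF, payload & 0xFFFF]


words = bus_words(-0x3502, an=0, dn=0, payload=0xAAAABBBB)
print(" ".join(f"${w:04X}" for w in words))  # $2028 $CAFE $AAAA $BBBB
```

Changing the register numbers or the offset changes the first two words, which is exactly the "garbage" that later has to be hidden.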

Like clockwork

Let’s just add the respective cycles (relative to our synchronization to the left display start) to each word for clarity:

cycle00 CPU read: $2028
cycle04 CPU read: $CAFE
cycle08 CPU read: $AAAA
cycle12 CPU read: $BBBB
cycle16 CPU read: done (aka next move.l starts)

Now the shifter comes in: the shifter expects to be fed data every 4 cycles (it issues the LOAD signal, see above) during a display period of a scanline. And that’s exactly what that opcode does; the LOAD basically gets ignored by the MMU, no RAM gets connected and our nicely crafted opcodes end up being shown on the screen. Splendid!

Let’s see how the data we put on the bus is ingested by the shifter:

cycle00, plane 1: $2028
cycle04, plane 2: $CAFE
cycle08, plane 3: $AAAA
cycle12, plane 4: $BBBB
cycle16, plane 1: $2028 (next opcode)
...
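The same table can be generated with a few lines of Python, which also makes the per-scanline budget visible. This is a sketch of the simplified mental model used here (one bus word every 4 cycles, four words per move.l), not of the real MMU/shifter timing; all names are invented for illustration, and the 320 pixel width is the standard ST low resolution without overscan.

```python
CYCLES_PER_WORD = 4     # one bus word every 4 cycles, as in the model above
PLANES = 4              # ST low resolution uses 4 bitplanes
PIXELS_PER_GROUP = 16   # one word per plane covers 16 pixels
LINE_WIDTH = 320        # low resolution, no overscan


def shifter_view(instructions):
    """Yield (cycle, plane, word) for a list of 4-word bus sequences."""
    cycle = 0
    for words in instructions:
        if len(words) != PLANES:
            raise ValueError("each move.l must put exactly 4 words on the bus")
        for plane, word in enumerate(words, start=1):
            yield cycle, plane, word
            cycle += CYCLES_PER_WORD


# Two identical move.l instructions, as in the listing above.
line = [[0x2028, 0xCAFE, 0xAAAA, 0xBBBB]] * 2
for cycle, plane, word in shifter_view(line):
    print(f"cycle{cycle:02d}, plane {plane}: ${word:04X}")

# Budget for one full scanline of this mode.
moves_per_line = LINE_WIDTH // PIXELS_PER_GROUP
cycles_per_line = moves_per_line * PLANES * CYCLES_PER_WORD
print(moves_per_line, "move.l per line,", cycles_per_line, "cycles")
```

So a full-width line is 20 back-to-back move.l instructions, and whatever fills them has to be computed in those same 16 cycles per 16 pixels.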

Don’t be mean to your artists

Unfortunately our nicely handcrafted 2 plane gfx $AAAA $BBBB aren’t the only thing displayed on the screen - the opcode and offset words show up as well, in plane 1 and 2. We could of course ask our artist to incorporate them into his work, but as coders, we simply can’t be that cruel to our gfx colleagues, can we?

So you probably see what we have to do now, don’t you? Correct - we just blatantly lie to the audience instead of alienating our gfx people. We change the palette in such a way that the first two planes don’t interfere with anything that we want to display inside the second two planes. We basically switch them off, so that plane 1 and plane 2 can sport any garbage without any interference to plane 3 and 4.

There you have it. A sequence of move.l n(Am),Do on a scanline can show beautiful graphics just by reading it into a scratch register.
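The palette lie can be sketched in Python as follows. The idea is that each of the 4 colours selected by planes 3 and 4 is repeated for every combination of plane 1 and plane 2 bits. The helper names and the four example colours are assumptions for illustration; the values are plain ST-style $0RGB words.

```python
def masked_palette(four_colours):
    """Build a 16-entry palette that ignores the plane 1 and plane 2 bits.

    The colour index is plane1 | plane2<<1 | plane3<<2 | plane4<<3, so
    index >> 2 keeps only the bits coming from planes 3 and 4.
    """
    if len(four_colours) != 4:
        raise ValueError("exactly 4 colours (2 planes) are available")
    return [four_colours[index >> 2] for index in range(16)]


def pixel_index(p1, p2, p3, p4, bit):
    """Colour index of one pixel, given the four plane words and a bit number."""
    return (((p1 >> bit) & 1)
            | (((p2 >> bit) & 1) << 1)
            | (((p3 >> bit) & 1) << 2)
            | (((p4 >> bit) & 1) << 3))


colours = [0x000, 0x700, 0x070, 0x777]  # hypothetical 4-colour artwork palette
palette = masked_palette(colours)

# Opcode and offset land in planes 1 and 2, the gfx in planes 3 and 4.
p1, p2, p3, p4 = 0x2028, 0xCAFE, 0xAAAA, 0xBBBB
row = [palette[pixel_index(p1, p2, p3, p4, bit)] for bit in range(15, -1, -1)]
print(" ".join(f"${c:03X}" for c in row))
```

Whatever opcode and offset are used, the resulting row depends only on the two graphics words.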

If you’re still with me, I even prepared some (albeit very basic) sample code. And of course you can do the same using the blitter, which is a) pretty trivial, but b) a lot harder to actually get a benefit from. But I leave that as an exercise to the reader. I’m not ready to give up on our poor old CPU by meanly replacing it with blitter stuff just yet.

Now, please, go run with it. Make incredible stuff the ST has never seen.

Ika I Ågly Fåcts

Of course everything’s not always as easy as it sounds.

First ugly fact: the above would be enough to have a prototype working, but that would not work on all machines. Basically there are slight differences in non-RAM-area handling between the STe and IMP and non-IMP MMUs. So I consider it best practice to not use $400000 as a screen base address on 4MB machines, but rather reconfigure the MMU to a lower memory value. One would naively think that you could just switch off bank two and be done with it; but this would of course not work with IMP MMUs (basically those are later MMUs), which do not have independent bank 1 / 2 settings - the settings of bank one and two are forced to be identical. This is why I set the MMU to a 512k/512k setting if I detect an IMP MMU. On all others it’s fine (and a lot easier, I’d recommend it even) to just “disable” bank 2 (e.g. setting 2048k/128k) if 4MB is encountered. Then you can for example just set the screen address to $2f0000 and be done with it.
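The choice described above can be written down as a small Python sketch. Several things here are assumptions and not taken from the demo source: the bit layout is the commonly documented one for the memory configuration register at $FF8001 (bank 0 in bits 3-2, bank 1 in bits 1-0, with 00 = 128k, 01 = 512k, 10 = 2048k), the IMP detection itself is not covered and is just a boolean input, and RAM is treated as one contiguous block without any mirroring.

```python
BANK_BITS = {128: 0b00, 512: 0b01, 2048: 0b10}  # bank size in KB -> 2-bit code


def mmu_config(bank0_kb, bank1_kb):
    """Value for the memory configuration register (assumed layout, see above)."""
    return (BANK_BITS[bank0_kb] << 2) | BANK_BITS[bank1_kb]


def choose_banks(is_imp_mmu):
    """Bank sizes as described in the text for a 4MB machine.

    is_imp_mmu is a hypothetical flag; how to detect an IMP MMU is not shown.
    """
    return (512, 512) if is_imp_mmu else (2048, 128)


def is_outside_ram(address, banks):
    """True if the address lies above the configured RAM (simplified model)."""
    return address >= sum(banks) * 1024


for imp in (True, False):
    banks = choose_banks(imp)
    print(f"IMP={imp}: banks={banks}, config=${mmu_config(*banks):02X}")

print(is_outside_ram(0x2F0000, choose_banks(False)))
```

With the 2048k/128k setting the configured RAM ends at $220000, so a screen address of $2f0000 sits comfortably in the uncharted area.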

The second ugly fact is that opening borders is kinda… meh in this mode. Since everything we do with the CPU is very closely observed and pulled into the open (aka shown on the screen) by the shifter, especially the right border needs some special care. You might have noticed the jailbars between rotator and scroller; that’s the opcode for the right border switch you see there. Unfortunately I was dumb enough to add the overscan switches purely from memory on July 1st, 2024 (with my flight to Sweden going the next day), only to discover that they were borked and did not work on the compo machine (although they worked on all the other STes, STfs and MSTs I could test them on). So I had little time to do more clever things to better hide them, like a small scroller in there or something. That’s what you get when you think “oh, that’s easy, no need to copy-paste from the other prototype”. Yes, I’ll stop whining now.

The third ugly fact: with overscan you totally need line stabilizers on all systems. Fortunately I had (and still have) nops free per scanline to add them, but it’s a shame we still have to bother with this. Plus I’m probably holding it wrong. Troed to the rescue! [NOTE: actually I did a stabilizer-free version, but this would need a palette change every line (due to every second line being “skewed” by one word), thus making it impractical]

The fourth ugly fact: even without overscan, you need to hardsync your display code. But then again, that’s our (the ST sceners’) day job, practically.

Ika I Roto

I won’t dive too deep into the rotator used in this compofiller here, I might do that at a later time - if anyone is still interested, that is.

Let me just cover the basics: it’s a shear rotator, just like Mr. Pet/Sanity invented(?) in Roots 2.0 in 1995 on the Amiga and Axis/Oxyron recreated masterfully in Planet Rocklobster in 2016.

I was talking to Axis about his rotozoomer, and we both agreed that that technique would just not be possible on a plain ST. At all.

Which, as every ST demoscene coder can confirm, does not sit well with our “we’ll do it anyway” attitude. So I got the nagging feeling that it should be possible, started to come up with some ideas, did the work and had a 320x176 1x1 2plane 50fps rotozoomer working eventually, in real time, freely movable. 2plane was of course not 3- or even 4plane, but it was the fastest around by a large margin, so I was somehow satisfied with the result.

When searching for an effect that I could incorporate into the busdisplay prototype (which had only been showing a logo and test gfx for years) I took this rotozoomer and built a sibling rotator that would render in fixed time and with 16 clockcycles per 16 pixels (see the explanation above for why this is needed). Note that I very distinctly use “rotozoomer” and “rotator”. “Ika I Compofylla” does not use a rotozoomer, it’s just a (shear) rotator. That’s a very important distinction, because the seemingly zooming part is merely a side effect of the shearing used, which “shrinks” the gfx used at deterministic angles (as in: at 45 degrees the distance between the texture’s min and max y is the same as the texture’s original height). Everything that is shown on the screen is calculated in real time, textures can be changed, and it works on 1MB.

Ika I Compofylla

And that’s basically it. Fortunately Dubmood introduced GGN to Ika I Rutan, which in turn led to XiA introducing me to this old (but wonderful) Swedish children’s TV classic. We were talking about the irrevocable fact that the world would need Ika demos, like, a lot. When GGN offered me Dubmood’s skeleton dance cover when I was practically begging people for music, the theme to weave around the busdisplay rotator was set. And it did fit the Swedish party nicely!

Ika I Tack Yå

Thanks must go to everyone pushing me into doing a writeup, of course everybody at Sommarhack 2024 who understood what that entry in the compo was showing from the get go (you know who you are), GGN for helping with this writeup and keeping my prose at bay, same to XiA, Dubmood for the wonderful song and eventually Ika-ing us. Axis/Oxyron for our talks that made me try my hand at shear rotation so many years ago, and all the poor people that had to endure my dull technical prototypes (again: you know who you are).

Plus: GGN, XiA and Tat for proofreading / feedback.

Finally: Extra mention to XiA for his really witty translation of the “Ika I Rutan!” video linked above.

Ika I Ranta

Let’s close this with a personal rant that goes out to those people who can’t stand compofillers, screens, small intros and expect and respect nothing less than full demos, yada yada, you know who you are. It’s people like you that practically hold our stuff to ransom, that make people like us, the creators, keep stuff lying around just for the sake of the next demo. Since when has it ever been a bad thing to just release a single screen? Screens are our Atari ST demoscene heritage after all. No intro (intro to what?), no full demo with its long timeframe to create. Just screens. Since when do we have to feel remorse for actually releasing something? To everyone else: don’t be afraid of doing screens, I know I am not, any more.[9]

(v1.1 - added suggestions, added remark regarding MMU and LOAD, added contact info, added missing links)

(contact tIn at #atariscne, on the Atari Demoscene Discord or shoot a mail to tin åt absencehq.de)


  1. There are sources that tell me they even sniggered
  2. Well, tbf it was only 2022 and 2023, but it felt like a very long series of betrayals
  3. Which is unfortunately true for most of my proof of concept-stage things
  4. It isn’t of course - but my rejection of the STe used to go deep - I did not manage to own one bitd, and that was of course the STe’s fault, entirely. I don’t remember the exact reasoning behind this, but it was enough to keep me on plain ST apparently, even today.
  5. Got pretty badly burned by this in ‘89 when my oh-so-awesome 1-scanline syncscroll turned out to do exactly that - thank you, electronics, who’d have known one should know things about you!
  6. But I ran into a short episode of this anyways - Evil’s STe and I had some… slightly different views of things.
  7. It’s not only the CPU, but let’s not digress
  8. This is of course not true - every coder loves his code, on an aesthetic level, but he’ll never admit so
  9. Note that I’m of course not dissing demos here, they are a wonderful thing, and I will do another one, but this time I’ll keep releasing screens, too.