madeline: a line editing library for hare

a year and five days ago, i decided to write a line editor. i've been using it in my daily-driver shell for nearly that long now, and i'm quite pleased with where i ended up. let's walk through it!

my main goal for madeline was to guarantee that, as a baseline, line-mode hare programs in the wild would have a comfortable and consistent line-mode ui. what `getopt::` does for command-line interfaces, i wanted madeline to do for line-mode interfaces

rendering and unicode support

(caveat: it's been at least six months since i seriously thought through the logic that led me to what madeline does here today, so i might be misremembering some things)

madeline (currently) renders exclusively to a terminal, which means that the upper limit on its ability to support unicode is... limited, but i've done my best to make it work as well as it possibly can. this came with some significant challenges and caveats, and i'm still not sure i'm happy with where i ended up

the first issue is that we can't always be sure exactly how wide a string will be when rendered, where there'll be line breaks, or when the screen will scroll. some other line editors get around this by making assumptions about unicode rendering, but this leads to a brittle standoff: terminals can't fix their unicode handling because clients rely on the broken behavior, and clients can't fix theirs because terminals implement the broken behavior

to get around this, madeline (ab)uses the alternate screen buffer as a wcswidth() implementation. between renders, madeline stores how many rows below the start of the input the cursor currently is. don't worry if you don't understand everything here just yet, but the algorithm it uses to update the screen is:

- switch to the alternate screen and move the cursor back to the top-left corner

- write the rendered state, putting the escape sequence for saving the cursor position (\x1b7) in the place where the cursor will end up

- write the escape sequence for requesting the cursor position from the terminal (\x1b[6n)

- write the escape sequence for restoring the cursor position (\x1b8)

- write the escape sequence for requesting the cursor position from the terminal (\x1b[6n) again

- switch back to the normal screen

- move the cursor up X rows, where X is the stored cursor row

- store the cursor position after the \x1b8 to be used in the next render

- write Y newlines then move the cursor up Y rows, where Y is the cursor row reported after writing the rendered state and before restoring the cursor position

- re-write the rendered state, putting \x1b7 where the cursor goes and \x1b8 at the end

that's pretty complicated. let's break it down a bit and explain each step in detail

switch to the alternate screen and move the cursor to the top-left corner

the terminal provides two virtual screens, which you can swap between. this is how, for example, vim can draw on the whole screen and fully disappear once you close it. it switches to the alternate screen when you start it up, and it avoids ever touching the normal buffer, which is where e.g. the shell's scrollback is

madeline makes use of this to be able to render some stuff without messing with said scrollback. crucially, this allows us to move the cursor back up to the top of the screen, render some stuff there, and avoid having that mess with whatever history was originally up there

write everything before the cursor, then \x1b7, then everything after

the crucial insight here is that, now that we've moved the cursor up to the top-left corner, this is much less likely to cause the display to scroll, so we can accurately judge how tall everything is

to demonstrate this: imagine a five-column-wide, five-row-high screen, with characters guaranteed to all be exactly one column wide. our current state is five characters wide when rendered, and the user just pressed the "a" key, adding another column. we clear the current render, save the current position (5th row, 1st column), write all six characters out, and realize that the cursor is still on the 5th row. the row hasn't changed, so clearly we don't need to clear anything but the current row when we next update the state, right? wrong. the only reason the row didn't change is that everything scrolled up a row when we wrote past the bottom-right corner of the screen
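to make that example concrete, here's a toy model of the five-by-five screen. it's python rather than hare, purely for illustration, and `ToyScreen` is a made-up name, not anything from madeline. it assumes every character is one column wide and that the terminal wraps lazily (the wrap happens when the next character is written), which is how most terminals behave but is still an assumption:

```python
class ToyScreen:
    """a tiny terminal model: fixed size, every character one column wide."""

    def __init__(self, cols, rows, row=1, col=1):
        self.cols, self.rows = cols, rows
        self.row, self.col = row, col  # 1-indexed, like cursor position reports
        self.scrolled = 0              # rows that have scrolled off the top
        self.pending_wrap = False      # last column was filled, wrap on next write

    def write(self, text):
        for _ in text:
            if self.pending_wrap:
                self.pending_wrap = False
                self.col = 1
                if self.row == self.rows:
                    # already on the bottom row: the content scrolls up,
                    # but the cursor's row number stays the same
                    self.scrolled += 1
                else:
                    self.row += 1
            if self.col == self.cols:
                self.pending_wrap = True
            else:
                self.col += 1


# six characters written from the bottom row: the row number doesn't change
bottom = ToyScreen(cols=5, rows=5, row=5)
bottom.write("aaaaaa")
print(bottom.row, bottom.scrolled)

# the same six characters written from the top-left corner: no scrolling,
# so the row number really does tell us the height
top = ToyScreen(cols=5, rows=5)
top.write("aaaaaa")
print(top.row, top.scrolled)
```

the two runs write identical text, but only the one that starts at the top reports a row that reflects the real height.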

the only way to actually tell how tall something is, is to render it without any scrolling and measure the y position of the cursor at the end. this does mean that madeline's rendering breaks when the state is larger than the screen, which i'm tracking in this ticket:

input that's larger than the screen leads to ui bugs

the reason we store the cursor's position rather than immediately requesting it from the terminal is probably in order to deduplicate code, though this code is extremely fiddly and has been rewritten at least a dozen times, so there might also be some subtle bug it's fixing which took me five hours to find

request the cursor position, then restore it to the saved position, request the position again, then switch back to the normal screen

and now the reason we entered the alternate screen in the first place. i'll explain why we need the full height of the rendered state later, but we already know why we need the cursor's position: when we do the next render, we need to know how many rows to clear. once we've figured out all the information we need, we can go back to the normal screen and start actually rendering things
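as a rough sketch of the measuring pass, here's what the bytes sent to the terminal and the parsing of its replies could look like. this is python for illustration, not madeline's code. `measure_pass` and `parse_reports` are hypothetical names, and the `\x1b[?1049h` / `\x1b[?1049l` pair for the alternate screen is an assumption (the post doesn't say which sequence madeline uses, and terminals differ in how that mode interacts with the saved cursor). the reply format `\x1b[row;colR` is the standard answer to `\x1b[6n`:

```python
import re

ENTER_ALT = b"\x1b[?1049h"  # assumption: xterm-style alternate screen
LEAVE_ALT = b"\x1b[?1049l"
HOME = b"\x1b[H"            # move the cursor to the top-left corner
SAVE = b"\x1b7"             # save cursor position
RESTORE = b"\x1b8"          # restore cursor position
REQUEST_POS = b"\x1b[6n"    # ask the terminal where the cursor is


def measure_pass(before: bytes, after: bytes) -> bytes:
    """bytes for one measuring pass; `before`/`after` are the rendered
    state on either side of where the cursor should end up."""
    return (
        ENTER_ALT + HOME
        + before + SAVE + after
        + REQUEST_POS          # first reply: where the rendered state ends
        + RESTORE
        + REQUEST_POS          # second reply: where the cursor belongs
        + LEAVE_ALT
    )


CPR = re.compile(rb"\x1b\[(\d+);(\d+)R")


def parse_reports(reply: bytes):
    """return (row, col) for every cursor position report in `reply`."""
    return [(int(row), int(col)) for row, col in CPR.findall(reply)]


# example: a terminal that answered "row 2, col 2" and then "row 1, col 5"
print(parse_reports(b"\x1b[2;2R\x1b[1;5R"))
```

rows and columns in the replies are 1-indexed, so a caller would need to decide how to turn them into "rows below the start of the input".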

move the cursor up X rows, where X is the last render's cursor height, and store this render's cursor height to be used next time

i just described this lol. pretty straightforward, we're getting back to the place where we should start the current render

write Y newlines then move the cursor up Y rows, where Y is the full height of the rendered state

and now we get to the trickery. the reason we do this is to avoid any scrolling while doing the final render, which would break things for different reasons than it would break things earlier. joy!

re-write the rendered state, putting \x1b7 where the cursor goes and \x1b8 at the end

and this is the reason we can't have any scrolling. \x1b7 and \x1b8 save and restore the position of the cursor within the screen, which means that if the screen scrolls at all between them, the saved position won't actually line up with where the cursor wants to be
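the last three steps can be sketched as a single function that builds the bytes for the final render. again this is python for illustration, `final_render` is a hypothetical name, and the `\r\x1b[J` that clears the old render is an assumption of this sketch (the post doesn't spell out how the old render gets cleared):

```python
def final_render(before: bytes, after: bytes,
                 prev_cursor_row: int, height: int) -> bytes:
    """build the bytes for the final render on the normal screen.

    prev_cursor_row: rows the cursor sits below the start of the last render
    height: rows to reserve, as measured on the alternate screen
    """
    out = b""
    if prev_cursor_row > 0:
        # cursor up X rows; skipped for 0 because terminals generally
        # treat a count of 0 as 1
        out += b"\x1b[%dA" % prev_cursor_row
    # assumption: return to column 1 and erase everything below the cursor
    out += b"\r\x1b[J"
    if height > 0:
        # force any scrolling to happen now, then climb back up,
        # so nothing scrolls between the save and the restore
        out += b"\n" * height + b"\x1b[%dA" % height
    # save the cursor where it belongs, restore it once everything is drawn
    out += before + b"\x1b7" + after + b"\x1b8"
    return out


print(final_render(b"> ab", b"cd", prev_cursor_row=0, height=1))
```

the newlines are the only part of this output that's allowed to scroll the screen, and they come before the `\x1b7`.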

anyways back to rendering and unicode

the biggest upside of this approach to rendering is that we don't have to care at all about the specifics of text rendering, and we're guaranteed to get the correct results so long as the terminal supports all the features we need

however, this does mean that madeline is... not very portable. i've actually only ever tested it particularly strenuously in foot, and when i tried a few other terminals, it was pretty broken. the linux console doesn't implement the alternate screen, and i'm pretty sure the vt220 i have lying around got so angry at me that it started beeping loudly in indignation at the atrocities being committed

the only unicode-related thing i needed to implement myself was cursor movement - pressing the right arrow key should move the cursor forward one grapheme, not one byte or one codepoint. this means that, for example, if you type "i'm🏳️‍⚧️", press the left arrow key, then type a space, the space will be inserted between "i'm" and 🏳️‍⚧️, rather than going between the U+200d and the ⚧️. because hare doesn't yet have a unicode library, i wrote a small parser for a subset of the unicode data and included it along with an implementation of the unicode text segmentation algorithm

uax #29, the unicode text segmentation algorithm
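to show the idea behind grapheme-wise cursor movement, here's a deliberately tiny approximation in python. it is not uax #29 and not madeline's implementation: it only knows "don't break before a combining mark, variation selector, zero-width joiner, or emoji modifier" and "don't break right after a zero-width joiner", which happens to be enough for the flag example above. `is_extend`, `prev_boundary` and `next_boundary` are made-up names:

```python
import unicodedata

ZWJ = "\u200d"


def is_extend(ch: str) -> bool:
    """rough stand-in for the Extend/ZWJ classes (an approximation)."""
    return (
        ch == ZWJ
        or unicodedata.category(ch) in ("Mn", "Me")  # includes U+FE0F
        or 0x1F3FB <= ord(ch) <= 0x1F3FF             # emoji skin tone modifiers
    )


def prev_boundary(s: str, i: int) -> int:
    """index the cursor lands on when moving left from index i."""
    if i <= 0:
        return 0
    j = i - 1
    # keep walking left while s[j] is glued to whatever comes before it
    while j > 0 and (is_extend(s[j]) or s[j - 1] == ZWJ):
        j -= 1
    return j


def next_boundary(s: str, i: int) -> int:
    """index the cursor lands on when moving right from index i."""
    if i >= len(s):
        return len(s)
    j = i + 1
    # keep walking right while s[j] is glued to what we just passed
    while j < len(s) and (is_extend(s[j]) or s[j - 1] == ZWJ):
        j += 1
    return j


# "i'm" followed by flag, variation selector, zwj, symbol, variation selector
line = "i'm\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"
print(prev_boundary(line, len(line)))
```

a real implementation needs the full rule set and the unicode property tables (hangul syllables, regional indicators, prepend characters and so on), which is why the library ships its own parser for the unicode data.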

ui niceties

madeline provides all the readline-compatible keybindings which i frequently use, and any other keybindings that other people send patches for. i'm planning to provide support for user-configurable keybindings in the future, in addition to a vi mode, but i haven't written the code for either of those yet

in addition to this, madeline provides:

- prompts dynamically generated by arbitrary hare functions (with a convenience function for fixed prompts)

- hints which appear after the user's input and can be completed by moving the cursor past the end of the input (with a built-in function that generates these from history, fish-style)

- a built-in history system which is optimized enough to be able to load my 80k-line shell history nearly instantly

- configurable autocomplete which makes use of a lexer to be able to properly handle items that require escaping
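for a feel of what a fish-style history hint does, here's a minimal python sketch. `hint_from_history` is a hypothetical name and this isn't madeline's api or its actual matching logic. it just assumes the hint is the rest of the most recent history entry that starts with what's been typed:

```python
def hint_from_history(history, line):
    """history is ordered oldest-first; returns the text to show after
    the input, or an empty string if nothing matches."""
    if not line:
        return ""
    for entry in reversed(history):
        if entry.startswith(line) and entry != line:
            return entry[len(line):]
    return ""


history = ["git status", "git stash", "ls"]
print(hint_from_history(history, "git st"))
```

moving the cursor past the end of the input would then append the hint to the line.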

madeline makes light use of styles, limiting itself to only normal and dim+italic styles, the latter of which is used for hints and completions. because of how rendering works, colors are fully supported in prompts (no more needing to fuck around with \1 and \2 in order to fix multi-line inputs!)

conclusion

fuck um i dunno. feel free to check out imrsh, it's emersion's mrsh but with madeline instead of readline, and it's what i've been using as my daily driver for the past while. there's also rc, which is neat and i'd like to switch to it eventually but it doesn't have job control yet and i Really need job control

imrsh

rc

ok bye