tag:blog.sagetheprogrammer.com,2014:/feedSage Griffin2022-01-30T12:51:08-08:00Sean Griffinhttp://blog.sagetheprogrammer.comSvbtle.comtag:blog.sagetheprogrammer.com,2014:Post/una-poema-para-hace2022-01-30T12:51:08-08:002022-01-30T12:51:08-08:00Un poema para "hace"<p>“Hace” is the best word in the world, because it can mean everything. </p>
<p>Want to create something new?<br>
¡Hace that something!</p>
<p>Want to get something done?<br>
¡Lo hace!</p>
<p>Want to cook a meal?<br>
¡Hace that dinner!</p>
<p>Want to talk about the weather?<br>
¡Hace frío!</p>
<p>How many hours ago?<br>
¡Hace horas!</p>
<p>“ser” and “estar” aren’t enough?<br>
¡Hace!</p>
<p>In the words of Kevin: why use many words when few words do the trick?</p>
<p><img src="https://c.tenor.com/IsYdPRq7bjcAAAAC/why-waste-time-when-few-word-do-trick.gif" alt="Why use many word when few word do trick?"></p>
tag:blog.sagetheprogrammer.com,2014:Post/things-i-wish-i-knew-about-assembly2020-08-04T09:49:23-07:002020-08-04T09:49:23-07:00Things I Wish I Knew About Assembly<p>My talk for <a href="https://rustconf.com">RustConf</a> this year includes a technical deep dive into the MissingNo glitch from Pokemon Red and Blue. It was important to me to really understand not just what happened in this glitch, but why it happened. This meant I had to spend a lot of time over the last year reading through a disassembly of the game.</p>
<p>While I had a very rudimentary understanding of x86 assembly going into this, I had never seen this assembly syntax before. I also didn’t know much about some of the intricacies of assembly programming. As a result I went down a few rabbit holes of misinformation, and had to learn a lot about the GameBoy’s assembly. Here are some things I learned that would have saved me time had I known them up front. </p>
<p>The disassembly I used was built with the <a href="https://github.com/rednex/rgbds">RGBDS assembler</a>. Some of this may be specific to that toolchain, some of it may be specific to the GameBoy hardware, and some of it may be specific to z80 assembly. Unfortunately I don’t know enough about other z80 variants or other toolchains to say for sure.</p>
<p>The GameBoy used a variant of z80 assembly. It removed all of the I/O instructions (the GameBoy only used memory mapped I/O), as well as the <code class="prettyprint">i</code>, <code class="prettyprint">r</code>, <code class="prettyprint">ix</code>, and <code class="prettyprint">iy</code> registers, and a few other instructions. In their place it added a few extra instructions for manipulating the stack pointer, more easily writing to the addresses used for I/O, and a few other helpers. Oh, and just for fun: incrementing or decrementing a 16 bit register would corrupt sprite memory if the value in the register was within a certain range.</p>
<h1 id="instructions-require-specific-operands_1">Instructions Require Specific Operands <a class="head_anchor" href="#instructions-require-specific-operands_1">#</a>
</h1>
<p>I was pretty perplexed when I came across some code like this:</p>
<pre><code class="prettyprint lang-asm">ld a, [base + 0]
ld b, a
ld a, [base + 1]
ld c, a
ld a, [base + 2]
ld d, a
</code></pre>
<p>You’d think they could just write this instead:</p>
<pre><code class="prettyprint lang-asm">ld b, [base + 0]
ld c, [base + 1]
ld d, [base + 2]
</code></pre>
<p>But the <code class="prettyprint">ld</code> instruction can’t just take any of the logical combinations of source/destination that you might expect. For indirect loads, either the destination must be <code class="prettyprint">a</code>, or the source must be <code class="prettyprint">[hl]</code>. That means that if you want to read from some arbitrary address into some arbitrary register, you must first either store the address in <code class="prettyprint">hl</code>, or store the value in <code class="prettyprint">a</code>. The code could have also been written like this:</p>
<pre><code class="prettyprint lang-asm">ld hl, base + 0
ld b, [hl]
ld hl, base + 1
ld c, [hl]
ld hl, base + 2
ld d, [hl]
</code></pre>
<p>A side effect of this is that <code class="prettyprint">a</code> and <code class="prettyprint">hl</code> are rarely used as general purpose registers. Whatever was in them was extremely likely to be overwritten, since they were needed as intermediates for so many operations. As a result, data was often loaded from memory in weird places, or even reloaded from memory multiple times, purely due to the limited number of registers available. As you might expect, this made bugs more common in code that couldn’t fit everything it needed for its work in 4 registers.</p>
<h1 id="code-size-was-often-more-important-than-execu_1">Code size was often more important than execution speed <a class="head_anchor" href="#code-size-was-often-more-important-than-execu_1">#</a>
</h1>
<p>For a lot of games, doing everything they needed in ~16ms was no problem. Or dropping the frame rate to 30 fps was acceptable. But code size had a very real cost associated with it. The smallest – and therefore cheapest – cartridges only had 32KiB of ROM (though I think the smallest games actually shipped were 256KiB; nobody used the smallest sizes available).</p>
<p>You could have up to 4MiB of ROM for your game if you needed it, which by today’s standards is absurdly small. But most folks tried to stay even lower than that. Whenever you needed more ROM for your program, you had to double the ROM on the cartridge. And that meant that producing your game was that much more expensive, cutting directly into the profits your game could make.</p>
<p>As a result, folks often optimized for code size at the expense of execution speed – though there are absolutely some exceptions such as audio processing code, or when doing complex visual effects like parallax.</p>
<h1 id="random-instructions-were-used-as-an-optimizat_1">Random instructions were used as an optimization <a class="head_anchor" href="#random-instructions-were-used-as-an-optimizat_1">#</a>
</h1>
<p>There were times I’d see a seemingly random instruction with no apparent purpose. It turns out that in some cases, when you want to do something specific, there’s another instruction you can use that is smaller and faster. Here are some examples:</p>
<pre><code class="prettyprint lang-asm">; load 0 into a. Takes 2 cycles and 2 bytes
ld a, 0
; Takes 1 cycle and 1 byte, but does not preserve flags
xor a
; set z if a is equal to 0. Takes 2 cycles and 2 bytes
cp 0
; Takes 1 cycle and 1 byte
or a
; Also takes 1 cycle and 1 byte
and a
; set z if a is equal to 1. Takes 2 cycles and 2 bytes
cp 1
; Takes 1 cycle and 1 byte. Can also be used on registers other than `a`
dec a
</code></pre>
<h1 id="string-literals-could-mean-anything_1">String literals could mean anything! <a class="head_anchor" href="#string-literals-could-mean-anything_1">#</a>
</h1>
<p>Games would sometimes use a custom text encoding. Just because you see something in a string literal doesn’t mean it assembles to the bytes those characters would represent in other languages. For example, in Pokemon, the character “@” wasn’t printed. In the <a href="https://github.com/pret/pokered">pokered disassembly</a>, they map that character to the byte <code class="prettyprint">0x50</code>, which is the “end of name” control character. And any time you see “<trainer>” in the same disassembly, that is actually just the byte <code class="prettyprint">0x5D</code>.</p>
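<p>To make this concrete, here’s a rough sketch in Rust of how such a mapping works. The <code class="prettyprint">0x50</code> “end of name” byte for “@” is the one from pokered; the byte values for the letters are invented for illustration, not taken from any real charmap.</p>
<pre><code class="prettyprint lang-rust">/// Encode a string with a made-up custom charmap, where "@" is not a
/// printable character but the 0x50 "end of name" terminator.
fn encode(text: &str) -> Vec<u8> {
    text.chars()
        .map(|c| match c {
            '@' => 0x50,                          // "end of name" control character
            'A'..='Z' => 0x80 + (c as u8 - b'A'), // hypothetical letter block
            _ => 0x00,                            // left unmapped in this sketch
        })
        .collect()
}
</code></pre>
<p>An assembler using this charmap would turn the literal <code class="prettyprint">"RED@"</code> into the bytes <code class="prettyprint">0x91 0x84 0x83 0x50</code> – nothing like its ASCII representation.</p>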
<h1 id="code-classprettyprintccode-means-multiple-thi_1">
<code class="prettyprint">c</code> means multiple things <a class="head_anchor" href="#code-classprettyprintccode-means-multiple-thi_1">#</a>
</h1>
<p>I’m a little embarrassed by how long this one tripped me up. One of the general purpose registers is called <code class="prettyprint">c</code>, but <code class="prettyprint">c</code> can also mean the “carry” flag. Which one the character <code class="prettyprint">c</code> in your assembly refers to depends on the context. <code class="prettyprint">ld c, 1</code> will load 1 into the <code class="prettyprint">c</code> register, but <code class="prettyprint">jp c, $F00</code> will jump to <code class="prettyprint">0xF00</code> only if the carry flag is set. </p>
<p>But these <em>are</em> separate places, and instructions which set or reset the carry flag will not affect the value of the <code class="prettyprint">c</code> register.</p>
<h1 id="bank-switching-is-like-a-more-primitive-form_1">Bank switching is like a more primitive form of segmentation <a class="head_anchor" href="#bank-switching-is-like-a-more-primitive-form_1">#</a>
</h1>
<p>The GameBoy’s CPU was an 8 bit processor with a 16 bit address bus. Half of that 64KiB address space was used for your cartridge’s ROM. So unless your entire game fit in 32KiB, you needed a way to access more ROM than the 32KiB of address space mapped to it. To do this, the GameBoy used a system called bank switching.</p>
<p>Modern operating systems use <a href="https://en.wikipedia.org/wiki/Memory_segmentation">segmentation</a> or <a href="https://en.wikipedia.org/wiki/Paging">paging</a> to allow more physical memory than can be represented by the available address space. However, this only helps when you have multiple processes with separate memory spaces. Each individual process can still only access as much memory as can be fit in a pointer.</p>
<p>In these systems, a few bits of the pointer represent a segment or index into a page table, while the remaining bits represent an offset. So a program’s pointer may not be the same as the physical address it represents, but every physical address has a distinct pointer. With bank switching, we instead have the same pointer mean multiple things.</p>
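<p>The segment-plus-offset idea can be sketched in a few lines (a hypothetical 16 bit address with 4KiB pages – the widths here are invented for illustration):</p>
<pre><code class="prettyprint lang-rust">const PAGE_BITS: u32 = 12; // 4KiB pages, an invented size
const PAGE_MASK: u16 = (1 << PAGE_BITS) - 1;

/// Split a 16 bit virtual address into (page table index, offset).
fn split(addr: u16) -> (u16, u16) {
    (addr >> PAGE_BITS, addr & PAGE_MASK)
}

/// Recombine them. Every physical address has exactly one such pointer.
fn join(page: u16, offset: u16) -> u16 {
    (page << PAGE_BITS) | offset
}
</code></pre>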
<p>The first 16KiB of address space (0x0000-0x3FFF) always represented “bank 0”. They directly mapped to that physical address on the cartridge. But the next 16KiB (0x4000-0x7FFF) could point to anywhere. By changing the active “bank” number, this address space could point to different sections of the cartridge’s ROM.</p>
<p>While this meant that you could have much more code than would otherwise be possible, it also meant that instructions dealing with pointers, like <code class="prettyprint">jp</code>, <code class="prettyprint">call</code>, <code class="prettyprint">ret</code>, or even indirect loads might have to worry about switching to the appropriate bank. This meant that you essentially had to invent your own wide pointer (though it could at least be 3 bytes instead of 4, since you could have at most 256 banks). Additionally, it was common to have your own calling convention that preserved the current bank number as well as the return address.</p>
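<p>Such a wide pointer can be sketched as three bytes – one for the bank, two for the address. (The type and its layout here are my own invention, not from any particular game.)</p>
<pre><code class="prettyprint lang-rust">/// A hypothetical 3 byte "far pointer": a bank number plus a 16 bit
/// address. With at most 256 banks, one byte for the bank is enough.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FarPtr {
    bank: u8,
    addr: u16,
}

impl FarPtr {
    /// Pack into 3 bytes, e.g. to store in a jump table in ROM.
    fn to_bytes(self) -> [u8; 3] {
        [self.bank, (self.addr >> 8) as u8, self.addr as u8]
    }

    /// Unpack the 3 bytes back into a bank and address.
    fn from_bytes(b: [u8; 3]) -> Self {
        FarPtr { bank: b[0], addr: ((b[1] as u16) << 8) | b[2] as u16 }
    }
}
</code></pre>
<p>A “far call” convention would then switch to <code class="prettyprint">bank</code>, jump to <code class="prettyprint">addr</code>, and restore the previous bank on return.</p>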
<p>Just to make things a bit more horrifying, the way you switched banks was incredibly funky as well. Given that everything on the GameBoy used memory mapped I/O, you might expect there to be a designated byte of address space that you write the active RAM bank and active ROM bank to. But that’s not the case.</p>
<p>Instead, with the common MBC1 controller, you would write a byte anywhere in the address space 0x2000-0x3FFF to pick which ROM bank you wanted, and write 1 or 0 anywhere in 0x6000-0x7FFF to choose between its two banking modes. Since both of those address ranges are ROM, the write can’t actually store anything at that address. Instead it gets intercepted by the memory bank controller on your cartridge, which performs the switch. There was no way to read the active bank number back.</p>
<p>This also interacts with the quirk we talked about earlier where you can’t use just any combination of operands with <code class="prettyprint">ld</code>. If you want to switch to bank 3, you can’t just write <code class="prettyprint">ld [$2000], 3</code>. Either the address has to be in <code class="prettyprint">hl</code>, or the value has to be in <code class="prettyprint">a</code>. So no matter what, switching banks requires clobbering a register as well!</p>
<h1 id="the-stack-was-a-luxury-not-to-be-overused_1">The stack was a luxury, not to be overused <a class="head_anchor" href="#the-stack-was-a-luxury-not-to-be-overused_1">#</a>
</h1>
<p>The GameBoy only had 8KiB of general purpose RAM (called wram) available to the game. You could extend this by including RAM on the cartridge (called sram), but because cartridge RAM had to deal with bank switching, it was <em>much</em> more difficult to use. As a result, memory in general was very scarce.</p>
<p>By default the GameBoy would set the stack pointer to the top of a 127 byte section of RAM that was otherwise unused, but most programs would just set it to the top of wram at boot, so any unused memory would be available for the stack. For Pokemon Blue this still only gave them a 207 byte stack, so it was used only when absolutely necessary.</p>
<p>This also resulted in an interesting calling convention where all registers were callee preserved, but they were preserved in a specific spot in memory rather than on the stack. This meant that such functions couldn’t safely call other functions, since the second call would overwrite the registers the first had saved. This likely would have been a soft constraint anyway, since a stack overflow would be so easy to achieve.</p>
<h1 id="conclusion_1">Conclusion <a class="head_anchor" href="#conclusion_1">#</a>
</h1>
<p>This is just a random smattering of things that tripped me up while going through the disassembly of Pokemon Blue. Developing for the GameBoy must have been an extremely difficult task, and I have a lot of respect for the developers who had to keep all of this straight while developing. If you ever decide to go source diving into your favorite GameBoy games, hopefully this saves you some time.</p>
tag:blog.sagetheprogrammer.com,2014:Post/neat-rust-tricks-passing-rust-closures-to-c2019-11-25T11:38:37-08:002019-11-25T11:38:37-08:00Neat Rust Tricks: Passing Rust Closures to C<p>One of Rust’s biggest selling points is how well it can interoperate with C. It’s able to call into C libraries and produce APIs that C can call into with very little fuss. However, when dealing with sufficiently complex APIs, mismatches between language concepts can become a problem. In this post we’re going to look at how to handle callback functions when working with C from Rust.</p>
<p>Our hypothetical library has a <code class="prettyprint">Widget</code> struct, which periodically generates events. We want to take a callback function from users that is called whenever one of these events occurs. More concretely, we want to provide this signature:</p>
<pre><code class="prettyprint lang-rust">impl Widget {
fn register_callback(&mut self, callback: impl FnMut(Event) + 'static) {
// ...
}
}
</code></pre>
<p>Unlike Rust, C has no concept of closures. Instead it has function pointers. To put this in Rust terms, you can take <code class="prettyprint">fn(Event)</code>, but not <code class="prettyprint">impl FnMut(Event)</code>. Function pointers can only work with the arguments passed to them (or global state like <code class="prettyprint">static</code> variables), while closures can capture (or “close over”) any arbitrary state in the environment they were created in. Because of this, C APIs often let you pass a “data pointer” when registering a callback, and then pass that pointer to your function whenever it’s called. For example, the C version of that signature might look like this:</p>
<pre><code class="prettyprint lang-c">void widget_register_callback(
widget_t *widget,
void *data,
void (*callback)(void*, event_t)
);
</code></pre>
<p>If the callback is long lived, or the ownership semantics are complex, the C API may also have you provide a destructor function:</p>
<pre><code class="prettyprint lang-c">void widget_register_callback(
widget_t *widget,
void *data,
void (*callback)(void*, event_t),
void (*destroy)(void*)
);
</code></pre>
<p>If you’re not familiar with C’s syntax, here’s the equivalent signature in Rust:</p>
<pre><code class="prettyprint lang-rust">fn widget_register_callback(
widget: Widget,
data: *mut (),
callback: fn(*mut (), Event),
destroy: fn(*mut ()),
);
</code></pre>
<p>So instead of taking a closure which automatically captures whatever state it needs to, in C we need to manually shove any state we want to keep into a struct, and pass that along with our callback function. Bridging these two APIs in Rust is surprisingly easy. The language does most of the work for us. The way we handle this is to pass the actual closure as our data pointer. Let’s look more concretely at what this means:</p>
<pre><code class="prettyprint lang-rust">fn register_c_callback<F>(widget: &mut ffi::widget_t, callback: F)
where
F: FnMut(ffi::event_t) + 'static,
{
// Safety: We've carefully reviewed the docs for the C function
// we're calling, and the invariants we need to uphold are:
// - widget is a valid pointer
// - We're using Rust references so we know this is true.
// - data is valid until its destructor is called
// - We've added a `'static` bound to ensure that is true.
let data = Box::into_raw(Box::new(callback));
unsafe {
ffi::widget_register_callback(
widget,
data as *mut _,
call_closure::<F>,
drop_box::<F>,
);
}
}
// Safety: The pointer passed to this function must be
// a valid non-null pointer of type `F`. We've carefully
// reviewed the documentation for our C lib and know
// that is the case.
unsafe extern "C" fn call_closure<F>(
data: *mut libc::c_void,
event: ffi::event_t,
)
where
F: FnMut(ffi::event_t),
{
let callback_ptr = data as *mut F;
let callback = &mut *callback_ptr;
callback(event);
}
unsafe extern "C" fn drop_box<T>(data: *mut libc::c_void) {
Box::from_raw(data as *mut T);
}
</code></pre>
<p>There’s a lot going on here, so let’s look at each piece one at a time.</p>
<pre><code class="prettyprint lang-rust">fn register_c_callback<F>(widget: &mut ffi::widget_t, callback: F)
where
F: FnMut(ffi::event_t) + 'static,
</code></pre>
<p>The signature of this function is pretty standard for Rust. We don’t care if the function has mutable state, so we take <code class="prettyprint">FnMut</code> instead of <code class="prettyprint">Fn</code>. The <code class="prettyprint">'static</code> bound is needed since the callback given to us is going to be called after <code class="prettyprint">register_c_callback</code> returns. This assumes that we need the function to be valid for some unknown period of time, which is the most common case in my experience. It’s possible to have less strict bounds for this, but it’s hard to do safely<sup id="fnref1"><a href="#fn1">1</a></sup>. For this example, we’re working with a single threaded application. If we were passing this to an API that might call it from another thread, we would need to add a <code class="prettyprint">Send</code> bound as well.</p>
<pre><code class="prettyprint lang-rust">let data = Box::into_raw(Box::new(callback));
</code></pre>
<p>Since we need the closure to live for some unknown period of time, we need to move it onto the heap. We then immediately call <code class="prettyprint">Box::into_raw</code> on it, which will give us a raw pointer to the closure, and prevent the memory from being de-allocated.</p>
<pre><code class="prettyprint lang-rust">ffi::widget_register_callback(
widget,
data as *mut _,
call_closure::<F>,
drop_box::<F>,
);
</code></pre>
<p>Here we’re actually calling the underlying C function. Since the type of <code class="prettyprint">data</code> is <code class="prettyprint">*mut F</code>, and our C API expects <code class="prettyprint">void *</code>, we need to explicitly cast it. Finally, we’re passing our two function pointers. Both of these functions are generic, so we’re giving it the concrete type of the closure it’s calling. This is one of the few times you’ll ever see an explicit <a href="https://turbo.fish">turbofish</a> without actually calling the function.</p>
<pre><code class="prettyprint lang-rust">unsafe extern "C" fn call_closure<F>(
data: *mut libc::c_void,
event: ffi::event_t,
)
where
F: FnMut(ffi::event_t),
</code></pre>
<p>Here we declare the first of our two functions we’re passing to C. Since it’s meant to be called from C code, the function is defined as <code class="prettyprint">extern "C"</code> to tell the Rust compiler to use C’s ABI here. Usually <code class="prettyprint">extern "C"</code> functions also need <code class="prettyprint">#[no_mangle]</code> to disable Rust’s automatic name mangling. However, this function is never called by name<sup id="fnref2"><a href="#fn2">2</a></sup>. We give it to C by passing function pointers directly, so <code class="prettyprint">#[no_mangle]</code> isn’t needed here. Even though this is a function meant to be called from C, we still need to mark it as <code class="prettyprint">unsafe</code>, otherwise safe Rust could trigger undefined behavior by passing <code class="prettyprint">ptr::null_mut()</code>.</p>
<pre><code class="prettyprint lang-rust">let callback_ptr = data as *mut F;
let callback = &mut *callback_ptr;
callback(event);
</code></pre>
<p>Since our C API wanted a function that takes <code class="prettyprint">void *</code> as its first argument, that’s how we declared the function<sup id="fnref3"><a href="#fn3">3</a></sup>. This means that the first thing we need to do is cast it to a pointer of the right type. We then turn that raw pointer into a Rust reference. Finally, with a <code class="prettyprint">&mut F</code>, we can actually call the Rust closure with our event.</p>
<pre><code class="prettyprint lang-rust">unsafe extern "C" fn drop_box<T>(data: *mut libc::c_void) {
Box::from_raw(data as *mut T);
}
</code></pre>
<p>And lastly, we have our destructor function. <code class="prettyprint">Box::from_raw</code> recreates a <code class="prettyprint">Box<T></code> from our raw pointer. Once we have a regular Rust <code class="prettyprint">Box</code>, it’ll automatically be dropped when this function returns, freeing the underlying memory, and calling the destructors of any values captured by our closure.</p>
<p>It’s surprisingly little code for such a complex operation<sup id="fnref4"><a href="#fn4">4</a></sup>. And true to Rust, this is done as a zero cost abstraction. This is all done using <a href="https://www.youtube.com/watch?v=wxPehGkoNOw">monomorphized generic functions</a>, so there’s no indirection or allocation beyond what’s absolutely required.</p>
<p>And this composes <em>really well</em>. At this point any additional code we write can operate purely in safe Rust. For example, if we want to wrap that <code class="prettyprint">ffi::event_t</code> in a Rust abstraction, we can just add more closures!</p>
<pre><code class="prettyprint lang-rust">impl Widget {
fn register_callback(&mut self, mut callback: impl FnMut(Event) + 'static) {
register_c_callback(
&mut self.ffi_widget,
move |ffi_event| {
let event = Event::from_raw(ffi_event);
callback(event);
}
);
}
}
</code></pre>
<p>Your closures can even have mutable state with no fuss!</p>
<pre><code class="prettyprint lang-rust">let mut x = 0;
widget.register_callback(move |_| {
x += 1;
println!("I was called {} times", x);
});
</code></pre>
<p>This will print an incrementing number every time it’s called as you’d expect. With this little bit of code, we’ve got a full bridge between Rust closures and our C API. You can do anything you would be able to do with the same API written natively in Rust, and consumers of this code never need to know the difference.</p>
<p>All of the code in this article is based on real code from Diesel, which allows you to use Rust closures as the implementation of custom SQL functions on SQLite. You can find the code <a href="https://github.com/diesel-rs/diesel/blob/c127361e1e20557706c18c314702da3d230d4de2/diesel/src/sqlite/connection/raw.rs#L67-L109">here</a> and <a href="https://github.com/diesel-rs/diesel/blob/c127361e1e20557706c18c314702da3d230d4de2/diesel/src/sqlite/connection/raw.rs#L142-L204">here</a>.</p>
<p>When I first started working on this feature for Diesel, I knew I wanted an API that let you use a closure, and I expected making that work with SQLite’s C library to be much harder than it actually was. But ultimately the code required boils down to a bit of pointer juggling wrapped in an <code class="prettyprint">extern "C" fn</code>. The fact that the language can handle this with so little work really shows how powerful Rust’s abstractions are, and how well they can compose into unexpected use cases.</p>
<div class="footnotes">
<hr>
<ol>
<li id="fn1">
<p>To have any shorter requirement on our closure than <code class="prettyprint">'static</code>, we need to know exactly how long it needs to be valid for. This will most likely look something like “the closure given must live until some other function is called”. Representing this in Rust usually means calling whatever that second function is in the same place we register the callback. e.g. <a href="#fnref1">↩</a></p>
<pre><code class="prettyprint lang-rust">fn process_with_callback<'a, F>(
widget: &'a mut ffi::widget_t,
callback: F,
)
where
F: FnMut(ffi::event_t) + 'a,
{
register_c_callback(widget, callback);
process_events(widget);
}
</code></pre>
</li>
<li id="fn2">
<p>C code actually couldn’t call these functions by name even if we wanted it to. Since the functions are generic, they don’t represent a single symbol in the final binary, but one per concrete type passed to the function. That doesn’t mean you can’t use generic functions in C FFI, but it does mean you always have to give the concrete type from Rust code, and C can never call the functions by name. <a href="#fnref2">↩</a></p>
</li>
<li id="fn3">
<p>In theory we could have declared this function as taking <code class="prettyprint">*mut F</code> directly, or even possibly <code class="prettyprint">&mut F</code>. However, this would mean that <code class="prettyprint">call_closure::<F></code> is the wrong type, and needs to get cast to the right one (which we might be doing anyway since some C APIs take <code class="prettyprint">void*</code> instead of function pointers with an explicit signature). Casting our function pointer will compile, but accidentally casting <code class="prettyprint">*mut c_void</code> to a possibly unsized type will fail. <a href="#fnref3">↩</a></p>
</li>
<li id="fn4">
<p>One thing I’ve left out here is panic handling. Unwinding through FFI is currently undefined behavior in Rust. You may wish to wrap all of this in <code class="prettyprint">catch_unwind</code> and either report the error through whatever mechanisms the C API you’re interacting with gives or abort the process. However, there is some active discussion around precisely what the rules should be here! If you’re interested, check out the <a href="https://github.com/rust-lang/project-ffi-unwind/">ffi-unwind project group</a>. <a href="#fnref4">↩</a></p>
</li>
</ol>
</div>
tag:blog.sagetheprogrammer.com,2014:Post/thoughts-on-arbitrary-pagination2019-09-18T11:00:26-07:002019-09-18T11:00:26-07:00Thoughts on Arbitrary Pagination<p>Pagination is the act of breaking a data set into multiple pages to limit the amount of data that has to be processed and sent by a server at once. We’re going to be changing how pagination works on crates.io, and I wanted to share some musings about the issues with supporting this as a generic abstraction. While I’m going to be talking about some PostgreSQL internals in this article, the general ideas presented apply to any SQL database.</p>
<h1 id="quotsimplequot-pagination_1">“Simple” Pagination <a class="head_anchor" href="#quotsimplequot-pagination_1">#</a>
</h1>
<p>The most common form of pagination uses page numbers. This is done by having numbered page links shown towards the bottom of the UI, and including <code class="prettyprint">?page=5</code> in the URL. Often there are also links for next page, previous page, first page, and last page. The UI might look like this.</p>
<p><a href="https://svbtleusercontent.com/f5SmCAs8Te61nM6j1mi2Ua0xspap.png"><img src="https://svbtleusercontent.com/f5SmCAs8Te61nM6j1mi2Ua0xspap_small.png" alt="Screen Shot 2019-08-20 at 10.02.27 AM.png"></a></p>
<p>This form of pagination is implemented on the server by adding an offset to the query. The total is also needed to know how many pages there actually are. For example, the query to get the 5th page of crates might look like this:</p>
<pre><code class="prettyprint lang-sql">SELECT *, COUNT(*) OVER () FROM (
SELECT * FROM crates
ORDER BY name ASC
) t
LIMIT 10
OFFSET 40
</code></pre>
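<p>The arithmetic behind those numbers is simple: with a 1-indexed page number, the offset is <code class="prettyprint">(page - 1) * per_page</code>. A quick sketch (the function name is mine):</p>
<pre><code class="prettyprint lang-rust">/// Compute the (limit, offset) pair for a 1-indexed page number.
/// Page 5 with 10 records per page gives LIMIT 10 OFFSET 40,
/// matching the query above.
fn page_to_limit_offset(page: u64, per_page: u64) -> (u64, u64) {
    (per_page, (page - 1) * per_page)
}
</code></pre>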
<p>This is by far the most common form of pagination, but it has a lot of issues. The first is that it’s inconsistent. Any rows added to or deleted from pages before the one you’re currently on will affect the results you get back, leading to skipped results or results being shown twice. However, the bigger issue for crates.io is that using <code class="prettyprint">OFFSET</code> for pagination doesn’t scale.</p>
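<p>The inconsistency is easy to demonstrate in miniature with offset pagination over an in-memory list (a toy model, not real crates.io code):</p>
<pre><code class="prettyprint lang-rust">/// Offset pagination over a sorted, in-memory "table".
fn page<'a>(rows: &[&'a str], offset: usize, limit: usize) -> Vec<&'a str> {
    rows.iter().skip(offset).take(limit).copied().collect()
}
</code></pre>
<p>If page 1 (offset 0, limit 2) of <code class="prettyprint">["a", "b", "c", "d"]</code> returns <code class="prettyprint">a, b</code>, and then <code class="prettyprint">a</code> is deleted, page 2 (offset 2) of the remaining rows returns only <code class="prettyprint">d</code> – the reader never sees <code class="prettyprint">c</code>.</p>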
<h1 id="the-issue-with-code-classprettyprintoffsetcod_1">The issue with <code class="prettyprint">OFFSET</code> <a class="head_anchor" href="#the-issue-with-code-classprettyprintoffsetcod_1">#</a>
</h1>
<p>If you don’t care about the technical details of why <code class="prettyprint">OFFSET</code> is slow, just know that <code class="prettyprint">OFFSET</code> takes linear<sup id="fnref1"><a href="#fn1">1</a></sup> time. Discussion of alternatives and their implementation starts <a href="#better-pagination_1">here</a>.</p>
<p>Paginated data needs to have some consistent ordering. To make that ordering happen quickly, we will always want to have some index on the data being sorted. The default index type (and the only one that can be used for ordering<sup id="fnref2"><a href="#fn2">2</a></sup>) is B-Tree. A <a href="https://en.wikipedia.org/wiki/B-tree">B-Tree</a> is a generalization of a <a href="https://en.wikipedia.org/wiki/Binary_search_tree">Binary Search Tree</a>. The differences between them are mostly related to the costs of memory or disk access, which aren’t relevant to the discussion at hand. For the sake of simplicity, we’re going to pretend these indexes use perfectly balanced binary search trees, as it doesn’t change the performance characteristics for the purposes of our discussion.</p>
<p>Balanced binary search trees and B-Trees are both very efficient for finding a specific value (or the first value that is greater or less than some specific value). At each node, we know which subtree holds the values greater than, and which holds the values less than, the value at that node. This means that finding the first crate published after a given date takes at most <code class="prettyprint">log(total_crate_count)</code> steps, as each step halves the data we have to search.</p>
<p>But we’re not looking for some specific known value, we’re looking for the nth record in the set. A binary tree offers us no help here! In order to find the nth element, we have to traverse the whole tree starting from the lowest element, counting how many nodes we’ve seen until we reach the offset<sup id="fnref3"><a href="#fn3">3</a></sup>. So for the purposes of calculating <code class="prettyprint">OFFSET</code>, the only benefit the index gives us is starting from a pre-sorted list. However, for that benefit, actually accessing each row is slower<sup id="fnref4"><a href="#fn4">4</a></sup>. In fact, the difference is substantial enough that the cost of iterating over the index eventually becomes greater than the cost of just sorting the table and iterating over that. We can see this reflected in the query plans and execution times as we increase the offset<sup id="fnref5"><a href="#fn5">5</a></sup>.</p>
<pre><code class="prettyprint lang-plain">[local] sean@cargo_registry=# explain analyze select * from crates order by name asc limit 100 offset 0;
                                QUERY PLAN
------------------------------------------------------------------
 Limit  (cost=0.08..32.84 rows=100 width=1044) (actual time=0.056..0.761 rows=100 loops=1)
   ->  Index Scan using index_crates_name_ordering on crates  (cost=0.08..9475.16 rows=28922 width=1044) (actual time=0.055..0.751 rows=100 loops=1)
 Planning Time: 0.978 ms
 Execution Time: 0.785 ms
(4 rows)

[local] sean@cargo_registry=# explain analyze select * from crates order by name asc limit 100 offset 10000;
                                QUERY PLAN
------------------------------------------------------------------
 Limit  (cost=3276.14..3308.90 rows=100 width=1044) (actual time=20.237..20.377 rows=100 loops=1)
   ->  Index Scan using index_crates_name_ordering on crates  (cost=0.08..9475.08 rows=28922 width=1044) (actual time=0.010..20.018 rows=10100 loops=1)
 Planning Time: 0.571 ms
 Execution Time: 20.405 ms
(4 rows)

[local] sean@cargo_registry=# explain analyze select * from crates order by name asc limit 100 offset 18000;
                                QUERY PLAN
------------------------------------------------------------------
 Limit  (cost=4304.39..4304.44 rows=100 width=1044) (actual time=100.250..100.272 rows=100 loops=1)
   ->  Sort  (cost=4295.39..4309.85 rows=28922 width=1044) (actual time=97.291..99.714 rows=18100 loops=1)
         Sort Key: name
         Sort Method: quicksort  Memory: 29037kB
         ->  Seq Scan on crates  (cost=0.00..3866.77 rows=28922 width=1044) (actual time=0.009..14.576 rows=28922 loops=1)
 Planning Time: 0.121 ms
 Execution Time: 101.942 ms
(7 rows)
</code></pre>
<p>Even though this query is nearly instant when fetching early rows, the execution time grows rapidly as <code class="prettyprint">OFFSET</code> increases, and the query eventually becomes unacceptably slow for a high traffic endpoint. This isn’t going to be a problem for all data sets. Maybe the data you’re paginating over is small enough that you can offset to the end quickly. Maybe nobody ever visits page 1800. But if you’ve got an API that is routinely crawled (which might just be googlebot indexing your site), you’re likely to run into this problem sooner or later.</p>
<p>This can lead to some baffling graphs where the execution time of a query appears to be correlated with how frequently it’s run (top graph is executions per minute, bottom is average execution time – each bar represents 1 hour):</p>
<p><a href="https://svbtleusercontent.com/xnvgRCd6UqiDeox6cg17ST0xspap.png"><img src="https://svbtleusercontent.com/xnvgRCd6UqiDeox6cg17ST0xspap_small.png" alt="Screen Shot 2019-08-20 at 2.52.12 PM.png"></a></p>
<p>In this case the spikes in execution count are from the API being crawled, meaning we’re getting more requests where we have a high <code class="prettyprint">OFFSET</code>, causing an increase in the average execution time.</p>
<h1 id="better-pagination_1">Better Pagination <a class="head_anchor" href="#better-pagination_1">#</a>
</h1>
<p>Now that we’ve seen why traditional pagination is slow, let’s talk about what we can do instead. The technique we’re going to discuss has a lot of names. I’ve seen it called seek based pagination, cursor pagination, keyset pagination, and more. I’m going to refer to it as seek based pagination, as I think it’s the most appropriate name<sup id="fnref6"><a href="#fn6">6</a></sup>.</p>
<p>Regardless of the name, the basic idea is the same. Instead of having numbered pages, we only have links for next and previous page. In some contexts (such as infinite scroll or “load more” pagination), we don’t even need a previous page link. The UI might look like this:</p>
<p><a href="https://svbtleusercontent.com/fAZia7pQHKERT65TkaSRaV0xspap.png"><img src="https://svbtleusercontent.com/fAZia7pQHKERT65TkaSRaV0xspap_small.png" alt="Screen Shot 2019-08-20 at 3.39.53 PM.png"></a></p>
<p>The basic idea behind seek based pagination is that instead of having numerical pages, your next/previous page links reference the first/last result in the set on the current page. So instead of “Next Page” going to <code class="prettyprint">?page=5</code>, it would instead go to <code class="prettyprint">?after=643</code>, where <code class="prettyprint">643</code> is the ID of the last thing displayed on the page.</p>
<p>To implement this form of pagination, your query needs to change slightly. You still use <code class="prettyprint">LIMIT</code> to make sure you are only retrieving one page of records. However, with seek pagination, instead of using <code class="prettyprint">OFFSET</code>, you will filter the records to only include those after some known value. So if you were paginating by id, it’d be <code class="prettyprint">WHERE id > {last id on previous page}</code>. For name it’d be <code class="prettyprint">WHERE name > {last name on previous page}</code>.</p>
<p>Many implementations I’ve seen have this field be the actual value you’re sorting by, but I don’t think that’s necessary. You can substitute <code class="prettyprint">name > 'name of crate 1234'</code> with <code class="prettyprint">name > (SELECT name FROM crates WHERE id = 1234)</code> with virtually no change in performance characteristics, as long as the column you’re searching by is the primary key or has a unique index. I do think there are a few good use cases for taking input other than the primary key<sup id="fnref7"><a href="#fn7">7</a></sup>, but in general taking the public identifier for the row as your input is the simplest and most flexible solution.</p>
<p>If we only ever sorted by name, we’d be pretty much done here. Our abstractions over pagination mostly exist to calculate offset, and help grab the total from the query. Seek based pagination doesn’t need either of those things, so we could possibly remove that abstraction entirely. It would also be fairly simple to write an abstraction over “here’s a column, a table, and an id please paginate”. Either way, the implementation would be fairly simple. This is where most literature I’ve read on this subject ends.</p>
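<p>To make the simple case concrete, here is a minimal sketch of single-column seek pagination, using SQLite from Python. The <code class="prettyprint">crates</code> table, its contents, and the helper names here are illustrative stand-ins, not the real schema:</p>

```python
# Minimal sketch of single-column seek pagination (SQLite stand-in schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crates (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO crates (id, name) VALUES (?, ?)",
    [(i, f"crate-{i:04d}") for i in range(1, 501)],
)

PER_PAGE = 100

def first_page():
    return conn.execute(
        "SELECT id, name FROM crates ORDER BY name LIMIT ?", (PER_PAGE,)
    ).fetchall()

def page_after(last_id):
    # No OFFSET: filter to rows whose name sorts after the row identified
    # by last_id; the subselect looks that name up by primary key.
    return conn.execute(
        """SELECT id, name FROM crates
           WHERE name > (SELECT name FROM crates WHERE id = ?)
           ORDER BY name LIMIT ?""",
        (last_id, PER_PAGE),
    ).fetchall()

page1 = first_page()
page2 = page_after(page1[-1][0])      # i.e. "?after=<last id on page 1>"
print(page2[0][1])                    # crate-0101
```

The seek query only ever walks one page worth of the index, no matter how deep into the result set the reader is.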
<h1 id="supporting-more-complex-ordering_1">Supporting More Complex Ordering <a class="head_anchor" href="#supporting-more-complex-ordering_1">#</a>
</h1>
<p>The first issue comes from the fact that we sometimes sort by multiple columns. For example, on the crates search page, we <em>always</em> sort by “name exactly matches search query” before anything else, since we want to show any crates that are exactly what you searched for first. In addition to that, often we want to sort by values with duplicates (download count, relevance) – so we need to add an additional sort expression at the end to make sure our results are deterministic.</p>
<p>Most of the places we use pagination today actually have non-deterministic ordering, which we need to fix. Some databases will refuse to execute a query without deterministic ordering, but in PostgreSQL it’s surprisingly easy to get this wrong by accident. With <code class="prettyprint">OFFSET</code>, your backend will generally (but is not guaranteed to) do the right thing. But with seek pagination, if your ordering has duplicates, you might end up with a “Next Page” link that displays the same results as the previous one.</p>
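<p>A quick illustration of the duplicate problem, again with an SQLite stand-in schema. If every row ties on the sort column and we seek on that column alone, the rest of the result set is unreachable; adding the primary key as a tiebreaker fixes it:</p>

```python
# Why non-deterministic ordering breaks seek pagination (illustrative schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crates (id INTEGER PRIMARY KEY, downloads INTEGER)")
conn.executemany("INSERT INTO crates VALUES (?, ?)", [(i, 50) for i in range(1, 7)])

# Page 1 is ids 1, 2, 3. Seeking on downloads alone finds nothing after it:
naive_page2 = conn.execute(
    "SELECT id FROM crates WHERE downloads < 50 ORDER BY downloads DESC LIMIT 3"
).fetchall()
print(naive_page2)   # [] -- every remaining row ties with the cursor value

# With id as a tiebreaker, the ordering is deterministic and seekable:
fixed_page2 = conn.execute(
    """SELECT id FROM crates
       WHERE downloads < 50 OR (downloads = 50 AND id > 3)
       ORDER BY downloads DESC, id ASC LIMIT 3"""
).fetchall()
print(fixed_page2)   # [(4,), (5,), (6,)]
```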
<p>So we need something that can handle multiple columns. For crates.io, we use PostgreSQL. <a href="https://www.postgresql.org/docs/current/rowtypes.html">PG supports composite values</a>, which we could use to support multiple columns without too much additional complexity. Our query to sort by all time downloads with a search term might look like this:</p>
<pre><code class="prettyprint lang-sql">WHERE (
canon_crate_name(name) = canon_crate_name('foo'),
crates.downloads,
id
) < (SELECT
canon_crate_name(name) = canon_crate_name('foo'),
crates.downloads,
id
FROM crates WHERE id = 1234 LIMIT 1
)
ORDER BY
canon_crate_name(name) = canon_crate_name('foo') DESC,
crates.downloads DESC,
id DESC
LIMIT 100
</code></pre>
<p>This approach is more complex, but it still seems fairly reasonable either to implement as a generic abstraction or to write by hand. But it still has problems. The first is that using a tuple means we have to use the same comparison operator for every sort expression, so this can’t support order clauses that mix directions. That rules out row comparison here; we’ll need to fall back to writing out each condition individually. So our query will end up looking like:</p>
<pre><code class="prettyprint lang-sql">WITH order_row AS (
SELECT
canon_crate_name(name) = canon_crate_name('foo') AS matches,
downloads
FROM crates
WHERE id = 1234
LIMIT 1
)
SELECT ... FROM crates ...
WHERE
(canon_crate_name(name) = canon_crate_name('foo')) <
(SELECT matches FROM order_row LIMIT 1)
OR (
(canon_crate_name(name) = canon_crate_name('foo')) =
(SELECT matches FROM order_row)
AND
downloads < (SELECT downloads FROM order_row)
) OR (
(canon_crate_name(name) = canon_crate_name('foo')) =
(SELECT matches FROM order_row)
AND
downloads = (SELECT downloads FROM order_row)
AND
id > 1234
)
ORDER BY
canon_crate_name(name) = canon_crate_name('foo') DESC,
downloads DESC,
id ASC
LIMIT 100
</code></pre>
<p>I’ve swapped the order of <code class="prettyprint">id</code> for the sake of example. Now we can support arbitrary directions, since each part of the where clause is written out explicitly. With something this complex, we <em>must</em> be done, right? Unfortunately, even this is still not enough.</p>
<h1 id="don39t-forget-about-null_1">Don’t Forget About Null <a class="head_anchor" href="#don39t-forget-about-null_1">#</a>
</h1>
<p>Every example I’ve given so far falls apart as soon as <code class="prettyprint">NULL</code> is involved. Even the articles I’ve seen that discuss multi-column ordering assume that no value is <code class="prettyprint">NULL</code>, and this completely throws a wrench into what we’ve written so far. The issue isn’t even that the ordering will be wrong: records with <code class="prettyprint">NULL</code> in one of the sort expressions won’t show up at all.</p>
<p>In SQL, <code class="prettyprint">NULL</code> behaves much like <code class="prettyprint">NaN</code> does for floats. Passing <code class="prettyprint">NULL</code> to most operators returns <code class="prettyprint">NULL</code>: <code class="prettyprint">anything = NULL</code> is <code class="prettyprint">NULL</code>, <code class="prettyprint">anything > NULL</code> is <code class="prettyprint">NULL</code>, and <code class="prettyprint">anything AND NULL</code> is <code class="prettyprint">NULL</code> (unless the other operand decides the result on its own, as with <code class="prettyprint">FALSE AND NULL</code> or <code class="prettyprint">TRUE OR NULL</code>). Since <code class="prettyprint">NULL</code> is considered falsey when filtering, any record with <code class="prettyprint">NULL</code> in any column considered by the order clause will not get returned.</p>
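<p>This behavior is easy to see from any SQL prompt. A small demonstration using SQLite from Python (which follows the same three-valued logic), with a throwaway table:</p>

```python
# NULL propagates through comparisons, and WHERE treats NULL as false,
# so rows with a NULL sort key land on neither side of the cursor.
import sqlite3

conn = sqlite3.connect(":memory:")
print(conn.execute("SELECT 1 > NULL").fetchone()[0])     # None (i.e. NULL)
print(conn.execute("SELECT NULL = NULL").fetchone()[0])  # None, not true

conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, downloads INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, None), (3, 5)])

below = conn.execute("SELECT id FROM t WHERE downloads < 7").fetchall()
above = conn.execute("SELECT id FROM t WHERE downloads >= 7").fetchall()
print(below, above)  # [(3,)] [(1,)] -- row 2 appears in neither result
```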
<p>This gets even more complicated if you want to support <em>any</em> order clause, which means dealing with <code class="prettyprint">NULLS FIRST</code> and <code class="prettyprint">NULLS LAST</code> (for reference, <code class="prettyprint">ASC</code> is equivalent to <code class="prettyprint">ASC NULLS LAST</code>, and <code class="prettyprint">DESC</code> is equivalent to <code class="prettyprint">DESC NULLS FIRST</code>).</p>
<p>Luckily PostgreSQL has some operators that will help us deal with at least a bit of this complexity. We can replace all uses of <code class="prettyprint">=</code> with <code class="prettyprint">IS NOT DISTINCT FROM</code>, which behaves exactly like <code class="prettyprint">=</code> except it doesn’t propagate <code class="prettyprint">NULL</code>. For comparison operators, we’ll need to manually check for <code class="prettyprint">NULL</code> to get the correct behavior, since <code class="prettyprint">1 > NULL IS NOT DISTINCT FROM NULL > 1</code>.</p>
<table>
<thead>
<tr>
<th>Direction</th>
<th>Nulls</th>
<th>Operator</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="prettyprint">ASC</code></td>
<td><code class="prettyprint">FIRST</code></td>
<td><code class="prettyprint">x IS NOT NULL AND x > y IS NOT FALSE</code></td>
</tr>
<tr>
<td><code class="prettyprint">ASC</code></td>
<td><code class="prettyprint">LAST</code></td>
<td><code class="prettyprint">x IS NULL OR x > y</code></td>
</tr>
<tr>
<td><code class="prettyprint">DESC</code></td>
<td><code class="prettyprint">FIRST</code></td>
<td><code class="prettyprint">x IS NOT NULL AND x < y IS NOT FALSE</code></td>
</tr>
<tr>
<td><code class="prettyprint">DESC</code></td>
<td><code class="prettyprint">LAST</code></td>
<td><code class="prettyprint">x IS NULL OR x < y</code></td>
</tr>
</tbody>
</table>
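<p>To convince yourself the table is right, the four operators can be checked exhaustively in plain Python, with <code class="prettyprint">None</code> standing in for <code class="prettyprint">NULL</code> and a <code class="prettyprint">NULL</code> comparison result treated as false, just as a <code class="prettyprint">WHERE</code> clause would treat it:</p>

```python
# Each helper returns True exactly when x sorts strictly after the cursor
# value y under the given direction and NULLS placement. None models NULL;
# a NULL comparison result counts as false, as in a WHERE clause.
def asc_nulls_first(x, y):   # x IS NOT NULL AND x > y IS NOT FALSE
    return x is not None and (y is None or x > y)

def asc_nulls_last(x, y):    # x IS NULL OR x > y
    return x is None or (y is not None and x > y)

def desc_nulls_first(x, y):  # x IS NOT NULL AND x < y IS NOT FALSE
    return x is not None and (y is None or x < y)

def desc_nulls_last(x, y):   # x IS NULL OR x < y
    return x is None or (y is not None and x < y)

def ordered(vals, desc=False, nulls_first=False):
    nulls = [v for v in vals if v is None]
    rest = sorted((v for v in vals if v is not None), reverse=desc)
    return nulls + rest if nulls_first else rest + nulls

values = [3, None, 1, 2]
cases = [
    (asc_nulls_first, False, True), (asc_nulls_last, False, False),
    (desc_nulls_first, True, True), (desc_nulls_last, True, False),
]
for pred, desc, nf in cases:
    order = ordered(values, desc=desc, nulls_first=nf)
    for i, y in enumerate(order):
        for j, x in enumerate(order):
            if i != j:
                # pred is true exactly for the values positioned after y
                assert pred(x, y) == (j > i), (pred.__name__, x, y)
print("all four operators match their orderings")
```

(Ties between two <code class="prettyprint">NULL</code> values are left to the tiebreaker column, so they don’t arise with these distinct sample values.)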
<p>So our final, actually working query would look something like:</p>
<pre><code class="prettyprint lang-sql">WITH order_row AS (
SELECT
canon_crate_name(name) = canon_crate_name('foo') AS matches,
downloads
FROM crates
WHERE id = 1234
LIMIT 1
)
SELECT ... FROM crates ...
WHERE
(canon_crate_name(name) = canon_crate_name('foo')) <
(SELECT matches FROM order_row LIMIT 1)
OR (
(canon_crate_name(name) = canon_crate_name('foo')) =
(SELECT matches FROM order_row)
AND (
downloads IS NOT NULL
AND
downloads < (SELECT downloads FROM order_row) IS NOT FALSE
)
) OR (
(canon_crate_name(name) = canon_crate_name('foo')) =
(SELECT matches FROM order_row)
AND
downloads IS NOT DISTINCT FROM (SELECT downloads FROM order_row)
AND
id > 1234
)
ORDER BY
canon_crate_name(name) = canon_crate_name('foo') DESC,
downloads DESC,
id ASC
LIMIT 100
</code></pre>
<p>Of course that query was written manually, so I can omit null checks for columns I know are not null, inline values that I know can be inlined, and use <a href="https://www.postgresql.org/docs/current/queries-with.html">a CTE</a> for readability. However, the goal here is to generate this from an abstraction which handles arbitrary where clauses. So the actual query <a href="https://gist.github.com/sgrif/cf3de5b988d417851f3e0d6a2a3cd269">is a good bit more verbose</a>. That query, while much harder to read, will have the same performance characteristics<sup id="fnref8"><a href="#fn8">8</a></sup>.</p>
<p>This is no longer in the realm of something you can reasonably write by hand. A generic abstraction is absolutely required here, especially with the branching, interacting orderings that crates.io uses for search.</p>
<h1 id="could-this-be-easier_1">Could This Be Easier? <a class="head_anchor" href="#could-this-be-easier_1">#</a>
</h1>
<p>I was really surprised that I ended up needing to write this by hand. What I want is a function that takes some arbitrary <code class="prettyprint">ORDER BY</code> clause and returns a value that implements the comparison operators accordingly. Ideally I’d write:</p>
<pre><code class="prettyprint lang-sql">WHERE ORDERING(
canon_crate_name(name) = canon_crate_name('foo') DESC,
downloads DESC,
id ASC
) > (SELECT ORDERING(
canon_crate_name(name) = canon_crate_name('foo') DESC,
downloads DESC,
id ASC
) FROM crates WHERE id = 1234 LIMIT 1)
</code></pre>
<p>This is the sort of feature that the authors of PostgreSQL seem to always think of before I know I need it, so I was surprised to see that something similar didn’t already exist. Unfortunately this would be hard to implement without native support, since proper index usage is critical here<sup id="fnref9"><a href="#fn9">9</a></sup>.</p>
<p>A big part of the issue here is that there’s very little tooling for this form of pagination. Very few pagination libraries even support it at all. If they do, it often only handles single column ordering or treats it as an afterthought. As time goes on, the need for more scalable pagination is going to become more common, and this needs better support across the board. Arguably, it should even be the default.</p>
<p>Thankfully with <a href="https://diesel.rs">Diesel</a>, an implementation of this isn’t too unreasonable. Since everything is represented in the type system, we just need an <code class="prettyprint">order_by</code> function that takes one of <code class="prettyprint">Asc</code>, <code class="prettyprint">Desc</code>, <code class="prettyprint">NullsFirst</code>, or <code class="prettyprint">NullsLast</code> (for simplicity of implementation, this will probably require explicitly calling <code class="prettyprint">.asc()</code> instead of passing a column directly). There’s still room for logic errors to sneak in, but we can lean on the type system quite a bit to help keep the complexity manageable.</p>
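<p>To give a flavor of what such an abstraction has to generate, here is a hypothetical sketch in Python that builds the predicate as a string from an order specification. Every name here (<code class="prettyprint">seek_predicate</code>, the tuple format) is invented for illustration; a real implementation like the Diesel one would build typed query AST nodes rather than concatenating SQL:</p>

```python
# Hypothetical sketch: generate the seek predicate from an ORDER BY spec.
# The comparison templates follow the operator table from this post.
OPS = {
    ("ASC", "FIRST"):  "({x} IS NOT NULL AND {x} > {y} IS NOT FALSE)",
    ("ASC", "LAST"):   "({x} IS NULL OR {x} > {y})",
    ("DESC", "FIRST"): "({x} IS NOT NULL AND {x} < {y} IS NOT FALSE)",
    ("DESC", "LAST"):  "({x} IS NULL OR {x} < {y})",
}

def seek_predicate(order_by):
    """order_by: list of (expr, cursor_expr, direction, nulls) tuples."""
    branches = []
    for i, (expr, cursor, direction, nulls) in enumerate(order_by):
        # NULL-safe equality on every earlier column, strict comparison here.
        parts = [f"({e} IS NOT DISTINCT FROM {c})" for e, c, _, _ in order_by[:i]]
        parts.append(OPS[direction, nulls].format(x=expr, y=cursor))
        branches.append(" AND ".join(parts))
    return "(" + ") OR (".join(branches) + ")"

sql = seek_predicate([
    ("matches",   "(SELECT matches FROM order_row)",   "DESC", "FIRST"),
    ("downloads", "(SELECT downloads FROM order_row)", "DESC", "FIRST"),
    ("id",        "1234",                              "ASC",  "LAST"),
])
print(sql)
```

A string-building version like this is fine for seeing the shape of the output, but it offers none of the safety the type-system approach gives you.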
<p>If you start to look closely you might see that a lot of folks who serve large data sets and use numbered pagination… Actually don’t. For example, Google still has numbered pages. But they’ll cut you off long before you get to the point where there would be performance issues (they only serve 1k records total per query).</p>
<p>Seek based pagination isn’t without its drawbacks. One thing that is much harder with this form of pagination is detecting the first/last page. You could detect if you’re on the last page by loading N+1 records. Detecting the first page could be done similarly, but I don’t think either of those is worth the complexity. If you got any pagination parameters at all, you can assume you’re not on the first page. If you got <code class="prettyprint">per_page</code> records back, you’re not on the last page. This means that if your total number of records is divisible by your page size, you’ll have an empty last page. And if someone clicks “Previous Page” from page 2, they might have a clickable “Previous Page” link that goes nowhere (but really, do you even need “Previous Page”? Browsers have a back button). Neither of these comes up in practice for human users often enough to be concerning. Bots aren’t crawling backwards, and they can deal with an empty last page.</p>
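<p>For the cases where you do want last-page detection, the N+1 trick itself is tiny. A sketch, with a hypothetical helper name:</p>

```python
# Fetch one row more than the page size; the extra row's presence tells
# you whether a "Next Page" link is warranted. `paginate` is a
# hypothetical helper; any row source works.
PER_PAGE = 100

def paginate(rows_plus_one, per_page=PER_PAGE):
    """Split per_page + 1 fetched rows into (page, has_next)."""
    has_next = len(rows_plus_one) > per_page
    return rows_plus_one[:per_page], has_next

page, has_next = paginate(list(range(101)))
print(len(page), has_next)   # 100 True
page, has_next = paginate(list(range(42)))
print(len(page), has_next)   # 42 False
```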
<p>If you’re paginating data in 2019, consider whether you really need numbered page links. Are they actually useful? There are absolutely some use cases where they’re necessary, but I think these are mostly niche and typically for internal systems that can afford the performance tradeoffs. Often, when folks go to numbered pages directly, they’re doing it as a proxy for “seek X% into the data”, which is something we can provide better UI for, and implement efficiently in ways other than <code class="prettyprint">OFFSET</code>.</p>
<div class="footnotes">
<hr>
<ol>
<li id="fn1">
<p>It’s arguably super-linear, but Big-O isn’t really helpful when the changes in performance characteristics are based on in-memory vs on-disk vs disk seeking. Just know that traversing an index linearly is much more expensive than traversing a table linearly, and how much more expensive is based on configuration and hardware. <a href="#fnref1">↩</a></p>
</li>
<li id="fn2">
<p>Technically BRIN indexes support the operators that would be used for ordering, but they aren’t useful for <code class="prettyprint">ORDER BY</code> clauses. A BRIN index stores the minimum and maximum values for each range of heap pages, so some page ranges can be skipped when filtering. This means it will only be used when those operators appear directly in a <code class="prettyprint">WHERE</code> clause. Though a BRIN index combined with the methods described in this post would create an interesting scenario where fetching the last page of a result set is nearly instant, but the first page is <code class="prettyprint">O(n log(n))</code> over the size of the table. <a href="#fnref2">↩</a></p>
</li>
<li id="fn3">
<p>You might think we could work around this by having each node store how many children it has. But that’s not a question that can be answered usefully for a database, since different transactions have different answers to that question. It’s not just a question of traversal, we also have to check if each node is actually visible to our transaction or not. For more details, see <a href="https://wiki.postgresql.org/wiki/Slow_Counting">why counting in PostgreSQL is slow</a>. <a href="#fnref3">↩</a></p>
</li>
<li id="fn4">
<p>The details of how much slower this is depends on your shared buffer cache size, and the storage hardware you’re using. If you’re using a traditional disk drive, random access of rows potentially means many extra disk seeks, which is the absolute slowest way to access data on the same physical machine. If you’re using a solid state drive, random page access is going to be the same cost as sequential page access, assuming you would have read both from disk otherwise (which is not necessarily true). <a href="#fnref4">↩</a></p>
</li>
<li id="fn5">
<p>Depending on your local configuration and what type of storage you’re using, the exact point at which the query planner stops using the index may not be the actual point that the query becomes faster from a seq scan. If you’re using an SSD, setting <code class="prettyprint">random_page_cost</code> to <code class="prettyprint">1.0</code> will make it much more accurate in that prediction. Even if you’re using an SSD though, you will still see a linear increase in time based on the value of <code class="prettyprint">OFFSET</code>, and there is still eventually going to be a point where it’s faster to not use the index at all. If you’re at the point where <code class="prettyprint">OFFSET</code> is adding tens of milliseconds to your query, it’s time to look at alternatives regardless of whether your index can be used or not. <a href="#fnref5">↩</a></p>
</li>
<li id="fn6">
<p>Keyset based pagination implies that this is being used only when ordering by the primary key. Which is perhaps a very common case, but not relevant to the strategy as a whole, nor does that situation apply to me. Cursor based pagination… Well cursors are a thing in SQL, and if you’re writing a desktop application where you have a direct DB connection, it’s probably the best way to do pagination. But using cursors for this requires the same database connection be used for all subsequent requests, which doesn’t really work for web apps being served over HTTP. Unfortunately <a href="https://slack.engineering/evolving-api-pagination-at-slack-1c1f644f8e12">cursor is a bit of an overloaded term in this case</a>. <a href="#fnref6">↩</a></p>
</li>
<li id="fn7">
<p>For example, an API endpoint meant to give the most recent records/changes would benefit from being able to provide a timestamp instead of an ID, so clients can tell you the last time they asked without having to track the last ID they got back. <a href="#fnref7">↩</a></p>
</li>
<li id="fn8">
<p>Technically it’s marginally slower, since each of those subselects is going to be executed separately. Keep in mind that there’s no round trip time here though, and the actual execution time of “select 1 row by its primary key” is around 10 microseconds. So the query is roughly 0.05 ms slower than the CTE version, and will scale roughly linearly with the number of joins in the from clause. That seems acceptable. <a href="#fnref8">↩</a></p>
</li>
<li id="fn9">
<p>Yes, I’m aware that the first piece of the order by clause I’ve used in every example will prevent index usage. It’s not relevant to the point I’m making. <a href="#fnref9">↩</a></p>
</li>
</ol>
</div>
tag:blog.sagetheprogrammer.com,2014:Post/moving-on-from-rails-and-whats-next2019-04-02T11:43:23-07:002019-04-02T11:43:23-07:00Moving on from Rails and what's next<p>It’s been more than 6 years since my first commit to Ruby on Rails. I had just gotten my first full time Ruby position, was excited to move away from PHP, and wanted to give back. Since then I made 1452 commits to the project. Today, I am finally ready to move on from Rails.</p>
<p>In 2014 I started working on the <a href="https://api.rubyonrails.org/classes/ActiveRecord/Attributes/ClassMethods.html#method-i-attribute">Active Record Attributes API</a>. We got to a working implementation fairly quickly, but it took many more months and thousands of lines of code to get to an implementation that I was comfortable shipping. By the time Rails 4.2 came out, I had ended up rewriting a significant portion of the library, which meant I was spending more and more of my time maintaining that code and fixing other issues.</p>
<p>Around the time that 4.2 was entering beta, I decided it was time to try and take this commitment full time. I spent the next 4 years devoting as much of my professional time as possible to the success of Rails, supported by wonderful companies like thoughtbot and Shopify.</p>
<p>A lot has happened during that time. I created <a href="https://diesel.rs">Diesel</a>, an ORM for Rust. In April of last year, I began managing the operations of crates.io, which eventually led to the creation of the crates.io team which I co-lead. I also started to find myself less able to effectively contribute to Rails. It became clear that I have a different vision for the future, and that I would never make it onto the core team.</p>
<p>In October I left Shopify, and stopped working on Rails. Stepping away from Rails was a difficult decision for me. For a long time, “Rails committer” was a big part of my personal identity. Most of my close friends came from the Ruby community. I stared at this confirmation for a lot longer than I should have. <a href="https://svbtleusercontent.com/q3bj5powRfBQDkYPZWTP8N0xspap.png"><img src="https://svbtleusercontent.com/q3bj5powRfBQDkYPZWTP8N0xspap_small.png" alt="Screen Shot 2019-04-01 at 5.10.55 PM.png"></a></p>
<p>The plan was originally that I’d spend a few months focusing on crates.io full time to work through our backlog before finding a new job and cutting my open source work back. It became clear that a few months wasn’t going to cut it, and I’m in a position where I can meaningfully contribute to the Rust organization. So my job search ended, and I decided to figure out a way to focus on Rust full time.</p>
<p>The problem is that working on MIT/Apache licensed software doesn’t exactly help pay the bills. That’s why I’m asking for your help. I’ve spent the last 5 years having a single company sponsor my open source work. This time I’m going to try something different. Right now my goal is to get a handful of medium sized grants from larger companies to support my work on crates.io. If you work for a company that might be interested in helping sponsor me, please <a href="mailto:sean@seantheprogrammer.com">reach out</a>.</p>
<p>If you’re interested in contributing in a smaller fashion, I’ve also started a Patreon, which you can find <a href="https://www.patreon.com/seantheprogrammer">here</a>. My goal is to devote 100% of my time to Rust, but I’m available for part time contract work in order to make ends meet as needed.</p>
<p>This is an exciting new chapter in my life, and I’m excited to see how it works out. Thank you to everyone who is helping support me in this goal.</p>