No description
Find a file
2025-10-24 11:18:40 +01:00
.github/workflows Remove test-all-features on release 2025-10-24 02:29:05 +01:00
yaag Rename crate to yaag 2025-10-24 11:18:17 +01:00
yaag_proc_macro Rename crate to yaag 2025-10-24 11:18:17 +01:00
.gitignore Ignore local act files 2025-10-24 02:21:08 +01:00
Cargo.lock [Cargo.lock] 2025-10-24 11:18:40 +01:00
Cargo.toml Rename crate to yaag 2025-10-24 11:18:17 +01:00
README.md Minor README fixes 2025-10-24 02:44:08 +01:00

Yet another generators concept

(or how to abuse the compiler for profit)

What are generators

You might've seen those in Python:

def gen():
    for i in range(10):
        yield i

or in JS:

function* gen() {
    for (let i = 0; i < 10; i++) {
        yield i;
    }
}

They are really convenient at implementing procedures that must be performed one step at the time, preserving it's state between the iterations. For example, one might define iterator chain with them:

def chain(a, b) {
    for e in a:
        yield e
    for e in b:
        yield e
}

But if you ever tried to do the same in Rust, you should've encountered the error:

fn generator() -> i32 {
    for i in 0..10 {
        yield i;
    }
}
error[E0658]: yield syntax is experimental
  --> your_lib/src/lib.rs
   |         yield i;
   |         ^^^^^^^
   = note: see issue #43122 <https://github.com/rust-lang/rust/issues/43122> for more information

I guess, we'll have to wait... or

Implementing generators

There is a bunch of crates implementing generators in Rust. If safe generator is what you need, you better off somewhere else.

The following implementation does use unsafe under the hood with possibly-terrible consequences if the implementation turns out to be incorrect. So I would appreciate help with testing and better unsafe blocks explanation.

Misusing the async to yield

async blocks were designed to create futures. But the state machine they generate is also desirable in case of generators. The only difference being Poll replaced by Generate (or similar):

enum Poll<T> {
    Ready(T),
    Pending,
}
enum Generate<T> {
    Yield(T),
    Done,
}

Well, we can't just replace Future::poll output type, so that's a no-go. But what if we could still pass the yielded value out? Hmmmm...

impl Waker {
    pub const unsafe fn new(
        data: *const (),
        vtable: &'static RawWakerVTable,
    ) -> Waker;

    pub fn data(&self) -> *const ();
}

Now, I have no clue why Waker::data is even a thing, but that's exactly what we need. Here's the rundown:

  • Replace all the yield <expr> with awaits of a special Yield future created with <expr>
  • Convert generator's body into an async block, now referred to as "generator's future"
  • (!!!) Make sure the are NO other await points. Any "usual" await point would immediately trigger UB
  • When the generator is asked for the next element:
    • a temporary output: MaybeUninit<Output> is defined to potentially store the result
    • a special Waker is constructed, with data pointing to output
    • Generator's future is polled with the Context providing the aforementioned Waker
  • Once the await point on the Yield future is reached, it will:
    • take out the <expr> stored inside it
    • write it to the data pointer of the provided Waker
  • If the generator's future poll had returned Poll::Pending, it means that Yield was encountered along the way, and it must have put an element to the output we created. So the output is assumed init and returned out of the generator
  • If poll returned Poll::Ready(()), generator must have reached it's end, and None is returned to the caller

And there you have it -- generators in Rust! Now, there are a couple problems to address...

  1. Yield is hilariously unsafe. Like, writing whatever you have into a random *const () pointer is not particularly safe. Yield must only be used in the code generated by this crate's macro
  2. Memory aliasing -- no reference to output should exist while poll is being called
  3. Dangling pointers -- the special Waker we create should NEVER be passed to normal kind of futures. If does no actually "wake" anything and contains a pointer that will only be valid for a short time after the poll returns
  4. Send and/or Sync are actually the easiest one -- the generator object we create, contains the generated future, so Send and Sync implementations are for compiler to figure out
  5. Pinning. Futures require pinning so that they could reference it's own state. Obviously, the struct we create here must also require this, and should not violate contained future's pin guarantee

Is that it? Have other concerns? Let me know!

Misusing the async to await

But wouldn't it be nice to have the ability to await for something between the yields. Like, maybe you want to asyncronously load some files and then yield the contents as they become available?

Well, with the approach outlined above -- it's actually trivial! All we need a different kind of special future, say... Await. Here's the gist (note how similar it is to the non-async case!):

  • Replace all the yield <expr> with awaits of a special Yield future created with <expr>
  • Replace all the <expr>.await with awaits of a special Await future created with <expr>
  • Convert generator's body into an async block, now referred to as "async generator's future"
  • (!!!) Make sure the are NO other await points. Any "usual" await point would immediately trigger UB
  • When the async generator is polled for the next element:
    • a temporary state: (Option<Output>, Waker) is created to both potentially store the result AND to store the "normal" Waker
    • a special Waker is constructed, with data pointing to state
    • Generator's future is polled with the Context providing the aforementioned Waker
  • Once the await point on the Yield future is reached, it will:
    • take out the <expr> stored inside it
    • write it to the Option<Output> part of the state pointed to by data pointer of the provided Waker
  • Once the await point on the Await future is reached, it will:
    • get the Pin<&mut _> to the future it wraps
    • obtain a Waker from the state pointed to by data pointer of the provided Waker
    • poll the wrapped future with the "real" waker
  • If the generator's future poll had returned Poll::Pending, it means that some future was awaited along thee way:
    • If Option<Output> contains the value, then Yield must have put it there. That's the next item, so we return Some(Poll::Ready(output))
    • If Option<Output> does not contain a value, then Await was encountered. Next item is pending, so we return Some(Poll::Pending)
  • If poll returned Poll::Ready(()), generator must have reached it's end, and None is returned to the caller