
I found some spare time this past week and sat down with a nice brew and Jon Gjengset's excellent Crust of Rust video on declarative macros. For the longest time, macros have felt like 'that last part of Rust that I haven't gotten around to checking out'. I've had a vague notion of what they are, but have never quite gotten around to exploring them. However, Gjengset's video served as a perfect introduction to declarative macros, and was just enough to get me started.

One thing mentioned in the video that I had never thought about before is that one of the simplest things to do with macros is simple substitution. In fact, that's all a declarative macro can do: given some input, it expands to a block of code. That suddenly gave me an idea for writing my own first macro.

Post purpose

This post is intended to be a very brief and basic introduction to declarative macros based on what I have found in the past week. For more comprehensive material, see the 'Further Reading' section at the end.

The post describes one very simple use case for macros, and that's all it's intended to do. In particular, this post will not discuss

macro syntax
There will be no talk of macro syntax, of capture kinds and patterns, or of clever ways to count.
other use cases
This is not an exploration of all the ways in which you can use declarative macros or where they shine. This is a description of one case that solved a problem that's been irking me.
proc macros
Proc macros are a different subject, and something I don't know much (or anything, really) about. Both are forms of metaprogramming, but from what I understand, proc macros are quite a bit more complex than declarative macros (and thus more powerful), so they're best left for later.

I assume a basic level of familiarity with Rust, but a deep understanding is not required.

My Little Macro 🦄

I was very excited when advanced slice patterns were stabilized in Rust 1.42. Among the things I'd been looking forward to was the ability to match on strings as if they were a slice of characters, similar to what you might do in Haskell or Elm. It wasn't immediately obvious how to do it, but I figured something out in the end[1]:

    fn f(s: &str) {
        match &s.chars().collect::<Vec<char>>() as &[char] {
            ['💘', .., '🦄'] => println!("<3 and horse"),
            ['💘', snd, .., '😪'] => println!("Love, {}, and sleeps", snd),
            ['💘', ..] => println!("Just <3"),
            _ => {}
        }
    }

However, it's not immediately obvious what's happening on line 2: what exactly does &s.chars().collect::<Vec<char>>() as &[char] mean? Sure, I can tell you that it turns the string into a char slice for matching, but as it stands, it's quite the mouthful. Let's write a macro to make this cleaner and clearer!
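To unpack the incantation piece by piece, here's a minimal sketch of the same conversion written out with intermediate bindings (the function name is made up for illustration):

```rust
// A step-by-step version of `&s.chars().collect::<Vec<char>>() as &[char]`.
fn to_char_slice_demo(s: &str) -> usize {
    // 1. `chars()` yields an iterator over the string's characters.
    // 2. `collect::<Vec<char>>()` gathers them into an owned vector.
    let v: Vec<char> = s.chars().collect();
    // 3. Casting `&Vec<char>` to `&[char]` gives us a slice,
    //    which is what slice patterns can match against.
    let slice: &[char] = &v as &[char];
    slice.len()
}

fn main() {
    // Three characters, regardless of how many bytes they occupy.
    assert_eq!(to_char_slice_demo("abc"), 3);
    println!("ok");
}
```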

Replacement as a form of abstraction

Because a declarative macro is nothing but text substitution[2], we should be able to simply abstract away the pesky line from above. Instead, we want to write something like this:

    fn f(s: &str) {
        match chars!(s) {
            ['💘', .., '🦄'] => println!("<3 and horse"),
            ['💘', snd, .., '😪'] => println!("Love, {}, and sleeps", snd),
            ['💘', ..] => println!("Just <3"),
            _ => {}
        }
    }

To do this, we write a very simple macro with a single pattern:

    macro_rules! chars {
        ($s:expr) => {
            &$s.chars().collect::<Vec<char>>() as &[char]
        };
    }

All it does is replace the macro call (chars!(s)) with the long incantation (&s.chars().collect::<Vec<char>>() as &[char]) at compile time. It's incredibly simple, but it's also incredibly powerful, and it makes the code both less cluttered and easier to read.
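As a bonus, because the macro captures an expr, it isn't limited to plain variables: any expression with a suitable chars method will do. A small, self-contained sketch (the greeting helper is made up for the example):

```rust
macro_rules! chars {
    ($s:expr) => {
        &$s.chars().collect::<Vec<char>>() as &[char]
    };
}

// A made-up helper to show that arbitrary expressions work too.
fn greeting() -> String {
    String::from("hi!")
}

fn main() {
    // A string literal...
    assert!(matches!(chars!("abc"), ['a', ..]));
    // ...and a function call returning a String both expand fine.
    assert!(matches!(chars!(greeting()), [.., '!']));
    println!("both matched");
}
```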

Further reading

If you want to know more about macros, here are some good resources to continue your journey:

Jon Gjengset's Crust of Rust: Declarative Macros
Gjengset spends roughly 90 minutes explaining and demonstrating macros in a very clear fashion by creating a macro that has the same functionality as the standard library's vec! macro. If you're looking for an introduction to what declarative macros are, I absolutely recommend you watch this.
The Little Book of Rust Macros
Gjengset mentions this as a resource in his Crust of Rust video. It's a thorough intro to macros and includes everything from syntax to patterns to macro building blocks and a guide on how to implement esoteric languages using only macros.
The Book, chapter 19.06
As always, the Book is a valuable resource on all things Rust. Compared to the previous items on the list, this is much shorter, but it provides a strong high-level overview.
The Rust Reference on macros
Probably the densest resource on this list, the Rust Reference provides a short, yet comprehensive reference on macros. If you need to quickly look something up, this is a good bet.

Footnotes

1. Iterating over a string and collecting it into a vector of characters is probably not the most efficient way to work with strings, but it's the best way I've found to turn a string into a slice of characters (Unicode scalar values) that can be matched on. If you've got a better solution for this, let me know! I've been looking for a while.

2. Well, saying it's just text substitution might be misleading: because the expanded macro is parsed into the program's abstract syntax tree, it has to be valid Rust. See the first chapter of the Little Book of Rust Macros for more information.



Even if you have the source code in front of you, there are limits to what a human reader can absorb from thousands of lines of text designed primarily to function, not to convey meaning. --- Ellen Ullman (from quotes about software aging on literateprogramming.com)

I've been intrigued by literate programming for a while now, but never quite found the right opportunity to try it out. All the business with tangling and detangling, calling code from other snippets, and understanding how or even if you can import literate code into non-literate code, made it seem like you'd need to understand a lot just to get started. Fortunately, I found something that doesn't require any of the above: extending your Emacs config.

Using Org mode and literate programming to configure Emacs is something I've heard a lot about, but I found it difficult to find out exactly how to get started. Luckily it turned out to be pretty simple.

Below, I'll show you how to start configuring Emacs with Org mode and what I've learned about it so far, but if you're really eager to just get started, the trick is to use the function org-babel-load-file and give it the path to your Org file.

*Disclaimer*: I am /not/ an expert at literate programming, or even particularly knowledgeable; these are my first steps into this brave new world. If I've made any mistakes or if you've got tips: don't hesitate to reach out.

Who is this for?

This post is aimed at anyone interested in literate programming with Org mode who doesn't know where or how to get started. It assumes some familiarity with Emacs and Org mode, but no further programming knowledge is required. I also assume no prior knowledge of literate programming.

What is literate programming?

I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be works of literature. Hence, my title: "Literate Programming."

Those are the words of Donald Knuth, the creator of literate programming. Knuth calls for a form of programming where we shift our focus from instructing the computer what to do, to explaining to another person what we want the computer to do. This forms the basis for literate programming.

A literate source code file inverts the typical notion of a source code file: rather than being source code with comments strewn around, it is a text with source code blocks inserted. The exact file format and medium of the file don't matter. The most well-known literate programming tool[1] is probably the Jupyter Notebook.

Org mode

With Emacs, Org mode is arguably the most readily available way to do literate programming. Babel, which has been included in Org mode since version 7.0, enhances Org mode's source blocks by providing (as described in the introductory tutorial)

  • interactive and on-export execution of code blocks
  • code blocks as functions that can be parameterized, that can refer to other code blocks, and that can be called remotely
  • export to files for literate programming

In short, this enables you to write Org documents with source code blocks, where the source code blocks can be interacted with and used to generate pure source code.

Configuring Emacs

It turns out that if you're on a semi-recent version of Emacs, it's really very simple. Babel provides a function called org-babel-load-file which 'Load[s] Emacs Lisp source code blocks in the Org [file]'. As such, all you need to do is to call this function from your Emacs configuration with the path to the Org file you want to load.

Here's what I've got in my configuration:

    (org-babel-load-file
     (expand-file-name
      "config.org"
      user-emacs-directory))

This snippet assumes that the file you're loading is called config.org and is located in your user-emacs-directory (which defaults to ~/.emacs.d).
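To make this concrete, here's a minimal sketch of what such a config.org might contain (the heading and settings are made up for illustration). Only the emacs-lisp source blocks are extracted and evaluated; the surrounding prose is ignored:

```org
* Startup
  I prefer a quiet startup, so skip the splash screen.

  #+BEGIN_SRC emacs-lisp
    (setq inhibit-startup-message t)
  #+END_SRC
```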

This was tested with vanilla Emacs 26.3, so any recent version of Emacs shouldn't need any more configuration than this, but if you want a more detailed guide, check out the Emacs Initialization with Babel section of the Babel introduction.

Benefits of using Org mode for configuration

So far, I've found a number of benefits to using literate programming for my configuration, including (but not limited to):

I'm able to express /why/ I'm making certain configurations
Not that you can't describe what you're doing or why in pure elisp, but I certainly find it harder to be clear about it. Explaining the reasoning behind certain configuration decisions is not only useful to other people reading my configuration, but also to myself when I return to something weeks or months (or years) after having last touched it. 'Why did I configure it this way again? Oh, yeah: that's it!'

Furthermore, using a text-based mode to write text not only makes a lot of sense, but is also much more ergonomic and offers far more expressiveness than code comments.

Formatting
Another bonus of using a text-based mode is that you can take advantage of text formatting. For instance, being able to include links that only display a link text and not a full URL is a nice bonus. Lists (numbered, unnumbered, and definition lists) are also readily available.
Expanding and collapsing regions
If you want a quick overview of your configuration, it's easy to toggle the headings in the entire file to find exactly what you're looking for. It's also super easy to narrow your editor view to the section you're focusing on right now.
Using tags
Even better than being able to fold your config and scan through the headings is using tags. Org mode's support for tags makes it a breeze to show only sections that relate to whatever you're looking for. For instance: one of the tags I've defined is keybinding. By tagging every section that deals with key bindings with this, I can quickly whip up a view that shows all the relevant sections by using org-sparse-tree.
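As an illustration, a tagged section in the Org file might look like the sketch below (the heading and key binding are invented); running org-sparse-tree and matching on the keybinding tag would then reveal only sections tagged like it:

```org
* Window movement                                          :keybinding:
  Jumping between windows is something I do constantly.

  #+BEGIN_SRC emacs-lisp
    (global-set-key (kbd "M-o") 'other-window)
  #+END_SRC
```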

Summary

That sums up my journey into literate programming thus far. I still have a lot to learn and have probably only truly understood a fraction of the power that lies within Org mode, but I've gotten over that first step. It's not (as) scary anymore, and I'm looking forward to learning more about it.

If you came here looking for how to use Org mode to configure Emacs or to do literate programming, I hope you found what you were looking for. And if you came here wondering what this was all about, I hope your curiosity was rewarded and that you're still at least as curious or intrigued as before.

Further reading and references

I've included a few links below that could be useful or interesting. Most of them are already linked to in the text above, but they're included here again for convenience.

Wikipedia on Literate programming
As per usual, Wikipedia offers a concise and informative take on what literate programming is and where it comes from. In addition to the basics, it also includes a very nice illustration of literate programming by showing certain parts of the Unix word count utility wc written using literate programming techniques.
literateprogramming.com
A website about literate programming. There's not much information about the purpose of the website, nothing to be found about the author, and it looks like it hasn't been updated since 2009, but there are some resource links, and a great many quotes about literate programming and how programs tend to be underdocumented.
Babel: active code in Org-mode
This website includes links to the Babel reference documentation, the introductory tutorial, a journal paper describing the use of Org mode and Babel for literate programming and reproducible research, and more.
My literate Emacs config
This is the config I use at the time of writing. Note that it is not my complete Emacs configuration, but rather an addition to my main configuration. I'm in the process of (very slowly) moving away from Spacemacs and into my own configuration, so this document reflects that.

Footnotes

1. Or at least the only one that I hear mentioned at semi-regular intervals.


Building a request inspector

First pass: duct tape and string

I've been quite involved with distributed tracing and have spent a fair amount of time looking at the W3C recommendation for dealing with trace context at work lately. As a result of this, I have found myself wanting to inspect the headers on outbound requests to ensure that the framework and libraries we use handle tracing correctly.

But how do you do that? I couldn't find any services or command line utilities that did this (at least not simply), so I set out to build one myself. It's been a while since I did anything in Rust, and this sounded like a fun little weekend project. Spoiler alert: it wasn't.

I wanted this to be a short and simple tutorial on how to build such a server in Rust, but things didn't go quite as I planned, and it took much more time and effort than I expected. Rather than hiding it and pretending it never happened, though, I'm going to take it and run with it. I'm sure that if I'm running into these issues, I'm not the only one.

Intended audience

This post is intended for people who have at least some experience with Rust, including familiarity with the type system and the borrow checker (at least as concepts). You should also have some passing knowledge of HTTP requests.

This post is not intended to be a thorough tutorial or a list of best practices, but rather to serve as a demonstration of how I work. Yes, the code here works and runs as expected, and clippy doesn't complain, but it's not good.

There are code samples below, and you can also check out the repo on GitLab. However, this should not be considered a final version, and there will likely be further updates to the repo later on. This is also not an in-depth analysis of the code, but a short tour of it. In short: it works, but it's far from perfect. In a lot of ways, writing this code felt a lot like how the Oatmeal describes projects coming together in his fantastic comic 'Erasers are Wonderful': full of twists, turns, and toilet fires, and in the end you have something that's good enough.

The goal

I set out to make a simple web server that:

  • would accept requests at any endpoint
  • would accept requests with any method
  • would respond with a JSON object containing data about the request's:
    • headers
    • method
    • path
    • query string

I also wanted to add the request body (if there was one) to the response, but it wasn't the most important issue. Other additional features, such as reading data from environment variables, command line options, logging, etc., could be added later.
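In other words, a GET request to /some/path?q=1 might produce a response along these lines (the exact header set depends on the client; this shape is an assumption based on the goals above):

```json
{
  "headers": { "accept": "*/*", "host": "localhost:8080" },
  "method": "GET",
  "path": "/some/path",
  "queryString": "q=1"
}
```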

How (or: 'show me the code')

When building a web server in Rust, there are a number of frameworks to choose from. I first went with Actix, but (after not reading the docs) ended up working directly with Hyper, because it was easier to create a function that would handle any request at any route with any method. Or at least, that use case was covered in the initial tutorial.

In addition to Hyper, I'm also pulling in anyhow and serde_json for dealing with errors and working with JSON.

Below, we'll break the program up into functions and look at them one at a time.

Dependencies and imports

Let's get the boring (but very important) bits out of the way. Here's the dependencies section of the Cargo.toml file, as well as the program imports.

Dependencies:

      [dependencies]
      hyper = "0.13"
      tokio = { version = "0.2", features = ["full"] }
      serde_json = "1.0"
      anyhow = "1.0"

Imports:

     use anyhow::Result;
     use hyper::service::{make_service_fn, service_fn};
     use hyper::{Body, HeaderMap, Request, Response, Server};
     use serde_json::json;
     use std::collections::HashMap;
     use std::convert::Infallible;

The main function

     #[tokio::main]
     async fn main() -> Result<(), hyper::error::Error> {
         let make_svc = make_service_fn(|_| async { Ok::<_, Infallible>(service_fn(handle_requests)) });

         let addr = ([127, 0, 0, 1], 8080).into();

         let server = Server::bind(&addr).serve(make_svc);

         println!("Server started. Listening on http://{}", addr);

         server.await
     }

There are a few things happening here, but it's rather self-explanatory: we declare a handler and an address for the server, start the server with the aforementioned handler and address, and wait for it to finish (which happens on termination).

The request handler

     async fn handle_requests(req: Request<Body>) -> Result<Response<Body>> {
         let response_data = json!({
             "headers": to_string_map(req.headers()),
             "path": req.uri().path(),
             "queryString": req.uri().query(),
             "method": req.method().as_str(),
             "version": format!("{:?}", req.version()),
         }).to_string();

         println!("Received request: {:?}", response_data);

         Ok(Response::builder()
            .header("content-type", "application/json")
            .body(Body::from(response_data))?)
     }

This is the meat of the program and really what it's all about: extracting data from the request and returning it to the caller. As this is a very rough proof of concept, I'm mapping the data into a completely arbitrary JSON structure rather than into a struct.

After mapping, I print the result of the mapping, and return the response with an appropriate content-type.

Serializing the HeaderMap

Serde takes care of serializing most of the data very well, but doesn't like the HeaderMap that contains the request's headers. HeaderMap is a multimap (a map structure that can associate multiple values with a single key), and as such doesn't easily serialize to JSON.

To solve this, I decided to turn the HeaderMap into a HashMap<String, String>, simply creating a comma-separated string for headers that have multiple values. Not the most elegant or robust solution, but hey, it works.

Also, because header_value.to_str 'yields a &str slice if the HeaderValue only contains visible ASCII chars' (according to the docs), I fall back to "Non-ASCII header value" when it contains non-ASCII characters. Again: it works.

     fn to_string_map(headers: &HeaderMap) -> HashMap<String, String> {
         let mut map = HashMap::new();
         for (header_name, header_value) in headers.iter() {
             let k = header_name.as_str();
             let v = header_value
                 .to_str()
                 .unwrap_or("Non-ASCII header value")
                 .into();

             match map.get_mut(k) {
                 None => {
                     map.insert(k.into(), v);
                 }
                 Some(old_val) => *old_val = format!("{}, {}", old_val, v),
             }
         }

         map
     }

The unexpected challenges

So what made this so difficult? Why didn't it work out as I expected? Well, here are some of the issues I ran into:

The request body
As briefly mentioned up top, I originally wanted to include the request body in the response as well. I spent too much time trying to make this work before realizing that I should leave it out for now.

This turned out to be difficult because I couldn't easily parse the body as a String and include it in the output JSON. After a bit of thought, I realized that I can't just assume the body is JSON (or even a string), so it's more work than I expected.

Converting between different types
Related to the issues with the request body is conversion between different data types, and especially between types that are and aren't serializable by Serde. It felt like there was a lot of juggling types around just to please the compiler.
Manually serializing data types
While most of the data types can easily be represented as strings in this case, the header map needed some work. While it wasn't a very difficult exercise, it took more time than expected, especially because there wasn't an obvious way to perform an upsert-like action into a HashMap.
Lack of examples
This could be me or it could be the documentation, but I found it difficult to do what I wanted. I'd expected there to be more information on getting data from a request, but it's quite possible that I just didn't read far enough.
I'm ... /rusty/
It's been a while since I last worked with Rust, and the borrow-checker was stricter than I remember.
Working directly with Hyper?
I don't know whether this was much of an issue or not. It gave me quick and easy access to the endpoint setup I wanted, but it might have introduced other complications. That said, it looks as if Actix simply re-exports a lot of Hyper's data types, so I don't know how much of a difference that would have made.
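On the upsert point above: the standard library's HashMap entry API can express exactly that. Here's a sketch of the same comma-joining logic from to_string_map written with it (the sample header pairs are invented):

```rust
use std::collections::HashMap;

// Joins repeated keys into a single comma-separated value,
// like `to_string_map` does, but via the entry API.
fn join_values<'a>(
    pairs: impl IntoIterator<Item = (&'a str, &'a str)>,
) -> HashMap<String, String> {
    let mut map: HashMap<String, String> = HashMap::new();
    for (k, v) in pairs {
        map.entry(k.to_string())
            // If the key already exists, append the new value...
            .and_modify(|old| *old = format!("{}, {}", old, v))
            // ...otherwise insert it as-is.
            .or_insert_with(|| v.to_string());
    }
    map
}

fn main() {
    let headers = join_values([("accept", "text/html"), ("accept", "application/json")]);
    assert_eq!(headers["accept"], "text/html, application/json");
    println!("{:?}", headers);
}
```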

Wrapping up

Even if things didn't go exactly as planned, it was a fun, and at times very frustrating, little project. Having looked a little bit more at the Actix docs, I have found a few sections that make me think it could be quite suitable after all, so I'll probably rewrite the project some fourteen times in the coming week.

Next time I'll hopefully have something a bit more polished to show off.

Peace.