(Back to Shell Autocompletion)

This is a DRAFT. Don't circulate yet!

May 2019: Shellac Protocol Proposal V2

Shellac Protocol

Shellac is a protocol for shell-agnostic autocompletion. Shells and command line tools written in any language can communicate with each other.

Motivation

The status quo is that you can only expect upstream authors to maintain autocompletions for bash, the most popular shell in the world.

Shellac is a simple protocol aims to change this dynamic. The author of a CLI tool can easily implement it, and their completions will work in all shells that are Shellac clients.

The author of a shell can implement Shellac and get many common completions "for free". (These may be basic bash-style completions, or more elaborate zsh/fish style ones.).

In addition, existing corpuses of completion logic like the bash-completion project, the zsh core, and zsh-completions can be wrapped in this protocol, and reused by alternative shells like Oil or Elvish.

Overview

Roughly speaking, Shellac plays the same role for shells as the Language Server Protocol does for editors, but it looks more like CGI or FastCGI.

Shellac clients request completions, and Shellac servers provide them.

Rough Example 1

Let's use the example of busybox ash, which is derived from the dash code. I've heard some people complain that you have to use bash on Alpine Linux to get completions, because ash/dash have no support for it. The Shellac protocol potentially provides a migration path out of that situation.

Type this in ash:

$ git --git-dir . a<TAB>

ash will act as a Shellac client. It forms a request that looks something like this (encoding to be discussed):

{ "SHELLAC_ARGV": ["git", "--git-dir", ".", "a"]
  "SHELLAC_ARGV_INDEX": 3,
  "SHELLAC_CHAR_INDEX": 1,
}

ash just needs way of associating a command with a binary that supports the Shellac protocol. It doesn't need its own completion API.

It invokes the Shellac server/provider. Servers come in two flavors: SHELLAC_MODE=batch and SHELLAC_MODE=coprocess:

In this case, let's say we have a batch provider. It can just be the bash interpreter itself running git-completion.bash! We should be able to write a tiny wrapper shcomp_provider.bash that adapts between the bash completionAPI and the Shellac protocol.

The response is:

{ "candidates": ["add", "am", "annotate", "apply", "archive"] }

ash displays these alternatives to the user.

NOTE: I've written the protocol like JSON, but the encoding will most likely not be JSON.

Rough Example 2

Like the above, but perhaps Clang decides to implement Shellac. Then you have ash invoking Clang itself, not ash invoking bash.

Request format

SHELLAC_* environment prefix. SHELLAC_

SHELLAC_ARGV@, SHELLAC_ARG_INDEX, SHELLAC_CHAR_INDEX ?

problem: you can't have NUL bytes for arrays? Maybe the request comes on stdin then? Can bash deal with that?

Response format

Types of responses:

Status/Progress communication

(from Ilya Sher)

There must be a status/progress communication. During the response building phase, imagine the completion server needs to talk to all AWS regions (even if it is in parallel, it is not fast). It would be nice to have something like "Listing instances. Found X instances in R out of RR regions". Why all regions? It's a real use case. There is an ec2din.ngs with --allreg switch.

We can go further into semantic level. Status - an arbitrary string. Progress can be more defined, such as X out of Y items done. This more defined approach will let shells to display the information in a meaningful way, maybe a progress bar.

The server should also be able to communicate ETA (I think this is less important).

Needs thinking and discussion.

Response Encoding

Rich Completions

The request and response format have a JSON-like data model, so ZSH-like descriptions can also be returned:

ls --a
--all                                      -- list entries starting with .
--almost-all                               -- list all except . and ..
--author                                   -- print the author of each file
{ "candidates": [
  {"value": "--all", "desc": "list entires starting with ." }, 
   ...
  ]
}

This kind of structured data should handle the following:

Delegating Back to the Shell For Rich Completions

Filename completion could be fuzzy or case-insensitive. Instead of returning candidates, the completion server can specify a type of completion

{ "compgen": { "what": "files", "prefix": "RE" }}  # complete files beginning with RE

{ "compgen": { "what": "dirs", "prefix": "foo/testdata/c" }}  # complete dirs

This is similar to a bash completion function invoking compgen. It's user-defined code delegating back to the shell.

Other ZSH Like Features

(from Oliver Kiddle)

Modes

Character Encodings

Shellac clients and servers should prefer UTF-8 where possible. But file system paths are often the things being completed, and they are just byte strings. So technically most of the strings in the request and response format are NUL-terminated byte sterings, and UTF-8 is a special case of that.

Dispatch

Typical Client Algorithm

Typical Server Algorithm

Design and Implementation Issues

Streaming Responses

Security

Why Coprocesses?

For low latency responses. Startup time of processes is large, especially for Python, Ruby, JVM, Julia, etc.

Why not Multithreaded Servers?

Why not put one completion per line?

Because touch $'\n' breaks that protocol.

Risks

Related