The analogy to JSON is this: just as JSON is defined from a subset of JavaScript syntax, a serialization format for shell can be defined from a subset of shell syntax.

Since shell essentially only has strings, this is just a format that serializes strings in shell.

Let's introduce two requirements:

Summary

Serialize with

printf '%q' "$mystring"

or, in bash 4.4 and later:

echo "${mystring@Q}"

But %q doesn't appear to be POSIX. printf is a required utility, but the specification doesn't mention %q:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
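
The serialization step above can be sketched as follows, assuming bash. The variable names `encoded` and `decoded` are illustrative only. Note that this sketch decodes with `eval`, which is acceptable here only because the script produced the encoded text itself; it is not a safe way to parse untrusted input.

```shell
#!/usr/bin/env bash
# A string containing a newline, a single quote, and double quotes.
mystring=$'line one\nit\'s "quoted"'

# Serialize: %q emits one shell word that re-parses to the original string.
encoded=$(printf '%q' "$mystring")
printf '%s\n' "$encoded"

# Decode with eval. Only safe because we generated $encoded ourselves.
eval "decoded=$encoded"

if [ "$decoded" = "$mystring" ]; then
  echo "round trip ok"
fi
```

The point of the sketch is that the output of %q is ordinary shell syntax, so the shell parser itself is the decoder. That is exactly what makes it unsuitable for untrusted data.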

Parse with:

read -r untrusted_data < "$myfile"    # -r keeps backslashes intact
printf -v myvar '%b' "$untrusted_data"  # This should just create a string and not evaluate arbitrary code.

2024 Update: This does not work, because %q and %b are not inverses. %b expands escapes like \n the way echo -e does, but it doesn't undo even basic shell quoting like \'.
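
A minimal sketch of that mismatch, assuming bash (`printf -v` is a bash extension, and the variable names here are illustrative only). It encodes a string with %q, tries to decode it with %b, and compares the result with the original.

```shell
#!/usr/bin/env bash
original="it's here"

encoded=$(printf '%q' "$original")   # %q escapes the quote and the space
printf -v decoded '%b' "$encoded"    # %b only expands echo -e style escapes

printf 'original: %s\nencoded:  %s\ndecoded:  %s\n' \
  "$original" "$encoded" "$decoded"

if [ "$decoded" = "$original" ]; then
  roundtrip=yes
else
  roundtrip=no
fi
echo "round trip: $roundtrip"
```

The backslashes that %q adds are shell quoting, not echo -e escapes, so %b leaves them in place and the decoded value differs from the original.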

printf -v is obviously not POSIX either; it is a bash extension.

Dynamic assignment is another alternative:

declare a="myvar=$untrusted_data"
declare "$a"

Although I'm not sure this works in bash, because () might be special-cased for arrays. There is probably a hole for code execution, whereas I don't expect that for %b.

Example

printf %q doesn't work in dash, but it works in bash and zsh, as expected.

Caveat: NUL characters are not representable in shell strings! OSH will fix this. And TSV2 will also be able to represent NUL.

$ echo $'a\x01b' | bash -c 'read x; printf %q "$x"' 
$'a\001b'$ 

$ echo $'a\x01b' | zsh -c 'read x; printf %q "$x"' 
a$'\001'b$ 

$ echo $'a\x01b' | dash -c 'read x; printf %q "$x"' 
dash: 1: printf: %q: invalid directive
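
The NUL caveat above can be seen with a short sketch, assuming bash (other shells may behave differently). It tries to capture the three bytes `a`, NUL, `b` into a variable.

```shell
#!/usr/bin/env bash
# Try to capture three bytes: 'a', NUL, 'b'.
# bash drops the NUL; newer versions also print a warning to stderr,
# which the redirection on the brace group hides.
{ x=$(printf 'a\0b'); } 2>/dev/null

echo "length: ${#x}"   # 2 in bash, since the NUL byte was discarded
printf '%q\n' "$x"
```

Because the byte is lost before %q ever sees the string, no quoting scheme layered on top of bash variables can carry it.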

Lesson for Oil

Shell needs a serialization format!!!

Update: This is now QSN.