DMBCS | Intelligent Bash (ibash)

1 Introduction

Working at the command line, one often finds oneself wishing that bash were just a little bit cleverer. Allow for obvious and oft-repeated spelling mistakes. Get its configuration from a central location. Just know how to build and install source-distributed projects. Recognise machine names and just ssh into them. Recognise important directories and just know to cd to them. And then be able to work across machines in a seamless way, with one machine able to wait for another to finish off some work.

The aim here is to develop an extended bash (Bourne Again Shell) which uses Guile to imbue the command line with a level of intelligence: the ability to explore the context in which a command line has been written, and modify it to adapt to the situation, e.g. if there is a configure script but no makefile, modify a make instruction to first call on configure.

2 Audience

This system is in an early and very exploratory stage of development. There are currently many disparate parts which must work in synchronous harmony with each other, and there is not a utility for putting all these pieces in place in an automated way. The main reason for this is that eventual use cases for the system are not known, and it would be premature to be developing such things at this early stage.

It is also necessary currently for someone who wants to use the system for their own purposes to fully understand the workings of the system at all levels in order to be able to utilise it.

Thus the system is only accessible to experienced bash users who are also Scheme programmers who have taken the time to understand a lot of details of the inner workings of iBash, as described below; this document is intended to be read by such people.

3 Progressive Description of the System

The diagram below gives the overall idea. We break out of the bash REPL (read-eval-print loop) by tokenizing input command lines and sending them, possibly over the Internet, to a central intelligence service. The service uses processing modules to fabricate functions which, when applied to the tokenized command line, will manufacture a new set of command lines. These functions are sent back to the client where they are executed on the original tokenization, and the result is given to bash as if it had been typed in at the terminal.

All parts of the system, except bash itself, are written in Guile. This gives us the immediate ability to write code which writes code which can be serialised and modified and executed and the results modified to our heart’s content: once Guile is in the loop, the sky’s the limit.

The descriptions below indicate the logical layering of the software stack which is put in place and, more precisely, spell out how the system has developed chronologically to date into a progressively more sophisticated intelligence.

3.1 bash Call-out to Command-line Modifiers

3.1.1 Motivation

We strive to make the minimum possible incursion into the core of the bash executable itself. The simple idea we pursue is that, immediately prior to processing the symbols of the command line that a user has entered at the terminal, the command line is offered to an external scheme procedure which returns with a re-written version of the command line (possibly split into more than one executable statement).

3.1.2 Abstract Concept of a Command-Line Processor

Our basic unit of command-line processing is a procedure which takes in a list of tokens as entered on a command line, and a list of lists of tokens—usually initially empty—which are to be given back to bash for actual execution. The return from the procedure is a cons cell containing a list—possibly empty—of unused tokens, and a list of lists of tokens to be eventually executed. It is then seen that the action of the procedure is to take the first list of tokens and produce one or more computed command lines based on the original, augmenting the second list-of-lists with new instructions. (Because of the way lists work in Scheme, the return list is presented, and finally executed, in reverse order, with commands to be executed later appearing nearer the front.)

There is, however, flexibility in that the procedure could simply modify the incumbent tokens and leave them in the first list for further processing, or could modify already computed lines in the second list to apply another layer of intelligence to them.
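As a minimal sketch of this contract, the processor below handles the make example from the introduction. The name configure-then-make is hypothetical, and the check for an existing configure script and missing makefile is left out to keep the example short; it only illustrates the shape of the arguments and of the returned cons cell.

```scheme
;; Sketch only: `configure-then-make' is a hypothetical processor name.
;; A processor takes TOKENS (list of strings) and COMMAND-LINES (list of
;; lists of strings) and returns (unused-tokens . command-lines).
(define (configure-then-make tokens command-lines)
  (if (and (pair? tokens) (string=? (car tokens) "make"))
      ;; Consume every token.  "./configure" is pushed first and the
      ;; original line second, so the command to be executed later
      ;; ends up nearer the front of the list.
      (cons '()
            (cons tokens
                  (cons '("./configure") command-lines)))
      ;; Not a make line: leave the tokens unused for other processors
      ;; and hand the command list back untouched.
      (cons tokens command-lines)))
```

With these definitions, (configure-then-make '("make" "install") '()) returns (() ("make" "install") ("./configure")): no unused tokens, and two command lines held in reverse order of execution.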

3.1.3 Implementation

Firstly, bash’s main function is replaced with a stub which calls Guile’s main, which in turn calls the original bash entry point (shell.c:820). This is Guile’s mechanism for initialising the system, giving Guile the opportunity to gain oversight and control of the run-time stack.

During bash’s setup phase, a scheme file is loaded and two functions are mapped into bash’s Guile namespace, callable from C as scheme functions. The file which is loaded is either declared in the environment variable I_BASH_CALLOUT, or else ${HOME}/.bash_guile.scm is used (shell.c:522; an error “no call-out” will be signalled on the terminal if no file can be loaded). This file must provide two functions.

BASH:process-command-line: take in a representation of the current command-line (list of space-less strings: tokens), and store a modified version, and possibly other generated command lines, for later retrieval;
BASH:next-command: iterate through the stored command lines.

The bash executable will, when the time comes to process a command line, tokenize it and convert it to a scheme list of strings. This will be passed to the first of the above functions (eval.c:204), and then the latter function will be called to provide actual instructions for bash execution, until there are none left.

In the very first incarnation of this facility, .bash_guile.scm was all there was, and everything just worked to provide a basic level of intelligence. However, it soon became apparent that more flexibility was needed as things got more sophisticated, and so the environment variable came to be used.

In principle, this is enough already to make an arbitrarily intelligent shell, as the .bash_guile.scm file is implicitly bestowed with the power to make any changes it wants to the command lines typed by the user.
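To make the two entry points concrete, here is a minimal sketch of what such a call-out file might contain. The text above does not say what BASH:next-command returns, so this sketch assumes it yields one command line as a string per call, and #f once nothing is left; the typo table is purely illustrative.

```scheme
;; Minimal call-out sketch.  Assumptions, not confirmed by the source:
;; BASH:next-command returns one command line as a string on each call,
;; and #f once no stored lines remain.

(define pending-command-lines '())

;; Hypothetical table of mistyped verbs, for illustration only.
(define typo-corrections '(("gti" . "git") ("sl" . "ls")))

;; Correct the first token only, so arguments are never altered.
(define (correct-verb tokens)
  (if (null? tokens)
      tokens
      (let ((entry (assoc (car tokens) typo-corrections)))
        (cons (if entry (cdr entry) (car tokens))
              (cdr tokens)))))

;; Called by bash with the tokenized command line.
(define (BASH:process-command-line tokens)
  (set! pending-command-lines (list (correct-verb tokens))))

;; Called repeatedly by bash until the stored lines are exhausted.
(define (BASH:next-command)
  (if (null? pending-command-lines)
      #f
      (let ((next (car pending-command-lines)))
        (set! pending-command-lines (cdr pending-command-lines))
        (string-join next " "))))
```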

The script which we currently provide for this purpose, ibash-scheme/ibash-callout.scm, first checks to see if the first token on the command line is “db” (“direct bash”); if so, the rest of the command line is sent back verbatim to bash, as a way to bypass the augmented system in a pinch. Otherwise, the command line is sent to a function called process in a file either named in an environment variable called I_BASH_CALLOUT_2, or else the file $HOME/i-bash/remote-sender.scm, which is described more fully in the next section; if no such file can be loaded then an error message saying “no call-out-2” will be seen on the command line.
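The dispatch just described could be sketched roughly as follows. This is not the contents of ibash-callout.scm: the names callout-2-file and dispatch are hypothetical, and the delegate argument stands in for the process function, which is assumed here to take the tokens and return a list of command lines.

```scheme
;; Sketch only; `callout-2-file' and `dispatch' are hypothetical names.

;; Pick the second-stage script: the environment variable wins,
;; otherwise fall back to the default location under $HOME.
(define (callout-2-file)
  (or (getenv "I_BASH_CALLOUT_2")
      (string-append (or (getenv "HOME") "")
                     "/i-bash/remote-sender.scm")))

;; TOKENS is the tokenized command line; DELEGATE stands in for the
;; `process' function loaded from the second-stage script (assumed to
;; return a list of command lines).
(define (dispatch tokens delegate)
  (if (and (pair? tokens) (string=? (car tokens) "db"))
      ;; "direct bash": drop the prefix, pass the rest through verbatim.
      (list (cdr tokens))
      (delegate tokens)))
```

For example, (dispatch '("db" "ls" "-l") some-process) returns (("ls" "-l")) without ever calling some-process.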

3.2 Relay Requirements to Central Intelligence Service

3.2.1 Rationale

The next evolution of the system was to not actually process the command line directly, but to delegate processing to a central process. The reasons why we wanted to do this are:

A more powerful machine can be used, maybe with GPU-assisted machine learning?
Shells can potentially learn from each other.
Shells can potentially interact with each other by modifying state in the central service: this would be useful when using an off-line build system, for example.
Only have to develop the intelligence once and not worry about distributing it to all the machines that we use. This also means that we can make live updates to the service, and then all existing shells get new functionality on-the-fly.

Note that the option to run the service on the local machine always exists to keep things simple, if that is all that is needed.

3.2.2 Implementation

The functionality is currently implemented in the ibash-scheme/remote-sender.scm script, called out, in turn, from the function described in Section 3.1.

Each time it is called, the process procedure of the remote-sender.scm script looks for an environment variable called I_BASH_REMOTE (the value takes the form host:port) and tries to use this to contact a central service. If it fails, the script will return a command-stack which causes the words “REMOTE iBASH INTELLIGENCE SERVICE NOT AVAILABLE” to be echoed on the terminal, and the bash shell will appear to not do anything (the user will have to revert to prefixing their input lines with “db” as described in Section 3.1.3).

Once a service has been identified, the tokenized command-line is transmitted to the server. The server responds by sending a Scheme procedure back as an s-expression on the wire; this is evaluated into a procedure, which is then run locally on the original tokenized command line. The return from this local execution must be a list of modified (or not) command lines. These are then returned to the caller, ultimately to be sent back to bash to be executed as if they had been typed in by the user.
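The client-side half of this exchange can be sketched without any networking. The helper names parse-remote and apply-reply are hypothetical; the sketch assumes the server's reply arrives as text containing a single lambda expression, and it leaves out the socket handling and the error path entirely.

```scheme
;; Sketch only; `parse-remote' and `apply-reply' are hypothetical names.

;; Split a "host:port" value, as held in I_BASH_REMOTE, into
;; (host . port-number).  Returns #f if the value is malformed.
(define (parse-remote spec)
  (let ((colon (string-rindex spec #\:)))
    (and colon
         (let ((port (string->number (substring spec (+ colon 1)))))
           (and port
                (cons (substring spec 0 colon) port))))))

;; REPLY-TEXT is assumed to hold one s-expression that evaluates to a
;; procedure of one argument.  Read it, evaluate it, and apply the
;; resulting procedure to the original tokens.
(define (apply-reply reply-text tokens)
  (let* ((expr (call-with-input-string reply-text read))
         (proc (eval expr (interaction-environment))))
    (proc tokens)))
```

For instance, (apply-reply "(lambda (tokens) (list (cons \"echo\" tokens)))" '("hello")) returns (("echo" "hello")), a list holding one modified command line.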

It immediately becomes apparent that the complexity and sophistication of the system have increased by an order of magnitude in making this development. To recap, we have a server which reads the command line and generates a procedure which will convert that command line into some other command line, which will be returned to bash for execution. The idea here is that the central server provides the intelligence as required, but that the intelligence is deployed locally so that local state can be taken into account.

3.3 The Central Intelligence Service

3.3.1 Rationale

It is clear at the outset that we want to separate the logic for the different primary commands the user may type at the command line. We also choose to separate into categories of verb processing and simple text substitution (anywhere in the command line).

And so we need to modify the abstract concept of a command processor presented in Section 3.1.2: it is now a procedure which takes in the tokens of a single command, and manipulates those tokens for further processing and/or provides a new procedure for generation of the finally executed command line on the local machine. This concept must have a means of registering itself in the core of the intelligence service, either as a verb processor in which case it identifies with the first word on the command-line, or as a text-substitution processor to be run before any verb processors.
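One way to picture the two kinds of registration is the sketch below. It is a simplification under stated assumptions: every name in it is hypothetical, and processors here map tokens directly to tokens, whereas the real service may instead produce a procedure to be shipped to the client.

```scheme
;; Sketch only; all names here are hypothetical.
(use-modules (srfi srfi-1))   ; for fold

;; Verb processors are keyed on the first word of the command line.
(define verb-processors (make-hash-table))

;; Text-substitution processors run in registration order.
(define substitution-processors '())

(define (register-verb! verb proc)
  (hash-set! verb-processors verb proc))

(define (register-substitution! proc)
  (set! substitution-processors
        (append substitution-processors (list proc))))

;; Run every substitution first, then the verb processor (if any)
;; matching the first token of the substituted line.
(define (run-processors tokens)
  (let* ((substituted (fold (lambda (proc toks) (proc toks))
                            tokens
                            substitution-processors))
         (verb-proc (and (pair? substituted)
                         (hash-ref verb-processors (car substituted)))))
    (if verb-proc
        (verb-proc substituted)
        substituted)))
```

Keeping substitutions in an ordered list and verbs in a table keyed by the first token mirrors the ordering stated above: substitutions apply anywhere on the line and always run before any verb-specific logic.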

3.3.2 Implementation

This is implemented in ibash-scheme/ibash-server.scm, which scans for files in a directory either named on the command line, or else defaulting to $HOME/i-bash/modules.

3.3.3 The ib Substitution

When working on a project, the name of the project often needs to be typed at the command line in various contexts. A case in point is the i-bash project, and we would like to be able to write ib anywhere that i-bash would need to be typed, just to save ourselves a little effort.

Thus the i-bash module (ibash-scheme/modules/i-bash) provides this very simple token replacement function.

Experienced bash users will recognise this as variable substitution, but we now forgo the need to type pesky dollar signs everywhere.
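The core of such a replacement can be sketched in a few lines. The procedure name substitute-ib is hypothetical, and the sketch assumes that only whole tokens equal to ib are replaced; how the actual module registers itself with the server is not shown.

```scheme
;; Sketch only; `substitute-ib' is a hypothetical name.
;; Replace every token that is exactly "ib" with "i-bash", leaving all
;; other tokens (including ones that merely contain "ib") untouched.
(define (substitute-ib tokens)
  (map (lambda (token)
         (if (string=? token "ib") "i-bash" token))
       tokens))
```

So (substitute-ib '("cd" "ib")) returns ("cd" "i-bash"), while a token such as "lib" is left alone.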

3.3.4 The cd Command

The sole example of a verb processor at this time is the cd command. The idea is that a special set of directories is nominated at the server, and when the user asks to cd to some directory, the local function will search for that directory under the nominated ones, and provide a modified command line for final execution by bash which makes the proper cd call to get to that exact directory.

Experienced users of bash will know that this behaviour can almost be accomplished with the CDPATH environment variable. However this does not always do the right thing, especially if the directory you want to get to is right below the one you are at, rather than one with the same name somewhere else in the file system. Also, we want the ability to use abbreviations for common directory names, such as provided by the ib substituter above.

The upshot of having these two modules is that one can type cd ib at the command line, and be immediately transported to the i-bash project root underneath some designated projects directory, regardless of the current working directory.

The functionality is provided in the cd module (ibash-scheme/modules/cd). You may want to edit this around line 38 to have it use your own favourite container directories.

This is quite a convoluted script as it has to produce a serialisable representation of a procedure which processes the command line to produce a suitably modified one, with the real work being done on the local machine, not necessarily the machine running the cd script.
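The split between server-side generation and client-side execution can be illustrated with quasiquotation. This is a much-reduced sketch, not the cd module itself: make-cd-procedure-source and serialise are hypothetical names, file-exists? is used in place of a proper directory check, and a directory found right below the current one is assumed to take priority.

```scheme
;; Sketch only; `make-cd-procedure-source' and `serialise' are
;; hypothetical names.

;; Runs on the server.  Returns an s-expression (plain data) describing
;; a procedure; the file-system search inside it only happens once the
;; client has evaluated the expression and applied the result.
(define (make-cd-procedure-source containers)
  `(lambda (tokens)
     (let ((target (and (pair? tokens)
                        (pair? (cdr tokens))
                        (cadr tokens))))
       (let loop ((dirs ',containers))
         (cond
          ;; No argument, a match right below the current directory,
          ;; or nothing found: hand the line back unchanged.
          ((or (not target) (file-exists? target) (null? dirs))
           (list tokens))
          ;; Found under a nominated container: cd to the full path.
          ((file-exists? (string-append (car dirs) "/" target))
           (list (list "cd" (string-append (car dirs) "/" target))))
          (else (loop (cdr dirs))))))))

;; Turn the expression into text suitable for sending on the wire.
(define (serialise expr)
  (call-with-output-string
    (lambda (port) (write expr port))))
```

On the client, reading the text back and evaluating it gives an ordinary procedure; applied to ("cd" "i-bash") with a container such as /home/me/projects, it would return a single cd command line pointing at /home/me/projects/i-bash, provided that directory exists on the client machine.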

4 Installation

4.1 Requirements

Guile 2.2
- Tested with version 2.2.7.
Enough infrastructure to be able to build bash from source.
- Tested with GNU make 4.3, gcc 10.2.0.

Note that if you are using your operating system’s package manager to get these, you will probably need to specify that the libguile development package is needed explicitly (on Debian, apt install guile-2.2-dev).

Other versions of Guile may work, but haven’t been tried.

4.2 Steps

Download the package: git clone https://rdmp.org/dmbcs/i-bash.git
cd i-bash
./configure CFLAGS="$( guile-config compile )"
make
Optional: sudo cp bash /usr/bin/ibash
Either
1. export I_BASH_CALLOUT=$PWD/ibash-scheme/ibash-callout.scm
2. export I_BASH_CALLOUT_2=$PWD/ibash-scheme/remote-sender.scm
Or
1. cp ibash-scheme/ibash-callout.scm $HOME/.bash_guile.scm
2. mkdir $HOME/i-bash
3. cp ibash-scheme/remote-sender.scm $HOME/i-bash/remote-sender.scm
Edit the file ibash-scheme/modules/cd around line 38 to include the container directories important to you, in particular the one which is the parent of this i-bash project directory.
Optional: cp -rp ./ibash-scheme/modules $HOME/i-bash.
guile -s ./ibash-scheme/ibash-server.scm $PWD/ibash-scheme/modules 9081 &
- If the two arguments to this application are not supplied, the default values will be assumed: $HOME/i-bash/modules and 9081.
export I_BASH_REMOTE=localhost:9081
./bash

You should now be able to use the new shell just like plain old bash, but now if you type cd ib at the command-line it should take you to the i-bash source directory regardless of where you currently are in the file-system.

It is now up to you to add modules into the modules directory, to implement intelligence which is useful to you!

Note that you can interactively change the I_BASH_REMOTE environment variable inside the intelligent shell, to dynamically change the central intelligence server used to modify future command lines.

5 Contributing

5.1 Technical

The primary repository for this project, at https://rdmp.org/dmbcs/i-bash, is duplicated at GitHub https://github.com/Dale-M/i-bash. If you want to contribute ideas to the project the easiest way is to fork at GitHub, make your changes in your fork, and then send us a pull request. If, like us, you actually detest GitHub, you will have to provide a repository elsewhere from which we can pull, and send us a message at https://rdmp.org/dmbcs/contact with details.

Either way, we would love to hear about your ideas!

5.2 Financial

Like all free (as in free beer) software, the amount of time that can be devoted to this project is limited by the developers’ need to otherwise earn a living wage. Contributions to the project will make a difference to the amount of resource available to it. If possible, please contribute by Bitcoin to the address 1PWHez4zT2xt6PoyuAwKPJsgRznAKwTtF9. For other forms of donation please message us via the contact form at https://rdmp.org/dmbcs/contact. If you want paid support, or want to pay for specific features to be developed, DMBCS will most likely be happy to oblige!

Thank you.
