drizzle
Profile
Search
 
Powered by SoftLayer
Client/Boots API Design Concept

Image:Boots.png

Note: I basically brain-dumped this set of ideas in one session with minimal editing. Please excuse any rough writing, and please edit it as you see fit!

Contents

Boots Design Concept

The boots shell is a rich facade on top of a Python shell specialized to deal with databases.

Boots consists of different objects that pass data to each other via connections (straps). The shell connects to query processing objects, which interact with multiple connections, which connect to output processing objects, which feed back into the shell. Since it might make sense to write the input and output in one object, since it all deals with CLI manipulation, the shell performs double duty: it is both an input and output object.

(Shell) -> (Query Processing) -> (Connections) -> (Output Processing) -> (Shell)

Boots API

Basic Data Types (primitives)

Connections

Server connections. Execute queries and return rows.

Queries

Queries. Can be manipulated, such as parameter substitution.

Rows

Rows. A list/tuple of columns.

Multi

An object to represent and work with multiple pieces of data: multiple queries, multiple rows, multiple connections. A multi allows the user to filter and process different pieces of data, by regular expressions, filters, etc.

Example of a **multi** of connections:

   servers = multi(connection)
   
   servers.add(
     connect.drizzle(
       host = "foo"
       port = XXXXX
       user = "..."
     ),
     
     connect.drizzle(
       host = "bar"
       port = XXXXX
       user = "..."
     )
   )
   
   servers.regex(".ar").execute("INSERT INTO ... VALUES (...)")
   
   def confirm(server):
     return prompt(server.name + ": yes or no?") == "yes"
   
   servers.filter(confirm).execute("DELETE ... FROM ...")

The multi is thus a swiss army knife of handling multiple connections, multiple queries, and multiple rows of output. Since many of the operations you'd want to do on groups of those primitives are the same (filter, map, regex, etc), it makes sense to have a base object. It will probably be useful to specialize the multi to have methods specific to the primitives, though not necessarily too much: for instance, you could do parameter substitution on multiple multis by combining a map and a query substitution method.

Nodes

Many pieces of our client will embody a basic pattern: hooks for input and output, in the form of multis of a certain primitive. Connections will take multi-queries, and output multi-rows. Shells will output multi-queries to multi-connections, and input multi-rows.

We'll need a basic generalized hook API where multiple nodes can be added to each hook (multi-node?). It might be like a basic event registration/signal handler system.

Inputs

Inputs generate queries, via interaction with the user, reading from files, etc.

It might also be useful for inputs to generate rows. For example, performing multiple INSERT queries with parameter substitution from a CSV file.

Outputs

Outputs take rows and perform actions, such as presenting them to the user, writing them to a file, or serving them over the network (!?!).

A Shell

The shell is both an input and an output. It generates queries, and takes rows and presents them to the user. Thus, a shell is like a link in a chain that forms a loop:

  Shell -> Queries
    ^          |
    |          v
  Connections <-

The Boots Sole

The boots sole fills in the application pieces: it handles outwardly aspects such as configuration, command line arguments, etc.

It is an interface to the underlying boots API described here, masquerading as a typical SQL shell. It can be thought of as a loop where a multi of connections (instantiated in a config file) are executing one query at a time, and the rows are displayed. The boots sole is also a node (a shell and an output): it outputs multi-queries to whatever hooks into it, and nodes hook into it to present their resulting multi-rows.

Modes and Shorthand

Classic

In its default usage, the sole should probably behave like a classic SQL shell: it takes lines of SQL, runs them on connections, and outputs the results. Nothing really cool is happening here. This would be easy to code up using the API as a backend.

Python

In Python mode, the sole acts as a Python interpreter, where the user types Python expressions that leverage the Boots API. Considering this usage, the Boots API should be as concise and straightforward as possible, to facilitate easy usage interactively.

Shorthand

It might be useful to have a middle-of-the-line shorthand for performing basic actions on SQL queries. This would expose most of the powerful features of the API (multi-substitution, outputs, filtering connections) without forcing the user to learn or use Python.

How this shorthand would work is TBD, but here are a few ideas.

Tacking things on the end of lines with a redirection operator:

   SELECT * FROM ...; > csv("./data")
   INSERT INTO ... VALUES(%1); < csv("./data")

Surrounding SQL and replacing method application with redirection operators:

   csv("./keysets") >substitute> s(SELECT (%1, %2, %3) FROM ...) >output> csv("./data")
   csv("./keysets") > s(SELECT (%1, %2, %3) FROM ...) > csv("./data")
   (maybe actions are implicit?)

Batch Operation

For writing batch files and maintenance scripts, it might be useful to easily switch between the different modes. For instance, there could be block of SQL manipulation commands, followed by some Python to do more complex processing. We might want to have some sort of mode-switching block syntax, or keywords to facilitate this.

Ideas

Operations may need to retain more state than meets the eye. For instance, when performing parameter substitution interactively, it would be useful to break the pipeline apart, displaying the resulting queries to the user.

User types query with a mapped multi-row substitution. When the user types something like:

   query("INSERT INTO data VALUES (%foo, %bar)").substitute(json("./data"))

We really want to evaluate the substitution first, show the queries to the user (optionally, of course), and then run it. This fits the idea of a shell generating multi-queries.

It follows that, when a line is entered, the boots sole should use the API to evaluate it, expecting it to generate a multi-query. It then passes the queries to whatever is hooked into its node.

More Ideas

Please add your own ideas here.

  Another filter example
  def is_slave(server):
    return server.execute("SELECT @@default_replicator_enabled")  
  
  servers.filter(not(is_slave)).execute("DELETE ... FROM ...")
  List Comprehensions - instead of...   
  for S in servers do 
    for R in S.execute("SELECT tbl, whr, out FROM tables_to_dump") do
      S.execute("SELECT * FROM %s WHERE %s INTO OUTFILE %s", R[tbl], R[whr], R[out])
  
  one could do...
  servers[].execute("SELECT tbl, whr, out FROM tables_to_dump")[].execute("SELECT * FROM %s WHERE %s INTO OUTFILE %s", tbl, whr, out)
  
  or ...
  servers[].execute("SELECT host, cmd FROM things_to_run")[].shell("ssh %s %s", host, cmd)

TBD

Error/exception handling?

How does buffering between nodes work?\

Jokes

Place boots-related puns here.

Site generously hosted by SoftLayer Technologies