Thursday, December 11, 2014

Talking with DigitalOcean from shell or JSON + bash = rcli

RESTful APIs with JSON are all over the software world. SaaS, PaaS and other aS'es frequently expose HTTP endpoints that eat and breathe JSON. They're fairly easy to work with from almost any language, except when it comes to doing something really simple from a shell script.
Having had an extremely positive experience with the JavaScript library Ramda, which makes JSON processing very concise, I'm now discovering new uses for its command-line sister - Rcli: Ramda for the command line.

DigitalOcean example
Let's pretend we have a droplet on DigitalOcean and want to run a script on it.
We'll figure out the droplet's IP using the API and then run the script there via SSH. Sounds ideal for a bash script.

IP=$(curl_get_droplet | \
  R path droplets | \
  R find where '{"name": "example.com"}' | \
  R path networks.v4.0.ip_address)

ssh "root@$IP" 'bash -s' < script.sh

For simplicity, I replaced the exact curl command with a curl_get_droplet function.
For the full example, see the gist.
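For reference, curl_get_droplet could be a thin wrapper around the DigitalOcean v2 droplets listing endpoint. This is only a sketch: the token variable name is my own choice, and error handling is omitted.

```shell
# Hypothetical curl_get_droplet: lists all droplets via the DigitalOcean
# v2 API. Assumes a personal access token in $DIGITALOCEAN_TOKEN.
curl_get_droplet() {
  curl -s \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
    "https://api.digitalocean.com/v2/droplets"
}
```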

The IP address is extracted from the DigitalOcean response in three steps:
1. get the droplets list from the response
2. find the droplet named "example.com"
3. for that droplet, extract the first IP address from the IPv4 list in its networks section

Here is the response the pipeline operates on:

{
    "droplets": [
        {
            "backup_ids": [],
            "created_at": "2014-12-11T20:25:40Z",
            "disk": 20,
            "features": [
                "virtio"
            ],
            "id": 3448941,
            "image": {
                "created_at": "2014-10-17T20:24:33Z",
                "distribution": "Ubuntu",
                "id": 6918990,
                "min_disk_size": 20,
                "name": "14.04 x64",
                "public": true,
                "regions": [
                    "nyc1",
                    "ams1",
                    "sfo1",
                    "nyc2",
                    "ams2",
                    "sgp1",
                    "lon1",
                    "nyc3",
                    "ams3",
                    "nyc3"
                ],
                "slug": "ubuntu-14-04-x64"
            },
            "kernel": {
                "id": 2233,
                "name": "Ubuntu 14.04 x64 vmlinuz-3.13.0-37-generic",
                "version": "3.13.0-37-generic"
            },
            "locked": true,
            "memory": 512,
            "name": "example.com",
            "networks": {
                "v4": [
                    {
                        "gateway": "104.236.64.1",
                        "ip_address": "104.236.126.246",
                        "netmask": "255.255.192.0",
                        "type": "public"
                    }
                ],
                "v6": []
            },
            "region": {
                "available": null,
                "features": [
                    "virtio",
                    "private_networking",
                    "backups",
                    "ipv6",
                    "metadata"
                ],
                "name": "New York 3",
                "sizes": [],
                "slug": "nyc3"
            },
            "size_slug": "512mb",
            "snapshot_ids": [],
            "status": "new",
            "vcpus": 1
        }
    ],
    "links": {},
    "meta": {
        "total": 1
    }
}

Monday, October 20, 2014

Spikes in RabbitMQ + NodeJS latency test

Measuring things is a fun way to confront the assumptions we put everywhere. In software development, products tend to be built on dozens of assumptions: that the network will be fast enough, that REST calls or database transactions will be short enough, that the success rate of external API calls will stay within a reasonable range. Knowing the limits of those assumptions is always worth at least one beer.

The other day I was looking at how long it takes my message queue, RabbitMQ, to pass messages from producer to consumer. RabbitMQ has loads of options: queues, topics, durability, acknowledgements and clustering, to name a few. It's not obvious how those options affect latency. I wrote two little NodeJS processes - a producer and a consumer, based on samples from the RabbitMQ tutorial - to measure how long it takes a message to fly from one process to the other. Those processes would then run many times with various configs to produce data - a big latency table.

Unfortunately, even the most basic scenario turned out to be a puzzle. The producer inserts small (a few bytes) non-durable messages directly into a queue as fast as possible. The consumer has an unlimited prefetchCount and uses auto-ack to eat from that queue. All processes run on a single host, so networking overhead is negligible (I assume!). For four runs of 500 messages each, latency can be visualised as follows:

[chart: per-message latency across four runs of 500 messages]

While most messages spend 1-3 ms in transit, a few took 10-50 ms. I'm not quite sure yet where these spikes come from. There are a few suspects: RabbitMQ itself, NodeJS, the AMQP library, some garbage collector (both Erlang and NodeJS have one), my JS testing code, or the environment.
The environment seems unlikely, because every run looks similar to the one above, and I ran dozens of them on a few different machines. GC also seems unlikely: with such a small amount of data, memory should hardly notice anything.
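For what it's worth, here is how the outliers can be eyeballed in each run. Assuming the consumer appends one latency value (in milliseconds) per line to a log file - the file name and helper function are made up for illustration - a sort + awk one-liner summarizes a run:

```shell
# Hypothetical helper: summarize a log that has one latency value (ms)
# per line, printing min, median, 99th percentile and max.
summarize_latency() {
  sort -n "$1" | awk '
    { v[NR] = $1 }
    END {
      p99 = int(NR * 0.99); if (p99 < 1) p99 = 1
      printf "min=%s median=%s p99=%s max=%s\n", v[1], v[int((NR + 1) / 2)], v[p99], v[NR]
    }'
}

# Usage: summarize_latency latencies.txt
```

A big gap between the median and the max is exactly the spike pattern described above.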

I'm going to run similar tests using a producer+consumer pair written in some other language, and we'll see what comes of it - but maybe you, my reader, have answers?