Archive for the ‘NoSQL’ Category

RESTandra/ – Update

November 8th, 2011

Quick update:

RESTandra/ has received a *much* needed overhaul and now has a better-implemented back end and a new URL design. More to come soon.

RESTandra/ – A RESTful HTTP API for the distributed structured data store Cassandra

May 20th, 2011

RESTandra/ is (was) my final-year honours project. The point of it was to provide RESTful access to Apache Cassandra resources. Having just written a 10,000-word report on it, I’m going to keep this brief.

The project started out in Python, using Thrift and Pycassa to provide access to Cassandra over HTTP. However, this wasn’t an ideal implementation, so I moved on to building the API into Cassandra’s source. You can have a look at an early build of it here:

https://github.com/jonnyboris/RESTandra

RESTandra/ is all about applying HTTP verbs to Cassandra nouns. The interface to Cassandra was provided through a slightly tricky URL:

http://domain:18220/keyspace/columnfamily/row/columnStart/columnEnd/consistencylevel.filetype

It was quite hard fitting Cassandra’s data model into a URL, so the one above has been slightly altered. To interact with the data in Cassandra, HTTP methods are used on variations of the above URL.

So a

GET /Twissandra/User/fairfull///QUORUM.json HTTP/1.1[CRLF]
Host: domain:18220[CRLF]

would read all the columns of the fairfull row from the User column family within the Twissandra keyspace and return them in JSON. An HTTP POST to that URL would be used to create the fairfull row, a PUT would be used to edit it, and a DELETE would, well… delete it.
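To make the URL layout a bit more concrete, here is a minimal Python sketch of how a path like the one in the request could be split into its parts and paired with an HTTP verb. This is not RESTandra's actual code: the names `ResourcePath`, `parse_path`, `describe` and `OPERATIONS` are made up for illustration, and the sketch assumes every request spells out all six path segments, leaving the column range blank to mean "all columns".

```python
from typing import NamedTuple


class ResourcePath(NamedTuple):
    """The seven values packed into the six path segments of the URL."""
    keyspace: str
    column_family: str
    row: str
    column_start: str   # empty string when the segment is left blank
    column_end: str     # empty string when the segment is left blank
    consistency_level: str
    file_type: str


# Verb-to-operation pairs as described in the post; the operation names are
# labels invented for this sketch.
OPERATIONS = {"GET": "read", "POST": "create", "PUT": "edit", "DELETE": "delete"}


def parse_path(path: str) -> ResourcePath:
    """Split a path such as /Twissandra/User/fairfull///QUORUM.json."""
    trimmed = path[1:] if path.startswith("/") else path
    segments = trimmed.split("/")
    if len(segments) != 6:
        # Assumption: every request spells out all six segments.
        raise ValueError(f"expected 6 path segments, got {len(segments)}")
    keyspace, column_family, row, column_start, column_end, last = segments
    # The final segment carries two values: consistencylevel.filetype
    consistency_level, _, file_type = last.partition(".")
    return ResourcePath(keyspace, column_family, row, column_start,
                        column_end, consistency_level, file_type)


def describe(method: str, path: str) -> str:
    """Return a one-line summary of what a request asks for."""
    operation = OPERATIONS[method.upper()]
    target = parse_path(path)
    return (f"{operation} row {target.row!r} in "
            f"{target.keyspace}.{target.column_family} "
            f"at {target.consistency_level} as {target.file_type}")


print(describe("GET", "/Twissandra/User/fairfull///QUORUM.json"))
```

The two empty strings between `fairfull` and `QUORUM.json` are the blank columnStart and columnEnd segments, which is why the example request has three slashes in a row.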

A lot of my research was based around how well HTTP methods and status codes map to the functions and errors of a database connector. For example, everyone is familiar with an HTTP 404; this maps well to a row that is looked up but can’t be found.
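As a rough illustration of that kind of mapping, here is a small Python sketch that turns database-side errors into HTTP status codes. Only the "row not found gives 404" pairing comes from this post; the exception classes are hypothetical stand-ins, and the other two pairings are my guesses at plausible choices rather than what RESTandra does.

```python
from http import HTTPStatus


class RowNotFound(Exception):
    """Hypothetical error: the requested row does not exist."""


class InvalidRequest(Exception):
    """Hypothetical error: the request was malformed."""


class NodesUnavailable(Exception):
    """Hypothetical error: too few replicas for the consistency level."""


STATUS_FOR_ERROR = {
    RowNotFound: HTTPStatus.NOT_FOUND,                 # pairing from the post
    InvalidRequest: HTTPStatus.BAD_REQUEST,            # assumption
    NodesUnavailable: HTTPStatus.SERVICE_UNAVAILABLE,  # assumption
}


def status_for(error: Exception) -> HTTPStatus:
    """Pick the HTTP status that best describes a database error."""
    for error_type, status in STATUS_FOR_ERROR.items():
        if isinstance(error, error_type):
            return status
    # Anything unrecognised is reported as a generic server error.
    return HTTPStatus.INTERNAL_SERVER_ERROR


status = status_for(RowNotFound("fairfull"))
print(status.value, status.phrase)
```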

Anyway, like I say, I’ve done enough writing on this subject. If you want, you can find my report here:

http://jonnyfairfull.co.uk/RESTandra.docx

Oh, one thing worth mentioning is how I tested RESTandra. As Cassandra is designed for big data applications serving many requests, I was unable to put it under any sort of stress on my own. So I came up with a distributed method of testing: I wrote a test application in JavaScript and pretty much spammed the school of computing with the link. This was quite successful, as RESTandra ended up handling just over 4,000,000 requests over 2 days without leaking memory all over the place. I wrote a much better description of the test here:

http://restandra.blogspot.com/2011/04/javascript-testing.html