There's a new tech conference coming to Atlanta at the end of the month. Unlike other events I have covered and attended, this one is for those who not only develop websites but also deal with large data sets at high load and have learned the struggles of dealing with relational databases like MySQL at such scale. The underlying concept (movement rather) is called NoSQL — a (much debated) term describing the next generation of data storage technologies.
Yeah, it's a fairly technical conference and to be honest most NoSQL stuff is probably way over my head. That's why I'm so intrigued by it. The only conferences that get press around here seem to be B2B-centric or talk about leveraging Twitter to grow your brand — not exactly stuff that whets my intellectual appetite.
Alright, so you're a technical person who does some web app work but you're not exactly a DBA. What is NoSQL and what are the common applications for it? Well, first off, the current limitations of relational databases need to be addressed. A relational database (RDB) like MySQL can be most easily described as a table-based data system where there is minimal data duplication and sets of data can be accessed through a series of relational operators like joins and unions. The problem with such relations is that complex operations on large data sets quickly become prohibitively resource intensive, although the benefits are generally reaped at the application level, where database code need not be convoluted.
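To make the "minimal duplication plus joins" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` and `posts` tables and their contents are made up for illustration; the point is only that each author's name is stored once and stitched back onto posts with a join at query time.

```python
import sqlite3


def posts_with_authors():
    """Return (author name, post title) pairs using a relational join."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES users(id),
            title TEXT
        );
        -- Hypothetical sample data: each name lives in exactly one row.
        INSERT INTO users VALUES (1, 'alice');
        INSERT INTO users VALUES (2, 'bob');
        INSERT INTO posts VALUES (1, 1, 'Hello');
        INSERT INTO posts VALUES (2, 1, 'Again');
        INSERT INTO posts VALUES (3, 2, 'Hi');
    """)
    rows = conn.execute("""
        SELECT users.name, posts.title
        FROM posts
        JOIN users ON users.id = posts.user_id
        ORDER BY posts.id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, title in posts_with_authors():
        print(name, title)
```

With three rows this is instant. The trouble described above shows up when the same kind of join has to run across very large tables under heavy load.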
There are ways of getting around these limitations, as described so well by Adam Wiggins of Heroku in his aptly named article SQL Databases Don't Scale. He talks about the popular tactics for beefing up relational databases for huge applications (vertical scaling, sharding and read slaves) and lists their downsides.
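For a rough feel of what sharding means in practice, here is a small sketch of hash-based sharding. This is my own illustration, not code from the article above: the shard names are hypothetical and the modulo scheme is just the simplest one I know of. It also shows one commonly cited headache with this naive approach, which is that adding a shard changes where most keys live.

```python
import hashlib

# Hypothetical database hosts; the names are assumptions for illustration.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(user_id, shards=SHARDS):
    """Pick a shard by hashing the key and taking it modulo the shard count."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def moved_keys(keys, old_shards, new_shards):
    """List the keys that land on a different shard after the layout changes."""
    return [
        key for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    ]


if __name__ == "__main__":
    keys = ["user-%d" % i for i in range(1000)]
    bigger = SHARDS + ["db-shard-3"]
    moved = moved_keys(keys, SHARDS, bigger)
    print("keys that must migrate:", len(moved), "of", len(keys))
```

With modulo hashing, going from three shards to four relocates roughly three quarters of the keys, and every one of those means data has to be copied between machines while the application keeps running.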
So why are relational databases just now becoming an annoyance? Eric Florenzano puts it best:
As the web has grown more social, however, more and more it's the people themselves who have become the publishers. And with that fundamental shift away from read-heavy architectures to read/write and write-heavy architectures, a lot of the way that we think about storing and retrieving data needed to change.
Enter NoSQL: non-relational data stores that "provide for web-scale data storage and retrieval especially in web based applications because it views the data more closely to how web apps view data - a key/value hash in the sky." NoSQL is meant for the current growing breed of web applications that need to scale effectively. Applications can horizontally scale on clusters of commodity hardware without being subject to intricate sharding techniques.
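Here is a toy sketch of the key/value idea, as far as I understand it. `TinyKVCluster` is a name I made up, the "nodes" are plain Python dicts standing in for machines, and the placement logic is far simpler than anything a real store like the ones mentioned in this post would use. What it tries to show is the data model: the application asks for a key and gets back a whole value, with no tables and no joins.

```python
import hashlib
import json


class TinyKVCluster:
    """Toy key/value store; each dict in self.nodes stands in for one machine."""

    def __init__(self, node_count=3):
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # The store, not the application, decides where a key lives.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        # Values are opaque blobs to the store; here they are JSON strings.
        self._node_for(key)[key] = json.dumps(value)

    def get(self, key, default=None):
        raw = self._node_for(key).get(key)
        return default if raw is None else json.loads(raw)


if __name__ == "__main__":
    store = TinyKVCluster()
    # Hypothetical profile: the posts are embedded in the value, not joined in.
    store.put("user:alice", {"name": "alice", "posts": ["Hello", "Again"]})
    print(store.get("user:alice"))
```

The trade-off is visible even in this toy: fetching everything about one user is a single lookup, but a question like "which users wrote a post titled Hello?" has no query to lean on and must be handled in application code or with an extra index.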
Of course, if you are coming from an RDBMS background, you will perceive a loss of functionality when moving to such non-relational key/value stores. I can't go into much detail here as this is all still new to me. If you're up for some technical reads:
- NoSQL: A Modest Proposal by Chris Williams
- NoSQL if only it was that easy by BJ Clark
- Notes from a NoSQL Meetup - Yahoo! Developer Network Blog
- My Thoughts on NoSQL by Eric Florenzano
- Social Media Kills the RDBMS by Bradford Stephens
- Slides from a NoSQL meetup
As a clarification, I'm not trying to say that there is a looming war between relational and non-relational databases. There's nothing stopping people from splitting up data in their web application and using both types of data stores where it makes sense. As Brad Anderson of Cloudant (YC S08) says:
NoSQL is about 'right tools for the job' as opposed to anti-relational, or replacing traditional solutions.
Alright, now back to the actual conference: NoSQL East 2009 is being held October 28-30, 2009, at the Georgia Tech Research Institute Conference Center in Midtown Atlanta, GA.
And the list of speakers and the subjects they will be talking about:
- Arin Sarkissian // Digg talking about Cassandra
- Kevin Smith // Hypothetical Labs talking about Redis
- Kevin Weil // Twitter talking about Pig
- Chris Curtin // Silverpop talking about Cascading
- John Hornbeck // Engine Yard talking about MongoDB
- Mike Miller // Cloudant talking about CouchDB
- Cliff Moon // Microsoft / Powerset talking about Dynomite
- Justin Sheehy // Basho talking about Riak
- Mark Gunnels // Catamorphic Labs talking about HBase
- Tim Anglade // af83 talking about tin
- Emil Eifrem // Neo Technology talking about Neo4j
- Geir Magnusson // Gilt Groupe talking about Project Voldemort
- Yuan Yu // Microsoft Research talking about Dryad/DryadLINQ
- John Corwin // Yahoo! talking about Sherpa
You can sign up with a promo code that I will be offering here later today: PROMO CODE Stammy250. Registration appears to be closed at the moment, but I will update this later today when the organizers figure out how to deal with adding people. I'll be sure to write a recap afterward and share what I learned.
Disclosure: I personally know one of the conference organizers and offered to help him spread the word in the interest of promoting the Atlanta tech scene as well as for admission to the conference.
Have you heard of NoSQL before? What kind of web app/database stuff have you dealt with before?