Pinterest

Less than a year ago, I wrote a post on the different DB solutions available that are ‘cloud-ready’. For those who are still struggling with the decision of which database solution to use and want some real-life examples, I recommend reading this recent post on Pinterest’s infrastructure architecture.

Key notes from the article:

Choice of DB: MySQL + Redis over MongoDB or Cassandra
Reason for DB: Sharding over Clustering – because of maturity and ease of hire
Number of DBs: 88 masters + 88 slaves (cc2.8xlarge at $2.70 per hour ≈ $1,944 per month per server, i.e. 176 × $1,944 = $342,144 per month for MySQL alone)

Interesting points:

  • Algorithm for placing data is very simple – the main reason for choosing it. It has a SPOF, but it’s half a page of code rather than a very complicated Cluster Manager. After the first day it will either work or it won’t.
  • Can’t perform most joins.
  • Lost all transaction capabilities. A write to one database may fail when a write to another succeeds.
  • Reports require running queries on all shards and then performing all the aggregation yourself.
  • Joins are performed in the application layer.
  • When the Pin table went to a billion rows the indexes ran out of memory and they were swapping to disk.
  • If your project will have a few TBs of data then you should shard as soon as possible.
  • Architecture is doing the right thing when growth can be handled by adding more of the same stuff. You want to be able to scale by throwing money at the problem – throwing more boxes at it as you need them. If your architecture can do that, then you’re golden.
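The article doesn’t show Pinterest’s “half a page of code” placement scheme, but the idea can be sketched as follows – a minimal illustration, assuming a scheme that packs the shard ID into the top bits of each 64-bit object ID (all names here are hypothetical):

```ruby
# Sketch: ID-based shard placement. The shard ID lives in the top bits
# of a 64-bit object ID, so any ID can be routed to its shard without
# a lookup service or cluster manager.

SHARD_BITS = 16
LOCAL_BITS = 48

def make_id(shard_id, local_id)
  (shard_id << LOCAL_BITS) | local_id
end

def shard_for(id)
  id >> LOCAL_BITS
end

def local_id_for(id)
  id & ((1 << LOCAL_BITS) - 1)
end

pin_id = make_id(42, 1_000_001)
shard_for(pin_id)     # => 42
local_id_for(pin_id)  # => 1000001
```

Because routing is pure arithmetic, the only moving part is the mapping from shard number to physical server – which is exactly why, as the article notes, it either works on day one or it doesn’t.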

My thoughts:

Firstly, thank you Pinterest for showing us your infrastructure architecture. With so many DB choices available (NoSQL vs SQL vs NewSQL), it is very insightful and helpful for others in a position to choose which DB strategy to follow. After analysing the post, I feel that for a small team / startup, and for future-proofing, a fully managed DB is a better option. Small teams don’t usually have the manpower to maintain and branch out new MySQL shards. What they should do instead is concentrate on delivering quality software and, if traffic increases, have the flexibility to push a button and let a DaaS take care of the scaling, à la DynamoDB, Instaclustr or MongoHQ.

Their decision to shard MySQL meant that:

a) they can horizontally scale
b) they can run flexible (but limited) queries

however, sharding MySQL had drawbacks too:

a) you can no longer have JOINs – one of the main strong points of having SQL
b) no longer fully transactional – again, another strong point for choosing SQL over NoSQL
c) operational burden for manual sharding – your team needs to be 100% on top of data size, as well as knowing when the need to re-shard arises
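On drawback (a): losing JOINs means stitching rows together in application code. A minimal sketch of an application-layer join (illustrative data and names, not Pinterest’s actual code):

```ruby
# Application-layer join: fetch rows from two "shards" (plain Ruby
# structures standing in for per-shard query results) and merge them
# by key in code, since the database can no longer do it for us.

users = { 1 => "alice", 2 => "bob" }         # rows from the user shard
pins  = [ { user_id: 1, title: "Recipe" },
          { user_id: 2, title: "Bike" },
          { user_id: 1, title: "Garden" } ]  # rows from a pin shard

# The JOIN the database used to do:
joined = pins.map { |pin| pin.merge(user: users[pin[:user_id]]) }

joined.first  # => {:user_id=>1, :title=>"Recipe", :user=>"alice"}
```

Note that every such join is an extra round trip plus merge logic the team must write, test and maintain – which is precisely the operational cost of drawback (c).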

It does feel to me that there is quite a lot to work with (though they have the manpower to do so: 44 engineers). But then again, they’ve done it and proved that it works – so surely it can be considered a winning strategy?

Thoughts?

@munwaikong

Infrastructure

ShowCaster’s success is down to its ability to scale up and down to meet the varying demands of live events – during short periods (typically around an hour) we can receive bursts of hundreds of thousands of users interacting with our platform concurrently (e.g. when promoted on a mainstream TV channel).

We’ve just completed a project rebuilding our auto-scaling infrastructure – two weekends spent creating ShowCaster Tools. ShowCaster Tools keeps track of the health and usage of our servers and handles demand-based auto-scaling, all wrapped in a simple GUI.
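The post doesn’t show ShowCaster Tools’ internals, but as a rough illustration, a demand-based scaling decision can be as simple as comparing average utilisation against two thresholds (the thresholds and names below are hypothetical, not our production values):

```ruby
# Hypothetical threshold-based auto-scaling decision: scale out when
# average utilisation across the fleet is high, scale in when it is
# low, otherwise hold steady.

SCALE_OUT_ABOVE = 0.75  # average CPU / connection utilisation
SCALE_IN_BELOW  = 0.25

def scaling_decision(utilisations)
  avg = utilisations.sum / utilisations.length.to_f
  if avg > SCALE_OUT_ABOVE
    :scale_out   # burst in progress: add servers
  elsif avg < SCALE_IN_BELOW
    :scale_in    # quiet period: remove servers to save money
  else
    :hold
  end
end

scaling_decision([0.9, 0.8, 0.95])  # => :scale_out
scaling_decision([0.1, 0.2, 0.15])  # => :scale_in
```

Real systems also need cooldown periods and minimum fleet sizes so a single noisy sample doesn’t thrash the fleet, but the core decision loop is this simple.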

Let’s get technical

In order to automate this process, we needed to integrate our system with Rackspace Cloud APIs. The brief was to create a simple web interface for anyone in the team to use (both techies and non-techies). Seeing as we (the techies) are always up for challenges and exploring the world of languages & tech, we thought that this would be a great opportunity for us to try a different tech stack from our usual Java.

Ruby as a replacement for our Perl scripts

We currently write all our scripts in Perl. So far it has done the job without issues. However, everyone who has used Perl knows that it’s not the easiest language to learn. Every time a new developer joins our team, it takes a while for them to fully understand the language and how to get the most out of it.

Ruby on Rails as the MVC framework for the front-end

While we were learning Ruby for the server-side interactions, we thought that the best tool for creating the web GUI was Ruby on Rails, one of the most famous MVC frameworks out there for web development. We had always read that the different helpers offered by Rails, together with Ruby, increase the productivity of any web development team, so we decided it was the right time to give it a try.

The architecture

One of the first things we realized when we started to architect the application was that Ruby on Rails has no direct support for multithreading. There are some “hacks”, but none of them appealed to us. For this reason, we ended up splitting our application into two main components. The first was implemented entirely in Ruby and is in charge of all the business logic of the application – things like:

  • Multithreading
  • Rackspace APIs interactions
  • Server monitoring and management
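As a rough sketch of that multithreaded monitoring component (hypothetical structure, not our actual code), plain Ruby threads can poll a set of servers concurrently:

```ruby
# Sketch: poll several servers concurrently with plain Ruby threads
# and collect their health status. check_health is a stand-in for a
# real call to a server's monitoring endpoint or the Rackspace API.

def check_health(server)
  # In reality this would make a network request; here we simulate
  # a healthy response so the sketch is self-contained.
  { server: server, healthy: true }
end

servers = %w[web-1 web-2 web-3]

threads = servers.map do |server|
  Thread.new { check_health(server) }
end

# Thread#value joins the thread and returns its block's result.
results = threads.map(&:value)

results.all? { |r| r[:healthy] }  # => true
```

Keeping this polling loop out of the Rails process is exactly what the two-component split buys us: the Ruby daemon owns the threads, and Rails only reads the results.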

The other component was built with Ruby on Rails and looks after the user interface. This provides two main benefits:

  • Split the logic from the view (MVC)
  • If we decide that Ruby on Rails is not for us, we can easily bin it and start from scratch with another framework

The conclusion: Ruby yes, Rails not so much

After creating ShowCaster Tools, Ruby has impressed us – especially its SPEED. Other reasons:

a) The number of lines of Ruby code is significantly lower – meaning a faster development cycle, fewer bugs and easier maintenance, resulting in increased productivity.

b) We liked the simplicity of integrating Ruby with other libraries – thanks to its wealth of valuable gems. We installed the Rackspace Cloud gems for their load balancers and servers and within 5 minutes we were ready to make requests to their APIs.
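On point (a), a small illustrative example of the kind of task our admin scripts do daily, in idiomatic Ruby (the data here is made up purely for illustration):

```ruby
# Illustrative conciseness: group log lines by status code and count
# them -- a typical admin-script task, done in two chained calls.

log_lines = [
  "GET /home 200", "GET /pins 200", "POST /pin 500", "GET /home 404"
]

counts = log_lines
  .group_by { |line| line.split.last }  # bucket lines by status code
  .transform_values(&:count)            # keep only the counts

counts  # => {"200"=>2, "500"=>1, "404"=>1}
```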

Ruby on Rails provided a good implementation of the MVC pattern, awesome helpers to speed up the entire development, and we easily understood how using it would improve long term productivity.

Working with Rails’ ActiveRecord was a great experience compared to similar frameworks – we’re very familiar with Hibernate for Java, and ActiveRecord was far simpler and faster than Hibernate in terms of data definition and usage.

To take full advantage of Rails, you really need to make complete use of all its helpers and tools – but in most of our user interfaces these helpers and tools weren’t applicable, so we lost some of the potential of Rails.

Based on this experience, we’ll use Ruby to replace all our Perl scripts over time, providing us with benefits in the long term. Rails won’t replace the web development tools used on our core products, but it will definitely be considered for future projects where the potential of Rails can be used to its maximum.

@isaacmartinj

Development, Infrastructure, ShowCaster

In today’s world of software engineering, traditional relational databases (RDBMS) like MySQL or PostgreSQL are no longer the ‘de facto’ choice for a database system. Since the rise in popularity of cloud computing, NoSQL databases have come to play an important part in cloud data architecture. Cloud servers no longer guarantee dedicated performance (disk I/O / CPU / memory) or 99.99% uptime. The best approach to designing software in the cloud is to design the application to expect failures. This proves to be an issue, as databases are usually the living heart and soul of any data-driven application. Without data, the application is crippled.

Because cloud computing provides the ability to scale server resources up and down easily and quickly, the database design also needs to respond to changes in traffic load and scale accordingly. Most databases have replication features, with which you can set up a master-slave network of databases to help ease the load of a read-heavy application. But what about a write-heavy application – would you need to consider sharding?

The key point to take away from this post is that although there are many databases to choose from (MySQL, PostgreSQL, Redis, Riak, MongoDB, CouchDB, HBase, Cassandra, Neo4j, etc.), there is no such thing as the ‘best database for the cloud’. To choose the ‘best’ database, you must first identify the needs of your application. If you are familiar with the CAP theorem (Consistency, Availability, Partition tolerance), different databases are designed for different combinations of CAP. Although there are many blog posts on the interweb comparing the different databases, you should not base your choice solely on these results. For example, just because big corporations like Twitter and Facebook use HBase for some of their products doesn’t mean you should use HBase in your design. Perhaps a less complex setup is key, and a database like CouchDB is more suitable. Below are a few of the key questions you should try to answer. They should help you narrow down the choices so that you can focus on the details of the databases and ultimately choose the ‘best’ one for your application.

Is your application read-heavy or write-heavy? Or both?

  • Read-Heavy – [DBs with a replication feature, i.e. almost all] most databases provide master-slave (or even master-master) replication. Replication will handle the load of read-heavy applications.
  • Write-Heavy – [DBs with a sharding feature, i.e. MongoDB, HBase, Cassandra, Riak] whilst you can have a master-slave setup, all writes (i.e. inserts / updates / deletes) are directed to the master. To take some of the load off the master, you need sharding.
  • Both – [DBs with both replication and sharding features, i.e. MongoDB, HBase, Cassandra, Riak]
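The read/write split above boils down to a routing rule: writes go to the master, reads are spread over the replicas. A minimal sketch, not tied to any particular database (all names are illustrative):

```ruby
# Sketch: route writes to the master, round-robin reads across slaves.

class ReplicatedPool
  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # Writes must hit the master; reads rotate through the slaves.
  def node_for(operation)
    return @master if %i[insert update delete].include?(operation)

    slave = @slaves[@next_slave % @slaves.length]
    @next_slave += 1
    slave
  end
end

pool = ReplicatedPool.new("db-master", %w[db-slave-1 db-slave-2])
pool.node_for(:insert)  # => "db-master"
pool.node_for(:select)  # => "db-slave-1"
pool.node_for(:select)  # => "db-slave-2"
```

Sharding changes the `node_for(:insert)` branch: instead of one master, the write is routed to one of many masters based on the row’s key – which is where the extra operational complexity comes in.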

Do you have an ops team to help with the complex setup / management of db clusters?

  • Databases like MySQL, CouchDB are very easy to get started with on a single server. They provide easy to use GUI / admin tools that you can experiment around with. Others like HBase, Cassandra and MongoDB will require more planning and architecture design to get an optimized setup.

Does your data need guaranteed durability?

  • Databases like MongoDB and Redis are known for their blazing speed because they first store values in memory, which is then flushed to disk periodically. However, as a trade-off for that speed, there is a risk of data not being persisted in the event of a DB failure.
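That memory-first trade-off can be shown with a toy store that only persists on an explicit flush; anything written since the last flush is lost on a simulated crash (purely illustrative – this is not how Redis or MongoDB are actually implemented):

```ruby
# Toy illustration of the memory-first durability trade-off: writes
# land in memory and only reach "disk" on flush, so a crash loses
# whatever hasn't been flushed yet.

class MemoryFirstStore
  def initialize
    @memory = {}
    @disk = {}
  end

  def write(key, value)
    @memory[key] = value   # fast: memory only
  end

  def flush!
    @disk.merge!(@memory)  # periodic persistence to "disk"
  end

  def crash!
    @memory = {}           # unflushed writes are gone
  end

  def read(key)
    @memory.fetch(key) { @disk[key] }
  end
end

store = MemoryFirstStore.new
store.write(:a, 1)
store.flush!
store.write(:b, 2)
store.crash!
store.read(:a)  # => 1   (survived: it was flushed)
store.read(:b)  # => nil (lost: never flushed)
```

The window between flushes is exactly the data you risk losing, which is why write-ahead logging and journaling options exist to narrow it.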

How big is your data?

  • Databases like Cassandra and HBase are designed for ‘Big Data’ from the ground up. However, the ability to handle huge data sets comes at a cost: complexity.

What is your primary goal and what does your application dataset resemble?

  • Are you building a write-log type system, a read-cache reference type system, or a write-analyse analytics type system? Does your application naturally fall under a key-value (Redis, Riak), document-oriented (MongoDB, CouchDB), relational (MySQL, PostgreSQL), columnar (Cassandra, HBase) or graph (Neo4j) data model?

Do you need features like map-reduce / secondary indices / REST interface / views or stored procedures?

  • Some databases provide features that others don’t. For example, if you need a feature like secondary indices, you would choose MongoDB over CouchDB.

There are many other questions you should ask yourself, but the bottom line is: there is no ‘right or wrong’ database. Instead there are ‘suitable and less suitable’ ones. In fact, you don’t even have to choose just one – a combination of multiple databases could well yield the best result.

@munwaikong

Infrastructure

Preface – This is the first post of our new engineering blog so drop us a comment and tell us how we’re doing and what you think.

Let’s begin.

Here at Orca Digital, we love tech. From software to gadgets and designs – we love it all. Technology moves fast in this day and age, but the sad news is: loving and keeping up with new tech is not always as easy as it sounds. Often it takes a lot of time, energy (and money) to acquire those new features that your friends (or competitors – same thing really) already have. This is especially true when you have to migrate your existing infrastructure to the cloud. Although cloud computing technology has proven to be a real success story for some companies, it also has some drawbacks. This is the story of Orca Digital’s journey to the Cloud.

Part 1: Understanding the way the Cloud moves

We needed to understand the ins and outs of the Cloud before we embarked on our journey. Knowing the benefits of a cloud IaaS (Infrastructure as a Service) is one thing, but knowing how the Cloud differs from a traditional hosted infrastructure would help us decide whether it makes for a suitable solution. Understanding the differences would also give us the ability to tune the Cloud to fit our specific needs (and to squeeze as much out of it as possible). For example, a physical server with a quad-core processor will not deliver the same resources or processing behaviour as a virtual machine that has been allocated a quad core’s worth of processing power. There are many Cloud providers out there, and each provider’s infrastructure differs slightly from the others’. After some basic research, we decided to concentrate on two specific providers: Amazon AWS’s EC2 and Rackspace Cloud.

Part 2: Choosing the type of Cloud (providers)

Comparing Cloud providers can be quite challenging. There are many cloud providers to choose from – Amazon AWS, Rackspace Cloud, Google App Engine, Heroku, and many more. We narrowed our choices down to Amazon AWS and Rackspace Cloud because of (a) Amazon’s popularity, recommendations and leadership in the industry, and (b) the fact that we were already Rackspace customers. On top of that, they both had competitive pricing and data centres in the EU as well as the US (which is key, since we are based in the UK). Even though Amazon has the advantage of having been around longer and having a substantial number of big and small clients, Rackspace had a managed service level which, for a relatively small monthly price, provides support for your virtual machines – should anything go wrong at any time.

If we were to judge the two IaaS providers just by reading tech docs or opinions on forums and blogs, it would have been a very difficult choice to make. We therefore thought that the best way to compare the two was to put them both to the test. We signed up for accounts with both and began to install a few of our web apps onto them. During this process, we made sure we took note of any noticeable differences between the two solutions – from the time taken to complete certain tasks, to the discovery of implementation hurdles, and even frustrations and/or praise from our migration engineers. After a couple of days of hacking around, we had the answer (or at least a pretty good idea of the two).

The research showed that Amazon had the upper hand. AWS’s EC2 infrastructure provided many features and functionalities that were not available on the Rackspace Cloud infrastructure at the time. Features like Elastic IP addresses and flexible billing were missing from the Rackspace Cloud – though they promised these were on the roadmap. Despite this, we decided to go with Rackspace. (“wait, what? huh???” Now now, calm down – there are reasons for this.) To begin with, we were already a Rackspace client through our dedicated hosted solution. Since the Rackers were already familiar with our infrastructure and solution, their input and advice on the migration plans were very valuable to us. With their help and expertise, we were able to come up with a solution that was tailor-made for our platform and suited to the Cloud. Another major factor was support. Whilst Amazon EC2 is all about self-serve and APIs and tech docs and what not, Rackspace has a support level (what they call a managed support service). Since the engineering mob @ Orca Digital is currently a team of 8, being able to rely on a support team 24/7 to help manage your servers is great. This meant we could worry less about infrastructure and concentrate on making good software. The last and most important reason for choosing Rackspace was that we had dedicated servers that couldn’t be moved to the cloud. These applications rely on low-latency internal network communications, which meant that the cloud servers running the web apps would need to be on the same network as the dedicated physical machines (or physically close to them).
Rackspace were able to provide us with a solution that ‘connected’ our dedicated machines (on our internal private network) with the cloud applications (on the Rackspace Cloud internal private network) via RackConnect – meaning we could achieve a hybrid solution that provides scalability whilst meeting the needs of our low-latency, internal-network dedicated machines / applications.

Part 3: Taking the first leap

As you can see, comparing cloud providers needed more than a simple checklist comparison. You need to understand what you need from your applications / platform and what you can get out of each Cloud provider. Only then will you be able to choose the right provider for you. Now that we had established which provider was suitable for us, it was time to begin the migration process. But before we got ahead of ourselves, there was one final test to perform before taking the plunge.

Migrate your test environment (or build one in the cloud if you don’t have one)

The best way to ensure that your migration will go smoothly is to do a dry run. Yes – your devs and engineers are going to moan when certain things aren’t working or running like they used to, but what’s worse: that, or your entire production platform going haywire and your boss / investors (or even worse, your end users and customers) shouting at you? We certainly didn’t want the latter, so we decided to migrate our test environment over first and run it for one month. Our test environment was configured to be an exact replica of our production services – with the exception of a test / sandboxed database. That way, we had one month to iron out all the creases we found with our platform running on the Cloud.

Part 4: Cloudify

After our successful test run, we set a date in the calendar, co-ordinated with our Rackers, and went for it. I will not go through the entire migration process step by step as it is too long and boring, but to summarise: the entire migration took 2 weeks – which, considering the amount of things we had to migrate and test, was quite an achievement. We are now running the latest versions of everything (Linux, Apache, PHP, Java, MySQL, clusters, etc.) and enjoying the benefits of running a hybrid platform on the Cloud and on our dedicated hosted servers.

@munwaikong

Infrastructure