Spicule - Data Processing Experts

The importance of choosing the right tools for the job

If you’ve been following our most recent blogs, you’ll know we’ve spent the past couple of seasons creating the ANSSR platform in collaboration with Canonical. It’s a project we’re extremely excited about because it’s going to revolutionise the way businesses apply complex data-based solutions. Before ANSSR, many companies didn’t have the expertise or budget to mount the database that was right for them. Now, thanks to our new ANSSR platform, they will have access to all the data processing services they need without the hassle or expense of having to install and configure them themselves. However, there is still one important issue to address before you can use ANSSR to spin up your database of choice… you need to pick the right database for your job.

On the surface, how hard could that be? Surely all databases are databases and all servers are servers, right? Well – no. They might all have certain similarities to one another but the jobs they do are, critically, very different. So how do you choose the correct database for your project?

To begin with, here are a few key features you should always consider when selecting your database:

  • What are you storing?
  • What is interfacing with the database server?
  • Are aspects like scale-out and redundancy important to you?
  • What budget do you have available?

Budget can make or break a deployment, but not in the way you might think. For example, some databases are commercial whereas a lot are open source (or open core, if that’s the term you prefer). Ask yourself the question – do you need a commercial database? If so, what budget do you have for it? If performance is your key metric, a database like Exasol might fit your requirements very nicely because it tops most of the performance benchmarks, but it isn’t free and it’s not especially cheap.

Even if you use an open source database, you’ll still need the budget to get it deployed and configured correctly. It’s also important that whoever does this work understands database design well enough to build you an effective schema, so that you can write streamlined queries and access your data as quickly as possible. And while we’re on the subject of data…

…what data are you storing? These days there are many different database types, from traditional databases like MySQL, which store tabular data, to specialist databases built for JSON objects, algebraic data or textual data. And that’s only naming a few.
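To make the difference concrete, here is a minimal sketch of the same record stored two ways: split across tables, and kept whole as a JSON document. It uses Python's built-in sqlite3 module purely as a stand-in, and the customer record, table names and column names are all made up for illustration.

```python
import json
import sqlite3

# Hypothetical customer record, used only for illustration.
record = {"id": 1, "name": "Ada", "tags": ["vip", "newsletter"]}

conn = sqlite3.connect(":memory:")

# Tabular storage: fixed columns, so the nested "tags" list needs its own table.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE customer_tags (customer_id INTEGER, tag TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?)", (record["id"], record["name"]))
conn.executemany(
    "INSERT INTO customer_tags VALUES (?, ?)",
    [(record["id"], tag) for tag in record["tags"]],
)

# Document-style storage: the whole object is kept as a single JSON value.
conn.execute("CREATE TABLE customer_docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO customer_docs VALUES (?, ?)",
    (record["id"], json.dumps(record)),
)

# Reading the tags back takes a query on a second table in the tabular layout...
tabular_tags = [
    row[0]
    for row in conn.execute(
        "SELECT tag FROM customer_tags WHERE customer_id = ? ORDER BY tag", (1,)
    )
]

# ...whereas the document layout returns the whole object in one read.
doc = json.loads(
    conn.execute("SELECT body FROM customer_docs WHERE id = ?", (1,)).fetchone()[0]
)
```

Neither layout is better in general. The tabular form is easier to query and constrain with SQL, while the document form is easier to change as the shape of your data changes, which is why the kind of data you are storing should drive the choice.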

Also, what are you using to interrogate your database? It’s vital to put a lot of thought into how the data extraction and reporting elements of your new system are going to work because if your interrogation tool doesn’t interface with your database then you’ve already broken down at the starting line. If you are planning on using more traditional reporting tools, something with a JDBC/ODBC compliant SQL interface is probably required. However, if you are going to analyse data by writing your own front-end you might want to choose something a little different. One thing’s for certain, though – you don’t want to input your data into your database of choice only to discover that you can’t plug your reporting tool in!
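As a rough illustration of why a standard interface matters, the sketch below writes a tiny reporting function against Python's generic database API (DB-API 2.0), which plays a similar role for Python drivers to the one JDBC and ODBC play for reporting tools. The run_report function and the sales table are hypothetical, and sqlite3 is only standing in for whichever database you choose.

```python
import sqlite3


def run_report(connection, sql, params=()):
    """Run a query through the standard cursor interface and return rows as dicts."""
    cursor = connection.cursor()
    cursor.execute(sql, params)
    # cursor.description holds one entry per result column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in database with a made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 5), ("north", 7)],
)

report = run_report(
    conn,
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region",
)
```

Because run_report only touches the common cursor methods, it should work with other drivers that follow the same specification, although SQL dialects and parameter placeholder styles do vary between them. A database with no such interface would need its own bespoke reporting code.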

Finally, if you need to process data at scale, what does that scale actually look like? Is it two servers, ten servers or 100 servers, and how will that change over the next two to five years? And what happens if one of those servers fails: will the platform keep working, or will the failure take it down? If you’re using more than one server, scale and redundancy are incredibly important.
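A back-of-the-envelope calculation can help frame those questions. The sketch below is a deliberately simple model: it assumes data grows by a fixed percentage each year, every server holds the same amount, and redundancy means keeping several full copies of the data. All of the figures and function names are hypothetical, and real sizing depends heavily on the database and workload.

```python
import math


def projected_data_tb(current_tb, annual_growth, years):
    """Data volume after `years` of compound growth (0.5 means 50% per year)."""
    return current_tb * (1 + annual_growth) ** years


def servers_needed(data_tb, tb_per_server, replicas, spare_servers=0):
    """Servers needed to hold `replicas` copies of the data, plus any spares."""
    storage_servers = math.ceil(data_tb * replicas / tb_per_server)
    return storage_servers + spare_servers


# Hypothetical starting point: 4 TB today, growing 50% a year,
# 2 TB usable per server, 3 copies of the data, 1 spare server.
for years in (0, 2, 5):
    data_tb = projected_data_tb(4, 0.5, years)
    count = servers_needed(data_tb, tb_per_server=2, replicas=3, spare_servers=1)
    print(f"year {years}: {data_tb:.1f} TB -> {count} servers")
```

Under these made-up numbers the cluster grows from 7 servers today to 15 in two years and 47 in five, which shows how quickly a modest-looking growth rate changes the kind of database, and budget, you need.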

Those are some of the more obvious points to consider when choosing a database, but there are a lot of other factors involved too. It might seem like a lot of homework to do, but it’s worth taking your time and experimenting at this stage, because making the wrong decision and swapping data storage engines further down the road can cost you time and money, and give you a pretty nasty headache.

At Spicule, we have a huge amount of experience in deploying databases at scale and configuring them for optimum performance so, if you need any assistance in selecting the correct database for your project or implementing its tooling, don’t hesitate to get in touch. Give us a call on 01603 327762 or email info@spicule.co.uk. We’re always here to help!