In the Developing your information management solution architecture article, I described how you produce a systems architecture containing generic components, with interface technologies to join the components together. In the technology architecture section, I described how you then need to select the hardware and software that meets your needs. But how do you actually come up with candidates to choose from, when a simple Google search returns a blizzard of potential technologies?
The easy answer is to rely on a couple of companies that spend their time identifying new hardware and software, working out its purpose and determining its relative merits. The two companies are Gartner Research and Forrester Research. Of the two, I prefer Gartner’s output, as they provide a nice diagram which they call a “Magic Quadrant” for each technology area. So let’s take a look at the ones that relate to database management systems and BI & Analytics:
Gartner’s Magic Quadrant plots vendors on two axes: ability to execute (how well the vendor delivers and supports its product today) and completeness of vision (how well the vendor understands where the market is heading and plans its features accordingly). Those in the top right-hand corner are the current leaders, and those in the bottom right are the “visionaries”, often newcomers who are starting to attract attention. What you will notice is that Gartner doesn’t cover the Apache open source projects directly, and there is just a smattering of cloud-based and NoSQL data management solutions. The main reason for this is that Gartner is geared up to provide research on chargeable, enterprise solutions, whereas a lot of the momentum in Big Data has come from dotcom startups who started out using free, open source software and have since grown in size. The 2014 quadrant contained far fewer Big Data-related solutions, so progress is being made: Hadoop distribution providers such as MapR and Cloudera now appear in the quadrant at last. Hortonworks is still missing, however.
For a fuller candidate list that includes Big Data and NoSQL data stores, db-engines.com provides monthly popularity rankings and describes the general features of each database system. Generally, the more popular a database system is, the easier it will be to integrate with, simply because other tool vendors take note of which systems are popular and ensure that they can work with them. A more detailed focus on NoSQL databases is provided at http://nosql-database.org/
In the BI and Analytics Magic Quadrant, it’s better news for recent challenger data discovery tools such as Tableau & Qlik, which have gained market share by being easier to engage with than the incumbent business intelligence tools from Microsoft (SSRS/SSAS), Oracle (OBIEE), IBM (Cognos), MicroStrategy and SAP (Business Objects & Crystal Reports). Pentaho has improved its completeness of vision since the 2014 version. SAS remains the tool of choice, however, if you want to do heavy statistical analysis and present it easily in graphical form. Again, languages and tools popular in data analytics, such as the open source R and Scala and the proprietary MATLAB, are not examined by Gartner.
Looking at Gartner, Forrester and db-engines.com provides a candidate list, but you then need to go through each candidate on a feature-by-feature basis to see whether it meets your functional requirements. You should also consider the cost that a best-of-breed solution incurs when you may only need a small percentage of its features.
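The feature-by-feature check can be sketched in a few lines of Python. This is only an illustration: the tool names, feature names and the idea of measuring the "fraction of features you would actually use" are all assumptions made for the example, not data from Gartner, Forrester or db-engines.com.

```python
# Hypothetical requirements and candidate feature lists, for illustration only.
REQUIRED = {"sql_interface", "row_level_security", "scheduled_reports"}

CANDIDATES = {
    "Tool A": {"sql_interface", "row_level_security", "scheduled_reports",
               "geo_maps", "nlp_query", "mobile_app"},
    "Tool B": {"sql_interface", "scheduled_reports"},
    "Tool C": {"sql_interface", "row_level_security", "scheduled_reports",
               "geo_maps"},
}


def assess(required, offered):
    """Compare one candidate's feature set against the required features."""
    missing = required - offered
    # Share of the tool's features that you actually need: a low value
    # suggests you would be paying for a lot of unused functionality.
    used_fraction = len(required & offered) / len(offered) if offered else 0.0
    return {
        "meets_requirements": not missing,
        "missing": sorted(missing),
        "used_fraction": used_fraction,
    }


def shortlist(required, candidates):
    """Assess every candidate and return the results keyed by tool name."""
    return {name: assess(required, feats) for name, feats in candidates.items()}


if __name__ == "__main__":
    for name, result in shortlist(REQUIRED, CANDIDATES).items():
        print(name, result)
```

In this made-up data, Tool A meets every requirement but you would use only half of what it offers, whereas Tool C meets them with less surplus. That is the kind of trade-off to weigh against licence cost.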
Once you’ve selected a candidate list, you should then work out the weightings that you wish to apply to different features and functionality that you need in your solution and then run a proof of concept to ensure that the chosen tool actually does what you think it does (typically there are a lot of caveats on functionality embedded in the small print which cause headaches).
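As a rough sketch of how the weightings might be applied, the Python below computes a weighted average score per candidate and ranks the results. The criteria, weights, tool names and 1 to 5 scores are all invented for illustration; in practice you would set the weights from your own requirements and refine the scores after the proof of concept.

```python
# Hypothetical criteria weights (relative importance) and 1-5 scores.
WEIGHTS = {"ease_of_use": 3, "integration": 3, "statistics": 2, "cost": 2}

SCORES = {
    "Tool A": {"ease_of_use": 4, "integration": 3, "statistics": 2, "cost": 5},
    "Tool B": {"ease_of_use": 2, "integration": 5, "statistics": 5, "cost": 1},
    "Tool C": {"ease_of_use": 5, "integration": 4, "statistics": 3, "cost": 3},
}


def weighted_score(weights, scores):
    """Weighted average of a candidate's scores.

    Raises KeyError if the candidate has no score for a weighted criterion,
    which flags an incomplete assessment rather than silently ignoring it.
    """
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def rank(weights, candidates):
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(weights, s)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank(WEIGHTS, SCORES):
        print(f"{name}: {score:.2f}")
```

A ranking like this only narrows the field. The proof of concept is still what confirms that the top-scoring tool behaves as its feature list claims.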