The Best Big Data Tools

WHAT IS BIG DATA?

Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate.  Big data analytics is the process of examining these large data sets, which contain a variety of data types, to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information.  The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits.  There are many different big data tools that you can use, and one way of learning more about big data is by attending local summits.

BOSTON BIG DATA SUMMIT

The Big Data Innovation Summit takes place in Boston on September 8-9 at the Seaport World Trade Center.  With 80+ industry speakers and 800+ attendees, Big Data Innovation is the largest gathering of Fortune 500 business executives leading Big Data initiatives.

BIG DATA TOOLS

The tools covered here are Jaspersoft, Pentaho, Karmasphere, Talend, Skytree, Tableau, and Splunk.

The Jaspersoft package is one of the open source leaders for producing reports from database columns. The software is already in many businesses turning SQL tables into PDFs that everyone can scrutinize at meetings.

Pentaho is another software platform that began as a report-generating engine. Like Jaspersoft, it is branching into big data by making it easier to absorb information from new sources. You can hook up Pentaho’s tool to many of the most popular NoSQL databases, such as MongoDB and Cassandra. Once the databases are connected, you can drag and drop the columns into views and reports as if the information came from SQL databases.

Many of the big data tools did not begin life as reporting tools. Karmasphere Studio, for instance, is a set of plug-ins built on top of Eclipse. It’s an IDE that makes it easier to create and run Hadoop jobs.

Talend Studio allows you to build up your jobs by dragging and dropping little icons onto a canvas. If you want to get an RSS feed, Talend’s component will fetch the RSS and add proxying if necessary. There are dozens of components for gathering information and dozens more for doing things like a “fuzzy match.” Then you can output the results.
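To make the idea of a "fuzzy match" concrete, here is a minimal sketch in Python. It is not Talend's component or its algorithm; it uses the standard library's difflib as a stand-in, and the function name `fuzzy_match`, the sample tool list, and the 0.75 cutoff are assumptions chosen for illustration.

```python
import difflib

def fuzzy_match(value, candidates, cutoff=0.75):
    """Return the candidate most similar to value, or None if none is close enough."""
    # Compare in lowercase so that case differences do not count as mismatches.
    lowered = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        value.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None

# Hypothetical reference list and some misspelled inputs.
tools = ["Jaspersoft", "Pentaho", "Talend", "Tableau", "Splunk"]
for raw in ["Tablaeu", "pentah0", "Oracle"]:
    print(raw, "->", fuzzy_match(raw, tools))
```

Misspelled names resolve to the closest entry in the reference list, while a value with no similar entry comes back as None and can be routed to manual review.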

Not all of the tools will make it easier to string together code with visual mechanisms. Skytree offers a bundle that performs many of the more sophisticated machine-learning algorithms. All it takes is typing the right command into a command line.

Tableau Desktop is a visualization tool that makes it easy to look at your data in new ways.  You can slice the data up and look at it from a different angle, or mix it with other data and examine it in yet another light.  The tool gives you all the columns for the data and lets you mix them before stuffing them into one of the graphical templates.

Splunk is a bit different from the other options. It’s not exactly a report-generating tool or a collection of AI routines, although it accomplishes much of that along the way.  It creates an index of your data as if your data were a book or a block of text. Databases also build indices, but Splunk’s approach is much closer to a text search process.
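The book-index analogy can be sketched as a small inverted index in Python. This is only an illustration of the general text-search technique, not Splunk's actual implementation; the function names `build_index` and `search` and the sample log lines are made up for the example.

```python
import re
from collections import defaultdict

def build_index(lines):
    """Map each lowercase token to the set of line numbers that contain it."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in re.findall(r"[a-z0-9]+", line.lower()):
            index[token].add(line_no)
    return index

def search(index, query):
    """Return the sorted line numbers that contain every token in the query."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return []
    # Intersect the posting sets; a token that was never indexed yields an empty set.
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)

# Hypothetical log lines standing in for machine data.
log_lines = [
    "ERROR disk full on host alpha",
    "INFO backup finished on host beta",
    "ERROR timeout on host beta",
]
index = build_index(log_lines)
print(search(index, "error host"))
print(search(index, "beta"))
```

Like the index at the back of a book, a lookup goes from a word straight to the places it appears, so a query never has to scan every line.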

 
