Big Data: Tools and a Test
Big data has promise, but big data discussions are fraught with opinions. In the end, all things that are hyped have to deliver something useful, economical, and productive.
September 12, 2014
I was searching through LinkedIn when I rediscovered Sandy Borthick, who was part of the staff at Business Communications Review (BCR), working with Eric Krapf and Fred Knight. She is now with Stratecast, a unit of Frost & Sullivan, helping to build one of its newest practices, covering Big Data & Analytics. This blog is the result of a recent interview with Sandy on big data.
Sandy, would you define big data and analytics?
"Stratecast defines Big Data as the database hardware and software designed to handle data sets that are too large and complex, or are growing and changing too quickly for traditional databases and applications. Analytics refers both to general purpose statistical software and to special purpose software applications for more specific analytic functions within the organization."
Would you discuss some tools for analyzing big data?
"Stratecast/Frost & Sullivan has identified close to 400 providers of Big Data and analytic tools and technologies. We group them into 18 types, and those into 5 broad categories: Big Data & Analytics Core Products and Services; Business Process and Strategic Analytics; Customer Experience, Marketing, and Sales Analytics; Mobile, Retail, and Location analytics; Social and Site Analytics."
How pervasive is the use of these tools?
"If there were any doubts remaining about the growing importance of Big Data & Analytics (BDA), they will be dispelled by the results of Stratecast/Frost & Sullivan's recent market survey. More than half (58%) of the 402 respondents have already implemented one or more BDA technologies and another 29% are planning them.
"Stratecast expects Big Data and analytic technology advancements to be as broadly disruptive and transformational over the next 10 years as were the personal computer in the 1980s, the Internet in the 1990s, and the social Web in the last 10 to 15 years. Nearly all (96%) of the survey respondents agree, and many of them predict a much shorter transitional timeframe."
When big data analysis is applied to B2C, has anyone used past data to predict the present? If so, were the predictions accurate?
"You can find case studies on many of the vendors' websites that attest to the value of the Big Data & Analytic (BDA) solutions that their CPG and other B2C customers have deployed. If they have increased revenue or decreased costs by applying the insights they gained from these solutions, does that make their BDA predictions 'accurate' or 'just' useful?"
Sandy pointed out an article about Google Flu Trends (GFT) and its inability to relate Google flu searches to actual flu numbers. The big data analysis overpredicted the number of flu cases in 100 of the 108 weeks analyzed. An article in Science magazine, "The Parable of Google Flu: Traps in Big Data Analysis," compared GFT's results to real-world figures and found that GFT did not work well. The lesson is that collecting a huge amount of data will not necessarily produce reliable, usable results.
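For readers who want to run the same kind of check on their own forecasts, here is a minimal sketch in Python. It is not the method used in the Science article. It simply counts the weeks in which a predicted value exceeded the observed value and reports the average error. The function name, the `BiasReport` type, and the weekly figures are all hypothetical, invented for illustration. They are not GFT or CDC data.

```python
from dataclasses import dataclass


@dataclass
class BiasReport:
    weeks: int                # number of weeks compared
    overpredicted_weeks: int  # weeks where predicted > observed
    mean_error: float         # average of (predicted - observed)


def summarize_bias(predicted, observed):
    """Count how often a model's estimates exceed the observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and the same length")
    errors = [p - o for p, o in zip(predicted, observed)]
    return BiasReport(
        weeks=len(errors),
        overpredicted_weeks=sum(1 for e in errors if e > 0),
        mean_error=sum(errors) / len(errors),
    )


# Hypothetical weekly figures, invented for illustration only
# (not GFT or CDC data).
predicted = [3.0, 4.5, 6.0, 5.0, 2.5]
observed = [2.0, 4.0, 6.5, 3.0, 2.0]

report = summarize_bias(predicted, observed)
print(f"{report.overpredicted_weeks} of {report.weeks} weeks overpredicted")
print(f"mean error: {report.mean_error:+.2f}")
```

A model whose errors fall almost entirely on one side, as in 100 of 108 weeks, is showing a systematic bias rather than random noise, and a simple count like this is often enough to reveal it.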
Would you discuss security for B2C, B2B, and internal big data collection and storage?
"Our May, 2014, survey also found that respondents put the resolution of privacy and security issues at the top of both their IT spending priorities and their strategic business priorities lists. The transition to a BDA-enabled world will require the resolution of data security and privacy issues, just as BDA solutions will be needed to discover, prevent, and mitigate security and privacy threats. Both will necessitate interdepartmental coordination, end user education, improved internal processes, and better data governance (data clean up). ... There can be little doubt, however, that clean, complete and secure data is a primary requirement for any BDA or security/privacy solution, and that all successful technology deployments ultimately depend upon proper utilization by end users. This is especially true for BDA and security/privacy solutions, because they involve important changes in policies, procedures, and metrics that affect end users in multiple departments.
"End users are notoriously capable of discovering workarounds that simplify their own daily activities, and keep their personal and work group processes moving, even if these casual practices impair data accuracy and muddy up the tracking of business process metrics. Therefore, getting the right people involved in BDA or security/privacy initiatives is not just a matter of hiring and retaining top technical and analytic talent. Successful implementations will also depend on interdepartmental cooperation to develop effective new procedures in the first place, and on the end users who can make or break the deployments, to buy in and follow through by adhering to the new procedures."
Some conclusions
Another expert offers a further perspective on this topic. In an interview published in ODBMS Industry Watch, Jules J. Berman, author of Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information, stated, "My own personal opinion is that data analysis is much less important than data re-analysis. It's hard for a data team to get things right on the very first try, and the team shouldn't be faulted for their honest efforts. When everything is available for review, and when more data is added over time, you'll increase your chances of converging to someplace near the truth."
Berman also stated in the interview, "The creators of Big Data resources like to believe that they have collected all the data relevant to their domain, that all of the data is accurate, and that the data is organized in a manner that supports meaningful data searches. The Big Data analysts like to believe that their results and conclusions are correct." With this in mind, he said, they should heed his warning that "Overconfidence is the biggest culprit."