Endeca recently announced the results of a survey that explores changing business intelligence (BI) and analytics requirements. The survey polled 228 marketers at U.S. organizations with an eCommerce presence. The demographics indicated that 56% of the respondents came from B2C retail shops, 27% came from B2B distributors, and the remaining 17% spanned Manufacturing, Financial Services, Education, and Media organizations. The survey found that while it is more important than ever to tap into social media sites such as Facebook and Twitter to understand evolving customer behavior and forge new revenue streams, marketers are overwhelmed with the growing number of data sources that need to be measured and analyzed.
Combined with the rise of information overload from the volumes of unstructured data coming from the leading social media sites, it is not surprising that the survey results show that more than 60% of respondents admit they are currently making decisions based on only 50% or less of the data available to them. In addition, nearly half of the respondents report that they are still using multiple tools (three or more) to support BI decisions.
Nearly half of respondents say they are not currently incorporating unstructured data into analysis, but it is something they would like to do. These firms are missing much of the good stuff. In addition, 35% of respondents say they spend hours combining data from various data sources and over half say they would like to analyze all information in a single view.
The need for a dynamically changing set of BI tools is also evident: 48% of respondents said their analytics requirements change at least monthly, with 20% of respondents' requirements changing daily or hourly. In addition, more than 40% of respondents said it often takes months to have their BI requests fulfilled, or that they often cannot get their requests fulfilled at all.
There is great untapped potential for business intelligence, as firms are still not taking advantage of the vast amounts of user-generated content within Web 2.0.
Hi, Bill. I'm a bit weary of the whining and just plain ignorance on the part of the BI community that seems evident in these survey results. Many BI analysts are unaware of more modern, open source, content-centric integration and analytic approaches that scale better than ETL + data warehousing and deal with less structured data. For example:
(1) Standardized graph data stores (RDF or comparable triple or quad stores) for Web-scale integration: Graphs are more articulated and much easier to join than tabular, relational databases. Some vendors like InSilico Discovery now serve as on-the-fly report integration SaaSes for banks. Data description via inferencing and ontologies scales, as ISD and others have proven. The Semantic Web stack (RDF/RDFS/OWL) is in use at numerous media companies such as the BBC, NYT, Reuters, Wolters Kluwer, Lexis-Nexis--i.e., content companies. Lately, software vendors like Cisco and Amdocs are basing their products on these triple stores for scalability reasons. Many BI specialists just haven't worked much with content or are averse to trying a method that initially seems alien to them. See http://www.pwc.com/us/en/technology-forecast/spring2009/semantic-web-technologies.jhtml and my Sem Web Quora answers at http://www.quora.com/Alan-Morrison/Semantic-Web/answers for more detail; a minimal code sketch of this graph approach follows below, after point (2).
(2) Parallel processing a la Hadoop (derived from the Google Cluster Architecture, Bigtable and MapReduce) or its NoSQL cousins: This method speeds up high-volume data crunching and makes it cost effective. Companies like Backtype (bought by Twitter) and FlightCaster have been analyzing scads of Web data on the cheap, and started with just a handful of staff and EC2 clusters. Others like Disney just kept servers they were going to retire and, with the help of a few savvy staffers, made them into Hadoop clusters. See http://www.pwc.com/us/en/technology-forecast/2010/issue3/features/big-data-pg1.jhtml for more detail; a toy sketch of the MapReduce pattern also follows below.
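To make point (1) concrete, here is a minimal sketch of graph-style integration using the open source rdflib library in Python. The namespace, facts, and property names are invented purely for illustration and do not reflect how the vendors named above actually model their data; the point is only that records from different sources join on shared URIs rather than on table keys.

    # Minimal sketch of graph-style data integration with RDF and SPARQL,
    # using the open source rdflib library. The example.org vocabulary and
    # the sample facts are invented for illustration only.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, FOAF

    EX = Namespace("http://example.org/")  # hypothetical namespace

    g = Graph()
    # Facts from two "sources" land in one graph with no schema migration;
    # joining happens on shared URIs rather than on relational keys.
    g.add((EX.acme, RDF.type, EX.Customer))
    g.add((EX.acme, FOAF.name, Literal("Acme Corp")))
    g.add((EX.order42, EX.placedBy, EX.acme))
    g.add((EX.order42, EX.total, Literal(1200)))

    # One SPARQL query walks across statements from both sources.
    results = g.query("""
        PREFIX ex: <http://example.org/>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?name ?total WHERE {
            ?order ex:placedBy ?cust .
            ?order ex:total ?total .
            ?cust foaf:name ?name .
        }
    """)
    for name, total in results:
        print(name, total)  # -> Acme Corp 1200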
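And for point (2), a toy illustration of the MapReduce pattern that Hadoop parallelizes across a cluster. It runs locally in plain Python on a couple of invented records; in a real Hadoop job the map and reduce functions would be farmed out over many machines.

    # Toy illustration of the MapReduce pattern (word count). The sample
    # records are invented; Hadoop would run mapper and reducer in parallel
    # across a cluster instead of in-process like this.
    from collections import defaultdict
    from itertools import chain

    records = [
        "big data is not the same as big content",
        "hadoop makes big data crunching cheap",
    ]

    # Map step: each record is turned into (key, value) pairs independently,
    # which is what lets Hadoop distribute records across machines.
    def mapper(record):
        for word in record.split():
            yield (word, 1)

    # Shuffle step: group all values emitted for the same key.
    grouped = defaultdict(list)
    for key, value in chain.from_iterable(mapper(r) for r in records):
        grouped[key].append(value)

    # Reduce step: collapse each key's values into a single result.
    def reducer(key, values):
        return key, sum(values)

    counts = dict(reducer(k, v) for k, v in grouped.items())
    print(counts["big"])  # -> 3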
In other words, the large-scale methods are out there, but just aren't evenly distributed. Shades of William Gibson.... There are ways to do large-scale integration and high-volume, fast crunching of less-structured data, and companies like Google and the BBC have paved the way. Other companies just need to pay attention.
Social information will actually help machines make the connections, but data in graph form is what will enable sufficiently context-rich, large-scale integration. A brief animation at http://www.pwc.com/us/en/technology-forecast/2011/issue3/index.jhtml explains the phenomenon. We also interviewed your pal Sameer Patel in this issue of our journal.
Hope this helps for background, and that your painting is going well....
@AlanMorrison
Posted by: AlanMorrison | September 23, 2011 at 01:57 AM
Alan - Thanks for your lengthy comment. I'm in the middle of a two-day event but will give it proper attention over the weekend. Bill
Posted by: bill Ives | September 23, 2011 at 08:41 AM
Alan - Thanks for your extensive commentary and useful links. The handling of big data from a technical side seems to be a problem that can be addressed, as you point out. It just takes the will to do it. There are ways to store it and there are ways to visualize it to discover meaning. It is the discovery of meaning that is key, as you note.
In a parallel way we, as individuals, also have to deal with an expanded set of stuff to look at. I used the term "big content" as a complement to the term big data. Big data affects certain organizations that deal with massive data sets. Big content affects all of us. Here is my post - “Big Data” vs. “Big Content” Complementary Sides of Information Overload - http://billives.typepad.com/portals_and_km/2011/09/big-data-vs-big-content-and-information-overload.html
Posted by: bill Ives | September 24, 2011 at 01:08 PM
Bill,
I like the Big Content meme you've elaborated on, but am wondering if it's ultimately more useful to consider ways to analyze less and more structured data together. ("Less structured" data includes content.) Cassandra's being used for a range of different data types--see Bill Bosworth's comments here: http://venturebeat.com/2011/09/21/datastax-lands-11-million-to-further-the-nosql-data-store-revolution/
OpenLink is using SPARQL to query a blend of XBRL financial data mapped onto RDF along with less structured sources such as DBpedia: http://www.openlinksw.com/dataspace/dav/wiki/Main/VOSArticleRDFandMappedBI.
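For anyone who wants to try that kind of blended querying, here is a small sketch that pulls data from DBpedia's public SPARQL endpoint using the open source SPARQLWrapper library in Python. The query itself is purely illustrative; it is not the OpenLink XBRL-to-RDF setup described in the article, and it assumes the public endpoint is reachable.

    # Small sketch of querying a "less structured" external source (DBpedia)
    # over SPARQL with the SPARQLWrapper library. Illustrative only.
    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("http://dbpedia.org/sparql")
    endpoint.setQuery("""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?company ?label WHERE {
            ?company a dbo:Company ;
                     rdfs:label ?label .
            FILTER (lang(?label) = "en")
        } LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)

    # Results come back as JSON bindings that could then be joined with
    # internal data in whatever store you already use.
    results = endpoint.query().convert()
    for row in results["results"]["bindings"]:
        print(row["label"]["value"])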
These methods are directly relevant to conventional BI. We quoted Doug Lenat of quad-store provider Cycorp a while back, who pointed out that many BI folks are looking for their keys underneath the lamppost because that's where the light is. Integrating more sources--particularly blending external with internal data--makes it possible to light a larger area and query a broader footprint of information in one fell swoop.
Posted by: Alan Morrison | September 25, 2011 at 08:05 PM
Alan
You raise good points. What I was referring to was a complementary approach to heavy-duty BI that puts the ability to find the unexpected in the hands of the topic expert. I am not suggesting it replace traditional BI or even the newer approaches that you are describing. Thanks for the additional links to useful material in this space.
Posted by: bill Ives | September 25, 2011 at 08:23 PM