I will be on vacation the week of the 6th. While I am gone, I thought I would run some of my favorite posts from the past 6 months. Also, while I am thinking about it, here are some free How-To guides that I thought might be of interest:
- How to Achieve Best Practices: Records Management
- How to Conduct a Social Business Assessment
- How to Unclog Your Business by Automating Content-Intensive Process
- How to Assess Scanning and Capture Requirements
- Automating ERM with SharePoint
- How to Develop Taxonomies to Support Navigation, Information Discovery, and Findability
Enjoy. "See" you the week of the 13th.
From March 15...
I've been thinking about the question of the relationship of content and unstructured information to the seemingly never-ending parade of articles about Big Data. There is a certain element to this thinking that frankly is opportunistic, finding me thinking, "Hey, if EVERYONE is going to talk about Big Data, I want a piece of that." But there is also the stubborn reality that unstructured information is the red-headed stepchild of the Big Data equation - and the source of so much untapped value and intelligence in organizations. And our community - users, solution providers and consultants - knows something about this whole messy question of unstructured information.
I think understanding our role begins with some of the work we have done relative to Systems of Record and Systems of Engagement. Yuchon Lee, a VP with IBM, described the relationship this way...
For the past decade, companies have been accumulating data in what we call a system of record. Those who survive going forward will also have systems of engagement, which start with evaluating how you can have a relevant conversation with each individual customer across all channels. And insuring you have the analytical capability and the data to support that analysis. That is where the linkage is between the system of record data to system of engagement. On the technology side, we believe the future of handling this volume lies in leveraging the capability of the cloud. A lot of the analysis is done behind a firewall, but the analysis, platform and architecture is really a hybrid. That is how you solve the problem and get the most value out of the data.
This leads then to the kinds of business applications that are dramatically changed - or even made feasible - by tapping into the power of aggregating and interpreting large volumes of information. I like this list, which comes in part from Cloudera. All of these are applications with a fairly high connection to the land of Systems of Engagement - but are improved by tapping into the information - especially unstructured - hidden away in the land of Systems of Record.
- Modeling risk and failure prediction
- Analyzing customer churn
- Web recommendations (à la Amazon)
- Web ad targeting
- Point of sale transaction analysis
- Threat analysis
- Compliance and search effectiveness
2-We have done a better job of managing this high-value Systems of Record information on the structured side than on the unstructured side. This is not only in terms of the % of information under some sort of governance (think of our usual lament that 80% of the information in an organization is unstructured, and most of this is unmanaged), but also in terms of the lack of tools to actively interpret and mine all of this unstructured information in any meaningful way.
3-Systems of Engagement are generating massive volumes of new structured and unstructured information. Per Fortune, by 2020, Internet connected devices will grow from 400 million today to 50 billion. These devices will be talking to each other and to the Internet. By 2020, it is also predicted that our smart phones will have the capability of storing and accessing as much information as IBM’s Watson and super-computers can. The core difference between this "low-value-density" information and all of the high-value information in Systems of Record is that this new information tends to have value in the aggregate or as it is interpreted rather than intrinsically. In other words, it is easy to see the value in storing a document or a piece of data that documents a specific transaction or process. It is more difficult - and it has been too expensive in the past - to do so with vast quantities of digital flotsam and jetsam that has value only as it is aggregated and analyzed.
4-Cloud technologies such as Hadoop and NoSQL have dramatically changed the cost of analyzing large volumes of information, making analysis of large amounts of this information affordable for the first time.
5-Advances in semantics, search, content and text analytics, and print stream analytics are now making analysis of large amounts of information practical for the first time - especially all of that unstructured information hidden away in digital landfills. In addition, for the first time, natural language processing and visualization technologies are moving the analysis of all of this data and information out of technical back rooms and into the executive suite (to help solve the vexing business problems listed above).
6-Lastly, the opportunity that exists now - as reflected in the opening IBM quote - is the marriage of the cloud technologies that are making large-scale information analysis affordable for the first time with new analytic and reporting technologies that are making all of this information comprehensible for the first time. It is a marriage with rich opportunities to move the management of large aggregations from a pure cost calculus (whether hard dollars or risk-based) to one that is balanced by the potential value hidden away in digital landfills.