AIIM Market Intelligence
Delivering the priorities and opinions of AIIM’s 65,000 community

Industry Watch
Content Analytics - research tools for unstructured content and rich media

aiim.org | 301.587.8202

About the Research
As the non-profit association dedicated to nurturing, growing and supporting the ECM (Enterprise Content Management) community, AIIM is proud to provide this research at no charge. In this way the education, thought leadership and direction provided by our work can be leveraged by the entire community.

We would like this research to be as widely distributed as possible. Feel free to use this research in presentations and publications with the attribution “© AIIM 2010, www.aiim.org”.

Rather than redistribute a copy of this report to your colleagues, we would prefer that you direct them to www.aiim.org/research for a free download of their own.

Our ability to deliver such high-quality research is partially made possible by our underwriting companies, without whom we would have to return to a paid subscription model. For that, we hope you will join us in thanking our underwriters, who are:

IBM
3565 Harbor Blvd, Costa Mesa, CA 92626
Phone: 800-345-3638
Email: [email protected]
www.ibm.com/software/ecm/compliance

MediaBeacon, Inc.
123 North 3rd Street, Suite 800, Minneapolis, MN 55401
Phone: 612-317-0737
Email: [email protected]
www.mediabeacon.com

Allyis
10210 NE Points Drive, Suite 200, Kirkland, WA 98033
Phone: 888-425-5947
Email: [email protected]
www.allyis.com

While we appreciate the support of these sponsors, we also greatly value our objectivity and independence as a non-profit industry association. The results of the survey and the market commentary made in this report are independent of any bias from the vendor community.

Process Used, Survey Demographics and Terminology
The survey was taken by 527 individual members of the AIIM community between February 9th and February 26th, 2010, using a Web-based tool. Invitations to take the survey were sent via e-mail to a selection of the 65,000 AIIM community members.

Survey population demographics can be found in Appendix 1. Graphs throughout most of the report exclude responses from suppliers of ECM products or services.

About AIIM
AIIM (www.aiim.org) is the community that provides education, research, and best practices to help organizations find, control and optimize their information. For more than 60 years, AIIM has been the leading non-profit organization focused on helping users understand the challenges associated with managing documents, content, records and business processes. Today, AIIM is international in scope, independent and implementation-focused, acting as the intermediary between ECM (Enterprise Content Management) users, vendors and the channel.

About the Author
Doug Miles is head of the AIIM Market Intelligence Division. He has over 25 years’ experience of working with users and vendors across a broad spectrum of IT applications. He was an early pioneer of document management systems for business and engineering applications, and has been involved in their evolution from technical solution through business process optimization to the current enterprise-wide adoption. Doug has also worked closely with other enterprise-level IT systems such as ERP, BI and CRM. Doug has an MSc in Communications Engineering and is an MIET.

© 2010 AIIM - Find, Control, and Optimize Your Information
1100 Wayne Avenue, Suite 1100, Silver Spring, MD 20910
Phone: 301.587.8202
www.aiim.org

Table of Contents

About the Research
    About the Research
    Process Used and Survey Demographics
    About AIIM
    About the Author
Introduction
    Introduction
    Key Findings
Search and Research
    Search and Research
    Terminology
Levels of Adoption
    Levels of Adoption
    Content Decommissioning
Business Drivers for Content Analytics
Rich Media and Digital Asset Management
Business Drivers for Rich Media Analytics
Projected Spend
Conclusion
Appendix 1: Survey Demographics
    Survey Background
    Organizational Size
    Geography
    Industry Sector
Appendix 2: Glossary
Underwritten in part by
    Allyis
    IBM
    MediaBeacon, Inc.
    AIIM

Introduction
The term “Content Analytics” has been coined to cover a range of search and reporting technologies which can provide similar levels of business intelligence and strategic value across unstructured data to that conventionally associated with structured data reporting. Sophisticated content search across text and rich media file-types, combined with trend analysis, content assessment and behavioral reporting, has created the opportunity to track and manage unstructured content and digital assets with the same levels of capability as BI reporting of structured content - with associated business benefits of content optimization, asset management, pattern detection and compliance monitoring.

As is usual with new technology, levels of awareness of both the technology and the naming terminology vary considerably. We have measured this in the report and provided a glossary of the main terms in Appendix 2. We have also explored the user-perceived limitations of conventional search, and the potential savings that could be achieved by applying content analytics to a number of business scenarios such as fraud detection, asset protection, healthcare research and market monitoring.

Key Findings
- For 72% of respondents, it’s harder to find information owned by their organization than information not owned by them, i.e. on the Web.
- Of the 47% who find they frequently need to use Advanced Search options, more than half would like something more effective.
- 70% would find advanced analytic functions “extremely useful” or “very useful.”
- For most content types, our respondents’ ability to “research” is 3-6 times less than their ability to “search”, particularly for rich media files, but also office documents and emails.
- E-discovery, Digital Asset Management (DAM), Web Analytics and De-duplication are the better known technologies compared to Sentiment Analysis, Copyright Detection and Digital Forensics.
- There are strong plans to adopt DAM, Faceted Search, E-discovery and Content Assessment in the next 18 months.
- The biggest obstacle faced regarding content decommissioning is “not clear which content is valuable and which is not”. There is also considerable “Fear of the compliance and regulatory impact of deleting information.”
- Only 15% have an automatic way of finding and deleting duplicates in their content stores, with just 8% able to analyze them automatically for relevancy and to delete irrelevant content.
- 50% would find it of “high” or “very high” commercial value to be able to link a customer/citizen/staff-member search across structured (database) data, unstructured documents and case notes. 44% would find it of “high” or “very high” commercial value to be able to automatically redact (blank out) sensitive information across forms, etc.
- 81% of those who have digital assets to manage are not using a dedicated Digital Asset Management system, but 14% of our respondents are planning to implement one in the next 18 months. 48% store digital assets and rich media on ad hoc file shares.
- 59% would find it of “high” or “very high” commercial value if they could use a faceted search across multiple metadata tags to cross-reference categories of rich media. 50% would find it of “high” or “very high” commercial value if they could detect unauthorized use of their assets across the web.
- Net spending on Enterprise Search, Digital Asset Management and Content Analytics is set for a considerable increase in the next 12 months.

Search and Research
For most people, “search” is synonymous with a Google-style presentation, with results based on relevancy of matches across the web. There is no doubt that this can often be startlingly useful - although it can also be somewhat frustrating at times. In fact, many information workers would be only too glad to have the same search capabilities across their internal information repositories as they have on the web.

Figure 1: How easy is it to search information and documents held on your own internal systems compared to the Web?
[Bar chart. Horizontal axis: 0% to 50%. Response options: Much harder; Harder; About the same; Easier; Much easier.]

However, as we move from the more common requirement of document “search” to the somewhat more demanding needs of “research”, straightforward search engine mechanisms are unable to provide the pattern matching, trend plotting and semantic analysis that may be required.

Figure 2: For your information research tasks, how effective do you find the “Advanced Search” options in standard search engines?
(N=484, Non-Trade)
[Bar chart. Horizontal axis: 0% to 40%. Response options: I seldom need to use them; I occasionally use them; I use them and find them effective; I use them but would prefer something more effective; I have access to more effective research and analysis tools.]

Of our survey sample, only 47% are regular users of the so-called “advanced” search functions, and more than half of those would prefer something more effective.

Taking the idea of research versus search, we asked respondents how their ability compared across different types of content. We can see that rich media files such as audio, video and graphics lack tools for both types of activity. The differences are more clearly shown in Figure 4, where we show the ratio of search capability to research capability.

Figure 3: How would you rate your ability to research across the following content types? (N=450, Non-Trade, normalized for N/A)
[Stacked bar chart. Legend: Research/Analysis; Search only; Neither. Content types: Support/call-center logs; Healthcare records; Case files; Applications and claims; Litigation and legal reports; Web content (own sites); Web content (external, social media, etc); Patents and research papers; Contracts and customer agreements; Inspection reports; General office documents; Customer correspondence and comments; Emails; Video files; Photo images; Press articles and news; Audio recordings; Collateral, brochures, publications; Graphics/print design files.]

Figure 4: How would you rate your ability to research across the following content types - ratio of “search” to “research” (higher number indicates poor ability to research compared to search)
[Bar chart. Horizontal axis: 0.0 to 8.0. Content types listed in the same order as in Figure 3.]

Healthcare records are an interesting example here: Figure 3 indicates a lack of basic search, but Figure 4 shows that some of our respondents have some useful research tools. The same applies to general case files, which can be quite complex to analyze but yield useful results. Graphics and print design files, on the other hand, are fairly easy to find, but few respondents have the ability to analyze their content - for example, seeking out obsolete logos.

Looking in Figure 5 at the types of analysis capability that researchers might wish to access, we see that users would definitely like to better exploit the keyword metadata, and use faceted drill-down to quickly refine results (although there may be some confusion here with conventional keyword search).

Figure 5: Do you have access to any of the following analysis capabilities for research of unstructured/document/media content? (Tick all that apply)
[Bar chart. Horizontal axis: 0% to 60%. Capabilities listed:
- Keyword faceted search and mining - to quickly narrow results to topic of interest
- Trend analysis - to identify patterns in your business area or function
- Exception analysis - to expose unusual occurrences of business behavior
- Time series analysis - to dynamically view occurrences or trends over periods of time
- Dimensional analysis - to compare facets to one another for insightful or unusual associations
- Correlation analysis - to map the relevancy of key concepts to the queried documents
- Duplication analysis - using contextual understanding of content & associated metadata
- None of these]

Terminology
As mentioned in the introduction, the term “Content Analytics” and many of its constituent parts are relatively new. This makes it quite difficult to measure the existing installed base.

Figure 6: Prior to this survey, how familiar were you with the term “Content Analytics”? (N=454, non-trade/non-consultant)
[Bar chart. Horizontal axis: 0% to 35%. Response options: Never heard of it before; Vaguely heard of it but no idea what it might mean; Heard of it but not quite sure what it covers; I think I know what it covers, but the term is not widely used in our organization; We are using the term widely to describe what we do or plan to do.]

In Figure 7, we asked about each individual technology, but without defining or describing what they are. Readers are referred to the Glossary in Appendix 2 for a description of each.

Figure 7: How aware are you of each of the following technologies?
(N=446, non-trade)
[Stacked bar chart. Horizontal axis: 0% to 100%. Legend: Fully familiar with it; Fairly familiar with it; Some idea what it is; Don’t know what it is. Technologies: E-Discovery tools; Digital asset management; Web analytics; Content de-duplication; Faceted search tools; Text analytics; Content assessment; E2.0/Social media monitoring; Digital forensics; Image and sound tagging; E-mail trending; Copyright detection; Sentiment analysis.]

Levels of Adoption
E-discovery and Digital Asset Management are generally well known, followed by Web Analytics, which is the most widely adopted (see Figure 8). Interestingly, Enterprise 2.0 and Social Media monitoring is an area of strong interest, but some would say this is a subset of more generic Sentiment Analysis. It is possible that “monitoring” is taken to mean a human overview rather than automatic monitoring and alerting.

Figure 8: Are you using any of the following technologies in your organizational unit? (N=445, non-trade)
[Stacked bar chart. Horizontal axis: 0% to 60%. Legend: Yes; Planned in the next 18 months; No. Technologies: Web analytics; Digital asset management; Faceted search tools; E-Discovery tools; Content de-duplication; Content assessment; Image and sound tagging; Text analytics; E2.0/Social media monitoring; Digital forensics; Copyright detection; E-mail trending; Sentiment analysis.]

Within our survey sample, there are strong growth indicators for a number of areas including Digital Asset Management, Content Assessment and Content De-duplication.
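
Content de-duplication, listed above among the growth areas, can take many forms. As a minimal sketch of its simplest variant (exact, byte-for-byte duplicate detection), the Python example below groups documents by a hash of their content. The function name `find_exact_duplicates` and the sample store are hypothetical illustrations, not drawn from the survey or from any vendor product; the contextual duplication analysis described in this report would need considerably more than this.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(documents):
    """Group document names by the SHA-256 digest of their bytes.

    `documents` maps a name to its raw content (bytes). Returns a list of
    name groups, one per set of byte-identical documents, keeping only
    groups that contain more than one member.
    """
    groups = defaultdict(list)
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical in-memory "content store", for illustration only.
store = {
    "contracts/acme_v1.txt": b"Master services agreement",
    "shared/acme_copy.txt": b"Master services agreement",
    "marketing/brochure.txt": b"Spring product brochure",
}
duplicates = find_exact_duplicates(store)
```

Near-duplicates (for example, the same text saved in two file formats) hash differently, so this approach only catches identical copies.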