



Tuesday, 15th April 2014

Text and Data Mining

Source: Directorate-General for Research and Innovation (EU)

From Executive Summary:

Text and data mining (TDM) is an important technique for analysing and extracting new insights and knowledge from the exponentially increasing store of digital data (‘Big Data’). It is important to understand the extent to which the EU’s current legal framework encourages or obstructs this new form of research and to assess the scale of the economic issues at stake.

TDM is useful to researchers of all kinds, from historians to medical experts, and its methods are relevant to organisations throughout the public and private sectors. Because TDM research technology is not prohibitively expensive, it is readily available to lone entrepreneurs, individual post-graduate students, start-ups and small firms. It is also amenable to playful and highly speculative uses, enabling research connections between previously unconnected fields. There is growing recognition that we are at the threshold of the mass automation of service industries (automation of thinking) comparable with the robotic automation of manufacturing production lines (automation of muscle) in an earlier era. TDM will be widely used to provide insights in the re-design of this digital services economy.

When it comes to the deployment of TDM, there are worrying signs that European researchers may be falling behind, especially with regard to researchers in the United States. Researchers in Europe believe that this results, at least in part, from the nature of Europe’s laws with regard to copyright, database protection and, perhaps increasingly, data privacy. In the United States, the ‘fair use’ defence against copyright infringement appears to offer greater re-assurance to researchers than the comparable copyright framework in Europe, which relies upon a closed set of statutory exceptions. Recent court decisions, for example in the ten-year-old ‘Google Books’ case, appear to confirm this. The US has no equivalent of Europe’s database protection law.

+ Direct link to document (PDF; 2 MB)



Having begun his career in academic libraries, Adrian Janes has subsequently worked extensively in public libraries, chiefly in enquiry work as an Information Services librarian. In this role he has had particular responsibility for information from both the UK Government and the European Union. He wrote a detailed report on sources for the latter which was published by FreePint in 2007, and has contributed articles to FreePint and ResourceShelf. He is involved in training in information literacy and the use of online reference resources.

A Contributing Editor to DocuTicker, he also writes reviews for Pennyblackmusic.

Adrian can be reached at adrian.janes@freepint.com


Please note: DocuTicker's editors collect citations for full-text PDF reports freely available on the web but we do not archive these reports. When you click a link to find and/or download the report, you are leaving the DocuTicker site. DocuTicker makes no representations regarding the ongoing availability of any report or any external resource. Links were accurate as of the date of posting.



