You are viewing ARCHIVED CONTENT released online between 1 April 2010 and 24 August 2018 or content that has been selectively archived and is no longer active. Content in this archive is NOT UPDATED, and links may not function.
In data discovery and governance discussions,1 we read and hear daily about semantic disputes over the existence and definition of “unstructured data.”
A semantic dispute is a disagreement that arises if the parties involved disagree about the definition of a word, not because they disagree on material facts, but rather because they disagree on the definitions of a word (or several words) essential to formulating the claim at issue. (Wikipedia)
While some may dispute the existence of unstructured data, definitions of the term do exist; a selection from industry and reference sources follows.
A Short List of Definitions
Definitions of Unstructured Data
AIIM: Unstructured information is, well, information that does not have a fully defined structure, and most likely will be read and used by humans. As examples, think of most of the information produced by common office applications (word processors, presentation programs).2
EDRM: Data that is not in tabular or delimited format. File types include word processing files, html files (web pages), project plans, presentation files, spreadsheets, graphics, audio files, video files and emails.3
eWeek: The term unstructured data describes information that is not organized into a well-defined schema. Nearly all that lives outside of relational databases is unstructured and includes images, videos and log files produced by computers, machines and sensors.4
Gartner Research: Gartner defines unstructured data as content that does not conform to a specific, pre-defined data model. It tends to be the human-generated and people-oriented content that does not fit neatly into database tables.5
International Journal of Innovative Technology Research: Unstructured data is heterogeneous and variable in nature and comes in many formats, including text, document, image, video, and more.6
OASIS: Unstructured information may be defined as the direct product of human communication. Examples include natural language documents, email, speech, images and video. It is information that was not specifically encoded for machines to process but rather authored by humans for humans to understand. We say it is “unstructured” because it lacks explicit semantics (“structure”) required for applications to interpret the information as intended by the human author or required by the end-user application.7
Webopedia: The phrase unstructured data usually refers to information that doesn’t reside in a traditional row-column database. As you might expect, it’s the opposite of structured data — the data stored in fields in a database.8
WhatIs.com: Unstructured data is a generic label for describing any data that is not in a database or other type of data structure.9
Wikibon: [Unstructured data is] a file with little or no metadata, and little or no classification data.10
Wikipedia: Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner.11
Relevant References
1 Rewiring to Tackle Unstructured Data | WIRED. (n.d.). Retrieved from http://www.wired.com/2014/07/rewiring-tackle-unstructured-data/
2 AIIM – What is ECM? What is Enterprise Content Management? (n.d.). Retrieved from http://www.aiim.org/What-is-ECM-Enterprise-Content-Management
3 Unstructured Data « EDRM. (n.d.). Retrieved from http://www.edrm.net/resources/glossaries/glossary/u/unstructured-data
4 Unstructured Data Is an Important Untapped Resource: 10 Reasons Why. (n.d.). Retrieved from http://www.eweek.com/storage/slideshows/unstructured-data-is-an-important-untapped-resource-10-reasons-why
5 Big Content: The Unstructured Side of Big Data – Darin Stewart. (n.d.). Retrieved from http://blogs.gartner.com/darin-stewart/2013/05/01/big-content-the-unstructured-side-of-big-data/
6 International Journal of Innovative Technology Research. (n.d.). Retrieved from http://www.ijitr.com/index.php/ojs/article/viewFile/539/pdf
7 OASIS Unstructured Information Management Architecture (UIMA) TC. (n.d.). Retrieved from https://www.oasis-open.org/committees/uima/charter.php
8 What is Unstructured Data? A Webopedia Definition. (n.d.). Retrieved from http://www.webopedia.com/TERM/U/unstructured_data.html
9 What is unstructured data? – Definition from WhatIs.com. (n.d.). Retrieved from http://searchbusinessanalytics.techtarget.com/definition/unstructured-data
10 The Growth And Management Of Unstructured Data – Wikibon. (n.d.). Retrieved May 12, 2015, from http://wikibon.org/wiki/v/The_Growth_and_Management_of_Unstructured_Data
11 Unstructured data – Wikipedia, the free encyclopedia. (n.d.). Retrieved May 12, 2015, from http://en.wikipedia.org/wiki/Unstructured_data