ARCHIVED CONTENT
You are viewing ARCHIVED CONTENT released online between 1 April 2010 and 24 August 2018 or content that has been selectively archived and is no longer active. Content in this archive is NOT UPDATED, and links may not function.
Extract from article by Thomas Fox
What precisely is big data? I once put that question to Joe Oringel, a co-founder of Visual Risk IQ, who defined it as “unstructured data,” generally meaning data spread across multiple database systems. In the eBook chapter entitled “What is Big Data?” Edd Dumbill expanded on Oringel’s definition when he wrote, “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architecture.”
What are some of the key characteristics of big data? Clearly it is big, but it is not simply a storage-size problem. This ‘bigness’ also means such data can be hard to transport. This, of course, raises the question of how and where to store all of this data: in the cloud or on dedicated servers. Big data is also messy; as Dumbill notes, “Big data practitioners consistently report that 80% of the effort involved in dealing with data is cleaning it up in the first place.”
Read the complete article at Big Data in a Best Practices Compliance Program, Part I