Bigger Data Isn’t Always Better Data

Extract from an article by Cathy O’Neil

Yet algorithms can be as flawed as the humans they replace — and the more data they use, the more opportunities arise for those flaws to emerge. Most essentially assign points to a candidate depending on the presence of certain attributes that are correlated with success, with no consideration for the nature and nuances of those correlations.

One issue is that the algorithms tend to use linear models, so they assume that more is always better, and way more is way better. This can be fine when dealing with attributes such as education or experience. Something like Facebook activity, by contrast, could have a golden mean — a reasonable amount might suggest engagement in a community, while an abundance could indicate addiction.
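
To make the contrast concrete, here is a minimal sketch in Python of the two scoring shapes described above. It is an illustration only, not taken from any real hiring system: the function names, the weight, and the "ideal" and "width" values are all hypothetical assumptions.

```python
def linear_score(hours_per_week: float, weight: float = 1.0) -> float:
    """Linear model: every extra hour adds the same number of points,
    so more is always better. The weight is a hypothetical value."""
    return weight * hours_per_week


def peaked_score(hours_per_week: float, ideal: float = 5.0, width: float = 10.0) -> float:
    """Inverted-U model: points peak at `ideal` and fall off on both sides,
    floored at zero. `ideal` and `width` are hypothetical values."""
    return max(0.0, 1.0 - ((hours_per_week - ideal) / width) ** 2)


# Compare how each model treats no, moderate, and very heavy activity.
for hours in (0, 5, 40):
    print(hours, linear_score(hours), round(peaked_score(hours), 2))
```

Under the linear model, the candidate with 40 hours a week of activity always outranks the one with 5. Under the peaked model, the moderate user scores highest and the very heavy user scores nothing, which is closer to the "golden mean" reading.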

ComplexDiscovery combines original industry research with curated expert articles to create an informational resource that helps legal, business, and information technology professionals better understand the business and practice of data discovery and legal discovery.
