bob_muller's review for:
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us about Who We Really Are
by Seth Stephens-Davidowitz
This book is really eye-opening on how you can use big data to understand social relations as they are. Main points: people lie everywhere except in their anonymous searches, and the analysis of those searches tells a pretty frightful tale of who "we" are; big data is really detailed data about lots of searches, allowing for much finer analysis by zooming in but also reducing privacy to nil; big data allows analysis of unlikely events by sheer size; and big data lets social scientists do real science, including formal and natural experiments through "A/B" testing (the term Google uses for automated designed experiments about people's behavior). I would love to see big data combined with modeling like Bayesian networks or agent-based simulation, the other main contributors to "real" social science (that is, combining theory with data through mathematical tools--like we do in physics).
Personally, I would have liked more depth on analysis techniques, and perhaps some material on how to get data, which analysis tools to use, and how machine learning can work with big data, but I'm a social scientist/engineer/geek and I don't take away stars for that :). Evidence: I will read Everybody (Still) Lies. And Seth should get a wife/life--now.
Revision 4/5/2020, in the middle of the COVID-19 pandemic.
See Seth's article in NYT:
https://www.nytimes.com/2020/04/05/opinion/coronavirus-google-searches.html?action=click&module=Opinion&pgtype=Homepage
I changed the rating to 3 stars because I now think that the data analysis comment above should be taken much more seriously than I took it at the time. Reading this 4/5/2020 article really opened my eyes to the complete invalidity of "big data" analysis in the absence of serious data analysis techniques. The tip of this particular iceberg comes down to the difference between correlation and causation, but when you read reported data analysis, especially in the media, and find it both self-contradictory and full of basic modeling and analysis errors, you begin to realize that so-called big data is no more than hand-waving. Without an underlying theory or causal model that informs the analysis, you have no idea what produced the outward form of the data.
In summary: Everybody Lies, indeed, especially with big data and statistics.