Can we avoid another Cambridge Analytica? (And, is AI ethics even implementable?)
- Dipyaman Sanyal 
- Jan 12, 2024
Most ideas come from a random juxtaposition of things or events, which makes us connect dots we had probably not seen before. I am sure what I am writing about has been written about before, but it definitely got me thinking about ethics in a time of scientific excitement.
I was recently on a 3-hour flight and had downloaded some videos to watch undisturbed. I am reviewing reams of math videos (I want to create a set of posts with links on all the relevant math that you need to know for doing AI/ML/DS well, because it is something that I often get asked about). For the first hour or so I listened to the amazing Gilbert Strang invert matrices and do some decompositions while teaching the math behind deep learning. When I was younger, I could watch such videos and do the math along with them for 12 hours without flinching. Now, I can only do an hour before I need a break! Age makes you slow. So, I decided to watch a fairly informative but extremely ‘Hollywood’ documentary on Cambridge Analytica available on Netflix. The problem with these pop documentaries (like most popular movies) is that they need to have a clear good guy and a bad guy, and there is a fight, and David kills Goliath or does not, and it is all neat and clean…with the emotional music and the ‘epic’ shots of the determined faces of the good guys who are fighting the good fight.
The challenge for me was that every time I heard the guys at Cambridge Analytica speak about what they did and how they did it, I thought about inverting a very large matrix, how difficult it would be technically to decompose it, how heuristics play a role in algorithmic optimization and how all of it leads to successful probabilistic models. Cambridge Analytica (CA) stole data and that was illegal. But every day, hundreds of firms are using data in a similar manner, data which they have legally collected from each one of us as we unthinkingly clicked on “Accept all Cookies”. And to be honest, if ChatGPT uses my ‘Likes’ information to get better, I am happy to give it that information because I am sure it does not care about me as a person; it cares about me as a data generating process. The phrase psy-ops (psychological operations) was used multiple times and someone mentioned that it is classified, weapons-grade material! That is balderdash. Every marketing company worth its salt is doing the same ‘psy-ops’ that the guys at CA were doing. You might not like the outcomes of those CA psy-ops: Brexit, Trump, political changes in Trinidad and Tobago, and so on. But your (or my) liking or disliking of it does not make it classified, weapons-grade technology. It is after all just matrices being inverted, optimized, decomposed, minima being found, gradients descending or not, and probabilities being calculated at a scale which allows for personalized targeting of messages.
As I finished watching the documentary I realized that while I deeply care about AI ethics, I was more excited about the beauty of the science and the math behind the work of CA than upset about the fact that CA (or any other, less evil marketing firm, politician, influencer, etc.) was toying with the minds of American voters to get Trump into power, or helping move British voters towards Brexit, many of whom might not have had a clear idea of the ramifications. Again, my personal views on these outcomes are not relevant to the discussion. Even if the other side had used the same ops to get Hillary elected or to avoid Brexit, the issue would remain the same.
And that is the unfortunate part of being deeply involved in technology or science. For a data scientist, Deep is a row (whose PII has been purged) about whom she might have 325 columns of information. Among those 325, one is her dependent variable and she will use the other 324 to predict my interest in the dependent variable. Rows and columns of a matrix, nothing more. It is like how we round off the dead when we discuss the number of people killed in a war. We say 500 people died when the true number is maybe 507. Those seven were human beings too, with their aspirations and a right to live, but they got rounded off in the newspaper article. Our individual row in a dataset is also a statistically insignificant data point. And in the aggregate, it is only a challenging problem to solve with the latest technology at our disposal. You get carried away by the beauty of the algorithmic beast that you are using, or in some cases, creating. However much particle physicists tried later, the power of the atom could not be put back in the bottle again. The same will hold true for all the AI/ML algorithms that we are working with today.
Then, should practitioners and data ‘scientists’ keep ignoring the ethics of it and keep getting excited about the science and its developments, because we know that if we do not ‘do it’, someone else will (the classic argument for silence)? Or do we stop developing the science and try to regulate it with ridiculous compliance rules (like the recent one we saw come out of the White House, which mentioned how many petaflops it takes for an AI model to become regulated 😳)?
Or is there a small chance that ethical use of AI can be inculcated through self-regulation, which can be encouraged by universities, training institutions, influencers and others? Are CFA charter holders more ethical in their dealings in financial markets since they spent hundreds of hours reading ethics for all 3 levels (I would like to believe so, but do not have the data)? Dan Ariely said that when people are asked to sign an ‘honesty pledge’ before tasks, they tend to be more honest. We later realized that it is quite likely that Dan, or his co-authors, were lying about their results and making the data up, so I am still not sure what to believe!
I do not have answers. But I can say from personal experience that the excitement of the science and the technology for those who are wired to that world might often make them think less about ethical or (even worse) unintended consequences of their work. And as long as practitioners and academics do not get that, we will have many Cambridge Analyticas, who might not be doing anything illegal.



