It’s the Data, Stupid

When James Carville told Bill Clinton’s successful 1992 presidential campaign team “It’s the Economy, Stupid,” he left little doubt about what mattered most.

These days what matters most is the data.

Sure, Google is made up of talented engineers, solid leadership, great code, etc. What is it, however, without its (your) data?

Apple, Alphabet (Google), Amazon, Microsoft and Facebook, the companies with the highest earnings, had a combined profit of approximately $25 billion in the first quarter of the year. What is their common denominator?


Today none other than The New York Times is recognizing:

Your Data Is Crucial to a Robotic Age. Shouldn’t You Be Paid for It?

Data is the crucial ingredient of the A.I. revolution. Training systems to perform even relatively straightforward tasks like voice translation, voice transcription or image recognition requires vast amounts of data ..

That much is true.

I don’t blame Facebook for training their AI on all of those tagged puppy pictures uploaded every day. Shoot, I would do the same thing if I were given all that labeled data…

human or machine

My question is, how many social media users understand exactly what is going on?

If only one party to an agreement knows and understands how the data is being used, is that a square deal?

Most users realize that they are opting in to being advertised to when they use an online service. Some might even know how cookies are used. Some others may even realize that their data can be worth up to $2,000 a year.

Most are unaware of the vast amounts of data that are being collected about them and how that’s being used.

How Facebook uses your data

Would you be surprised to learn that Facebook collected and stored “info on every single call and text for about a year” coming and going from your phone?

Not Facebook voice calls, cell calls.

One way to see what info Facebook collects on its users is to check out the features they offer advertisers to target potentially receptive users.

Some examples:

  • Users in long-distance relationships
  • Mothers, divided by “type” (soccer, trendy, etc.)
  • Conservatives and liberals
  • Users who bought auto parts or accessories recently
  • Expats (divided by what country they are from originally)
  • Users who carry a balance on their credit card
  • Users who recently used a travel app

There are many more.

Even if Facebook having this data about you, your kids, etc. does not bother you, how are you going to feel when the inevitable security breach happens? Will you feel the same peace of mind knowing that dark web denizens have all of that same info?

But… Everyone is doing it

I’ll spare you the tired jumping-off-a-bridge analogy, and I am not just picking on Facebook, although they are easy pickings. All big Internet companies are harvesting your data. At this point it is all but a fiduciary duty.

Back to the New York Times link above: the more data, particularly labeled (#tagged) data, a company can effectively amass, the better their AI will be.

The more good data you have to run a train/test split on, the better your model will be.
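If the term is new to you, here is a rough sketch of what a train/test split looks like, using only the Python standard library. The `labeled_photos` list and the hand-rolled `train_test_split` helper are made up for illustration. They are not anything Facebook or Google actually runs, and libraries such as scikit-learn ship their own, more capable `train_test_split`. The idea is simply that part of the labeled data trains the model and the rest is held back to check how well it learned.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible from run to run.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labeled data: (photo_id, tag) pairs standing in for tagged uploads.
labeled_photos = [
    (f"photo_{i}", "puppy" if i % 2 == 0 else "baby") for i in range(10)
]

train, test = train_test_split(labeled_photos, test_fraction=0.2, seed=42)
print(len(train), "training examples;", len(test), "held out for testing")
```

With ten tagged photos and a 20% test fraction, eight go to training and two are held out. Every additional tagged upload makes both piles bigger, which is why labeled data is so valuable.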

Now back to my question: Do you think that the average mother, regardless of “type” (soccer, trendy, etc.), knows Facebook’s AI is being trained on all those adorable baby photos new moms cannot help but post?

The Flip Side

Lest this become another “curse the digital darkness” screed, let me wrap this up with some positives. First, while Google does collect everything they can about all of us so that they can pimp us out to their advertisers, Google does provide robust privacy tools. If you invest some time, you can learn everything about Google’s data collection and use of your data.

I understand that the average American is not going to do so, but the point is one could if one were so inclined. Google also has a fairly remarkable security record. At the very least, it is fair to say that they are serious about security and that theirs is state of the art.

Google also deserves kudos for putting AI into the hands of anyone interested and capable. By releasing TensorFlow for free and offering services like Colab they are giving back to the Machine Learning community in significant ways.

