Showing posts with label data mining. Show all posts

December 13, 2018

A Social System that Inspires Pride and Shame


This story continues to fascinate me. 

China's social credit system started in 2015. 

China scores individuals based on public data (social media, financial, insurance, health, shopping, dating, and more), and they have people that act as "information collectors" (i.e. neighborhood watchers) who record what their neighbors are doing--good and bad. 

Each individual starts with 1,000 points. 

If you do good things in Chinese society--helping people, cleaning up, being honest--you get points added. 

If you do bad things in China--fight with people, make a mess, be dishonest--you get points deducted. 

Fall below 1,000 points and you are in trouble--and can get blacklisted!
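
To make the mechanics concrete, here is a minimal sketch of the scoring as described above: start at 1,000, add points for good deeds, deduct for bad ones, and flag anyone who drops under 1,000. The action names and point values are made up for illustration; the real system's rules are not given in this post.

```python
# Hypothetical sketch of the point system described in the post.
# Action names and point values are illustrative assumptions only.
STARTING_SCORE = 1000

# Assumed point adjustments per recorded action (not real values).
POINT_CHANGES = {
    "helped_neighbor": 5,
    "cleaned_up": 3,
    "fought": -10,
    "littered": -3,
}


def social_score(actions):
    """Start at 1,000 and apply the points for each recorded action."""
    return STARTING_SCORE + sum(POINT_CHANGES[action] for action in actions)


def is_in_trouble(score):
    """Per the post, dropping below the starting 1,000 means trouble."""
    return score < STARTING_SCORE


score = social_score(["helped_neighbor", "fought", "littered"])
print(score, is_in_trouble(score))  # prints: 992 True
```

Even in this toy version, one recorded fight outweighs a good deed, which shows how much rides on who sets the point values.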

A good score is something to be proud of and a bad score is something that shames people to hopefully change for the better. 

But more than that, your social score has tangible social impacts--it can determine your ability to get into certain schools, obtain better jobs, homes, loans/mortgages, high-speed internet, and even high-speed train tickets/airplane flights. 

While maybe well intentioned, certainly, this has the very real potential to become a surveillance state and the embodiment of "Big Brother"!

On one hand, it seems like a great thing to drive people and society to be better. Isn't that what we do with recognizing and rewarding good behavior and with our laws and justice system in punishing bad behavior?

Yet, to me this type of all-encompassing social credit system risks too much from a freedom and privacy perspective. Should the government and all your neighbors be privy to your most intimate doings and dealings?  And should people be controlled to such an extent that literally everything you do is monitored and measured and counted for/against you?

It seems to me that sacrificing your very personal liberty is too high a price to pay in order to push people towards positive social goals.

Guiding people is one thing, and rewarding outstanding acts and punishing horrific ones is understandable, but getting into people's knickers is another. 

This type of social credit system really borders on social control and moves us towards a very disturbing, dystopian future. ;-)

January 25, 2015

Size And Smell

So apparently data mining can be used for all sorts of research...

In the New York Times today, Seth Stephens-Davidowitz tries his hand with Google search results to better understand people's feelings about sex. 

Though Stephens-Davidowitz doesn't explain how he gets these Google statistics...here are some standouts:

As you might have guessed, the biggest complaint from men--and women--is that they don't get/have enough sex. 

For both (as you might imagine in a primarily--95%--heterosexual world), traditional surveys show that it's about once a week.

However, the author says this is exaggerated (yeah, is it surprising that people exaggerate about this?) and it's actually only about 30 times a year--or once every 12 days.

So there are a lot of searches on "sexless" or "won't have sex with me."

Observing that "sex can be quite fun," he questions, "why do we have so little of it?"

And he concludes that it's because we have "enormous anxiety" and insecurity about our bodies and sexuality.

Again, you probably wouldn't need data mining to guess the results, but men's biggest worry is about their penis size, and one of women's most toxic worries--a "strikingly common concern"--is about the smell of their vagina.

For men, they actually google questions about genital size more often than they have questions about any other body part; in fact, more than "about their lungs, liver, feet, ears, nose, throat, and brain combined."

So much for health consciousness versus machismo pride. 

The funny thing is apparently women don't seem to care so much about this with only about 1 search on this topic for every 170 searches that men do on this. 

Surprising to most men, about 40% of the searches women do conduct on this topic are "complaints" that it is too big!

Not that size doesn't matter to women, but for them it's about the size of their breasts and butts--and again, bigger being generally considered better.

In this case, most men seem to agree. 

Another issue men are concerned about is premature ejaculation and how to make the experience last longer.

However, here women seem to be looking for information about half and half on how to make men climax more quickly on one hand, and more slowly on the other. 

Overall, men are from Mars and women from Venus, with lots of misunderstanding between the sexes.

The conclusion from this big data study...everyone calm down and just try to enjoy each other more.

Amazing the insights we can get from data mining! ;-)

(Source Photo: here with attribution to Daniel)

September 1, 2014

You're Probably Not A 10

There is a review online for nearly everything...from sources such as Amazon to Yelp, Angie's List, IMDb, and more. 

But what you may not realize is that the knife cuts both ways...you are not only the reviewer, but the subject of reviews.

And if you're not all that...then everyone can know it!

The New York Times has an opinion piece by Delia Ephron about how report cards are no longer just for kids, and that they are "for the rest of my life...[and] this is going on your permanent record."

From cabbies who won't pick you up because you've been rated a bad fare, to your therapist who says you can't stop obsessing, restaurants that complain you refused to pay for the chopped liver, the department store that says you wasted their salesperson's time and then bought online, and even your Rabbi who says you haven't been giving enough to the synagogue lately. 

People hear things, post things, and can access their records online...your life is not private, and who you are, at least in other people's opinion, is just an easy search away. 

In Tweets, Blogs, on Facebook, and even in companies' customer records, you have a personal review and rating waiting for discovery.

Your review might be good, but then again...you are not always at your finest moments and these get captured in databases and on social media.

Data mining or exfiltration of your personal information is your public enemy #1.

Of course, you'd like to think (or wish) that your brand is a 10, but not everyone loves you the way your mother does.  

Too bad you can't tell them, "If I want your opinion, I'll ask for it"--either way, you're gonna hear what people think of you loud and clear. ;-)

(Source Photo: Andy Blumenthal)

July 6, 2012

The Information Is On You

There was a fascinating article in the New York Times (17 June 2012) called: "A Data Giant Is Mapping and Sharing the Consumer Genome."

It is about a company called Acxiom--with revenues of $1.13 billion--that develops marketing solutions for other companies based on its enormous collection of data on everything about you!
 
Acxiom has more than 23,000 servers "collecting, collating, and analyzing consumer data...[and] they have amassed the world's largest commercial database on consumers."

Their "surveillance engine" and database on you is so large that they:

- "Process more than 50 trillion data 'transactions' a year."
- "Database contains information about 500 million active consumers."
- "About 1,500 data points per person."
- Have been collecting data for 40 years!

Acxiom is the slayer of the consumer big data dragon--doing large-scale data mining and analytics using publicly available information and consumer surveys.

They collect data on demographics, socio-economics, lifestyle, and buying habits and they integrate all this data.

Acxiom generates direct marketing solutions and predictive consumer behavior information.

They work with 47 of the Fortune 100 as well as the government after 9/11.

There are many concerns raised by both the size and scope of this activity.
 
Firstly, as to the information itself relative to its:

- Privacy
- Security

Secondly, regarding the consumer in terms of potential: 

- Profiling
- Espionage
- Stalking
- Manipulation 

Therefore, the challenge of big data is a double-edged sword: 

- On one hand, we have the desire for data intelligence to make sense of all the data out there and use it to maximum effect.
- On the other hand, we have serious concerns about privacy, security, and the potential abuse of power that the information enables. 

How we harness the power of information to help society but not hurt people is one of the biggest challenges of our time. 

This will be an ongoing tug of war between the opposing camps until hopefully, the pendulum settles in the healthy middle, that is our collective information sweet spot. 

(Source Photo: Andy Blumenthal)



March 31, 2012

Which Big Brother

About a decade ago, after the events of 9/11, there was a program called Total Information Awareness (TIA) run out of the Defense Advanced Research Projects Agency (DARPA).

The intent was to develop and use technology to capture data (lots of it), decipher it, link it, mine it, and present and use it effectively to protect us from terrorists and other national security threats. 

Due to concerns about privacy--i.e. people's fear of "Big Brother"--the program was officially moth-balled, but the projects went forward under other names.  

This month Wired (April 2012) reports that the National Security Agency (NSA) has almost achieved the TIA dream--"a massive surveillance center" capable of analyzing yottabytes (10 to the 24th bytes) of data that is being completed in the Utah desert. 

According to the article, the new $2 billion Utah Data (Spy) Center is being built by 10,000 construction workers and is expected to be operational in a little over a year (September 2013), and will capture phone calls, emails, and web posts and process them by a "supercomputer of almost unimaginable speed to look for patterns and unscramble codes."

While DOD is most interested in "deepnet"--"data beyond the reach of the public" such as password protected data, governmental communications, and other "high value" information, the article goes on to describe "electronic monitoring rooms in major US telecom facilities" to collect information at the switch level, monitor phone calls, and conduct deep packet inspection of Internet traffic using systems (like Narus).

Despite accusations of massive domestic surveillance at this center, Fox News (28 March 2012) this week reported that those allegations have been dismissed by NSA. The NSA Director himself, General Keith Alexander, provided assurances at congressional hearings the prior week that the center was not for domestic surveillance purposes, but rather "to protect the nation's cyber security," a topic that he is deeply passionate about. 

Certainly new technologies (especially potentially invasive ones) can be scary from the perspective of civil liberties and privacy concerns.

However, with the terrorists' agenda very clear, there is no alternative but to use all legitimate innovation and technology to our advantage when it comes to national security--to understand our enemies, their networks, their methods, their plans, to stop them, and take them down before they do us harm.

While it is true that the same technologies that can be used against our enemies can also be turned against us, we must, through protective laws and ample layers of oversight, ensure that this doesn't happen. 

Adequate checks and balances in government are essential to ensure that "bad apples" don't take root and potentially abuse the system, even if that is the exception and not the rule. 

There is a difference between the big brother who is there to defend his siblings from the schoolyard bully or pulls his wounded brother in arms off the battlefield, and the one who takes advantage of them.

Not every big brother is the Big Brother from George Orwell's "1984" totalitarian state, but if someone is abusing the system, we need to hold them accountable. 

Protecting national security and civil liberties is a dual responsibility that we cannot wish away, but which we must handle with common sense and vigilance.  

(Source Photo: here)


February 25, 2012

The False Information G-d

The amount of data in the world is exploding and yet the belief in G-d is evaporating.  

A review in the Wall Street Journal (22 February 2012) of a book called "Abundance" points to the explosion of data with the prevalence of information technology. 

From the earliest civilization to 2003, all written information totaled 5 exabytes (an exabyte is a quintillion bytes or 1 followed by 18 zeros).

But this is nothing compared to the last number of years, "where the change is not just accelerating, but the rate of acceleration of change is itself accelerating."

Between 2003 and 2010, 5 exabytes of digital information was created every 2 days, and by next year, 5 exabytes will be produced every 10 minutes!
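
To put those two rates side by side, here is a quick back-of-the-envelope calculation. It takes the post's figures as given (5 exabytes every 2 days versus 5 exabytes every 10 minutes) and simply converts both to exabytes per day; the function name is just for illustration.

```python
# Back-of-the-envelope comparison of the two data-creation rates
# quoted in the post. The figures are taken as given, not verified.
EXABYTE = 10**18  # bytes: 1 followed by 18 zeros
MINUTES_PER_DAY = 24 * 60


def exabytes_per_day(exabytes, period_minutes):
    """Convert 'N exabytes every M minutes' into exabytes per day."""
    return exabytes * MINUTES_PER_DAY / period_minutes


rate_every_two_days = exabytes_per_day(5, 2 * MINUTES_PER_DAY)
rate_every_ten_minutes = exabytes_per_day(5, 10)
speedup = rate_every_ten_minutes / rate_every_two_days

print(rate_every_two_days)     # 2.5 exabytes per day
print(rate_every_ten_minutes)  # 720.0 exabytes per day
print(speedup)                 # 288.0 times faster
```

In other words, going from "every 2 days" to "every 10 minutes" is a 288-fold jump in the daily rate.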

Similarly, Wired Magazine (March 2012) reports in an interview with George Dyson that the "Digital Universe" is growing organically and "cycling faster and faster and it's way, way, way more than doubling in scale every year. Even with the help of [tools like] Google and YouTube and Facebook, we can't consume it all."

According to ComputerWorld (13 February 2012) in "Your Big Data To-Do List," with all this data being generated, there is a mistaken assumption that we have to consume it all like drinking from a firehose. The article references a McKinsey study that projects that by 2018, there will be a need for 140,000 to 190,000 additional data analysts and statistical experts to try and make sense of it all. The article suggests that instead of trying to grasp at all the data, we instead "data scoop" and "target projects that can showcase results as opposed to opting for the big-bang, big-data projects."

And tools are being developed and deployed to try to get our arms around the information rolling in around the world. For example, Bloomberg Businessweek (November 28 - December 4, 2011) describes the tool from Palantir being used by the military, intelligence, and law enforcement agencies for data mining, link analysis, and even predictive analytics. 

These days, "It's like plugging into the Matrix"--in terms of the amount of data streaming in. One special forces member in Afghanistan describes it as follows: "The first time I saw it, I was like Holy crap. Holy crap. Holy crap." But the thinking nowadays is that with tools like Palantir (and others), we "can turn data landfills into gold mines."

But while information is power, Harvard Business Review (September 2011) in "Learning to Live With Complexity" acknowledges "We are hampered by cognitive limits." And moreover, "Most executives think they can take in more information than research suggests they can." And harnessing data into information is constrained by the complexity involved--driven by the number, interconnections, and diversity of interacting elements.

The result is that while we are becoming in a sense data rich, in many ways, we are still information poor. And even with all the sensors, data, and tools available to search, access, and analyze it, we are becoming perhaps overconfident in our ability to get our arms around it all and in turn master the world we live in. 

The hubris in our abilities to use information technology is leading many to worship the proverbial information G-d, and in turn, they are forgetting the real one. According to the Pew Forum on Religion & Public Life (February 2010) as quoted by CNN, "young Americans are significantly less religious than their parents and grandparents were when they were young."  Moreover, a full one in four American millennials--those born after 1980--are not affiliated with any faith--they are agnostics or atheists.

Similarly, Mental Floss Magazine just a few months ago (November-December 2011) had columns by various authors, one asking "Is G-d In Our Genes?" and another "Is G-d In a Pill?", questioning whether the age-old belief in G-d comes either from a genetic disposition in some or a drug-induced state in others.

While religion is a personal matter, and for a long time people have argued whether more people have died in wars over religion or money and power, as a person who believes in G-d, I find it most concerning that with the rise of information (technology) power in the last 30 years, and the exuberance and overconfidence generated from this, there is an associated decline in belief in G-d himself.  

While technology has the potential to raise our standard of living (in leaps and bounds even) and help solve many of our vexing problems, we cannot forget that technology is run by human beings who can choose to be good or evil and use information technology to either better mankind or the opposite, to destroy it. 

Ultimately, I believe that it is but G-d almighty who shapes the thoughts and destiny of mankind, so that one man sees just a string of bits and bytes--a matrix of zeros and ones--while another sees a beautiful new musical composition, the next terrorist attack, or even the amazing cure for cancer.  

(Source Photo: here)


September 3, 2007

Business Intelligence and Enterprise Architecture

“Business intelligence (BI) refers to applications and technologies that are used to gather, provide access to, and analyze data and information about company operations. Business intelligence systems can help companies have a more comprehensive knowledge of the factors affecting their business, such as metrics on sales, production, internal operations, and they can help companies to make better business decisions.” (Wikipedia)

Business intelligence includes warehousing data and mining data (sorting large amounts of data to find relevant information). Metadata (data about data) aids in the mining of useful nuggets of information. The warehousing and mining of data for business intelligence is often referred to as a decision support system.

User-centric EA is business (and technology) intelligence!

  • EA is a knowledge base and warehouse of information: BI warehouses data for decision support applications in the organization. Similarly, EA synthesizes and stores business and technical information across the enterprise to enable better decision making. EA uses applications like System Architect, DOORS, Metis, Rational, and others to capture information in a relational database repository and model business, data, and systems. The intent is to develop a knowledge base for the capture, mining and analysis of data to enhance IT planning and governance.
  • EA provides for mining, querying, and reporting: BI tools use online analytical processing (OLAP) tools like Cognos, BusinessObjects, Hyperion, and SAS that utilize multi-dimensional database cubes for manipulating data into different views, and provide for analysis and reporting. Similarly, User-centric EA provides for analysis and reporting of performance measures, business functions, information requirements, applications systems, technology products and standards, and security measures. While EA tools are more limited than general BI tools in terms of OLAP capabilities like online queries, I believe that these tools will move in this direction in the future.
  • EA uses information visualization to communicate effectively: BI tools provide executive dashboard capabilities for displaying executive information in a user-friendly GUI format. Like an executive dashboard, EA often displays business and technology information in profiles and models that make extensive use of information visualization to communicate effectively and at strategic, high-level views to decision makers.
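
To illustrate the idea of a multi-dimensional cube that can be manipulated into different views, here is a minimal sketch in plain Python. The sales rows, dimension names, and the `roll_up` helper are all made up for illustration; this is not how Cognos, BusinessObjects, Hyperion, or SAS work internally, just the basic roll-up concept.

```python
from collections import defaultdict

# Made-up fact rows for illustration: (region, product, quarter, sales).
FACTS = [
    ("East", "Widgets", "Q1", 100),
    ("East", "Gadgets", "Q1", 150),
    ("West", "Widgets", "Q1", 200),
    ("East", "Widgets", "Q2", 120),
    ("West", "Gadgets", "Q2", 80),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, *dims):
    """Sum sales grouped by the chosen dimensions (one view of the cube)."""
    positions = [DIMENSIONS.index(dim) for dim in dims]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[pos] for pos in positions)
        totals[key] += row[-1]  # the last column is the sales measure
    return dict(totals)


# The same data, sliced into two different views.
by_region = roll_up(FACTS, "region")
by_region_quarter = roll_up(FACTS, "region", "quarter")

print(by_region)          # {('East',): 370, ('West',): 280}
print(by_region_quarter)
```

The same rows answer different questions depending on which dimensions you group by, which is the kind of flexible querying that EA repositories could borrow from BI tools.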

It is the role of the chief enterprise architect to sponsor, communicate, and educate on the use of EA for business and technology intelligence in the organization.

