In 2012, Davenport and Patil’s Harvard Business Review article, “Data Scientist: The Sexiest Job of the 21st Century,” raised the profile of a profession that had been evolving naturally in the modern computing era – an era in which data and computing resources are more abundantly and cheaply available than ever before. Industry leaders have also shifted towards a more open, evidence-based approach to guiding the growth of their businesses. Brilliant data scientists with machine learning and artificial intelligence expertise are invaluable in supporting this new normal.

While there are different opinions on what defines a data scientist, as the leader of the Data Science Practice at Think Big Analytics, the consulting arm of Teradata, I expect data scientists on my team to embody specific characteristics. This expectation is founded on a simple question – are you having a measurable and meaningful impact on the business outcome?

Any data scientist can dig into data, use statistical techniques to find insights and make recommendations for their business partners to consider. A good data scientist makes sure that the business adopts those insights and recommendations by focusing on the problems that are important to the company and making a compelling case grounded in business value. An impactful data scientist can iterate quickly, address a wide variety of business problems for the organization and deliver meaningful business impact swiftly by using automation and getting their insights integrated into production systems. Consequently, impactful data scientists more often answer ‘yes’ to the question above.

So what makes a Data Scientist impactful? In my experience, they possess skillsets that I broadly characterize as that of a scientist, a programmer, and an effective communicator. Let us look at each of these in turn.


Firstly, they are scientists. Data scientists work in highly ambiguous situations and operate on the edge of uncertainty. Not only are they trying to answer the question, they often have to determine what the question is in the first place. They have to ask vital questions to understand the context quickly, identify the root of the problem that is worth solving, research and explore the myriad of possible approaches, and most of all manage the risk and impact of failure. If you are a scientist or have undertaken research projects, you will recognize these as traits of a scientist immediately.

In addition, data scientists are also programmers. Traditional mathematicians, statisticians, and analysts who are comfortable using GUI-driven analytical workbenches that let them import data and build models with a few clicks often contest this expectation. They argue that they don’t need computer science skills because they are supported by (a) a team of data engineers to find and cleanse their data, and (b) software engineers to operationalize their models by re-writing them for the production environment. However, what happens when the data engineers are busy, or the IT department’s sprint backlog means the model a data scientist has just found to make the company millions won’t reach production for the next 6-9 months? They wait, and their amazing insights have no impact on the business.

Programming and computer science skills are essential for data scientists so that they are not ‘blocked’ by organizational constraints. A data scientist shouldn’t have to wait for someone else to find and wrangle the data they need, nor be afraid of getting their hands dirty with code to ensure their models make it to production. These skills also keep data scientists from becoming a bottleneck to their organization, because they can automate their solutions for production or automated reporting. The highly distributed, high-volume transactions in online, mobile and IoT applications also mean data scientists need to design their solutions for scale. For example, will their real-time personalization model scale to the 100,000 requests per second their company’s website and mobile app receive?

Finally, a data scientist should be an effective two-way communicator. Not only should they empathize to understand the business context and customer needs, they should also convey the value of their work in a manner that appeals to their audience. One of the hardest skills for some knowledgeable data scientists to master is the ability to influence the organization without authority. A data scientist who goes around asserting that everyone should listen to them because they have data and insights, without cultivating trust, is likely to earn the title of prima donna and not achieve the impact those insights could have. Effective communication is relatable, precise and concise.

Data scientists with these three broad skillsets are in an excellent position to have a meaningful and measurable impact on business outcomes, making them highly valuable to any organization. Of course, this list doesn’t cover innate abilities like creativity, a bias for action and a sense of ownership. Neither does it consider the organizational culture that may either support or hinder their impact. I have focused on skills that can be developed through training and practice. In fact, these are essential elements of the growth and career paths for my team of brilliant and impactful data scientists at Think Big Analytics.


Just before I flew back to Seattle last week, I gave a talk at my alma mater – the School of Computer Science & Engineering at UNSW, Australia. It was great to see some familiar faces and meet some new ones who I hope now feel more compelled to tackle some interesting problems in data science, machine learning (ML) and artificial intelligence (AI).

In this talk, I shared some of the personal lessons that I learnt while building AI & ML solutions at companies like Amazon and Oracle. I also opened up about my fears of these technologies, as well as the challenges that the industry faces in delivering intelligent systems for the 99% (?) of businesses. You can find the slides from the talk (PDF) for the references and links that I mentioned. Just send an email to ( avishkar @ gmail dot com) with the subject “AI & ML” to get the password to the PDF.

The most important message that I wanted to impart to the room full of researchers, academics, and industry practitioners was how we can collectively address the shortage of skills needed to develop AI and ML solutions for the broad range of business problems beyond the top 1% of leading-edge tech companies. Education, standards and automated tools can help ensure a certain base level of competency in the application of AI & ML.

The vast majority of businesses out there are not Google, Amazon or Facebook, with deep pockets and years of R&D experience to tackle the challenge of applying AI and ML. Everyone responsible for growing this field, from schools (i.e. universities) to industry, must also develop standards and tools that ensure a certain level of quality is maintained for the solutions we put into production. We have standards in mechanical and civil engineering to ensure that things that can impact people’s lives and safety adhere to a certain quality level. Similarly, we should develop standards, and encourage organizations to validate compliance with them, when it comes to developing AI & ML solutions with far-reaching consequences.


A simple and very personal example: one of my own photos was rejected by the automated checks that verify a passport photo complies with visa requirements. The fact that the slightly “browner” version of me (left) failed the check suggests an inherent bias in the system due to the kind of data used to build it. Funny, but scary. How many other “brown” people have had their photos rejected by such a system?

Other examples would be Human Resource systems that identify potential candidates, suggest hire/no-hire decisions or recommend salary packages for new hires. If such a system is trained on historical data and uses gender as a feature, is it possible that it could be biased against women for high-profile or senior positions? After all, women have historically been under-represented in senior positions. Standards and compliance verification tools can help us identify such biases, ensuring that data and models do not introduce biases that are unacceptable in a modern and equitable society.
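As a tiny illustration of what such a verification check could look like (this is a generic sketch, not any specific standard or tool; the column names and data are hypothetical), one of the simplest tests compares a model’s positive-outcome rate across a protected attribute:

```python
# Minimal sketch of a demographic-parity check on hypothetical model outputs.
import pandas as pd

def selection_rates(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.Series:
    """Fraction of positive outcomes (e.g. 'recommend for interview') per group."""
    return df.groupby(group_col)[outcome_col].mean()

def demographic_parity_gap(df: pd.DataFrame, group_col: str, outcome_col: str) -> float:
    """Largest difference in selection rate between any two groups."""
    rates = selection_rates(df, group_col, outcome_col)
    return float(rates.max() - rates.min())

# Hypothetical model outputs: 1 = recommended for interview, 0 = rejected.
candidates = pd.DataFrame({
    "gender": ["F", "F", "F", "M", "M", "M", "M", "F"],
    "recommended": [0, 1, 0, 1, 1, 1, 0, 0],
})

print(selection_rates(candidates, "gender", "recommended"))
gap = demographic_parity_gap(candidates, "gender", "recommended")
print(f"Demographic parity gap: {gap:.2f}")  # flag if above an agreed tolerance
```

A real compliance process would go much further (checking proxies for protected attributes, error rates per group, and so on), but even a check this simple makes the bias question explicit rather than implicit.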

Academics, researchers, and industry practitioners cannot absolve themselves of the duty of care and consideration when developing systems that have a broad social impact. Data scientists must think beyond the accuracy metric and consider the whole ecosystem in which their system operates.

Image Credit:

  • Modeling API by H Alberto Gongora from the Noun Project
  • education by Rockicon from the Noun Project
  • tools by Aleksandr Vector from the Noun Project
  • Checklist by Ralf Schmitzer from the Noun Project

The super secret exciting project that I spent days and nights slogging over when I was at Amazon has finally been announced – Amazon Go. A checkout-less, cashier-less, magical shopping experience in a physical store. Check out the video to get a sense of how it simplifies the CX: walk in, pick up what you need and walk out. No lines, no waiting, no registers.

I’m very proud of an awesome team of scientists & engineers covering software, hardware, electrical and optics who rallied together to build a remarkable solution combining machine learning, computer vision, deep learning and sensor fusion. The project was an exercise in iterative experimentation and continual learning, refining all aspects of the hardware and software as well as the innovative vision algorithms. I was personally involved in 5 different prototypes and the winning solution that ticked all the boxes more than 2 years ago.

I remember watching Jeff Bezos and the senior leadership at Amazon playing with the system, picking items up and returning them to the shelves. Smiles and high-fives all around as products were added to and removed from the shopper’s virtual cart, with the correct quantity of each item.

Needless to say, there is a significant effort after the initial R&D to move something like this to production, so it is not surprising that it has taken 2 years since then to get it ready for the public. Well done to my friends at Amazon for getting the engineering solution over the line to an actual store launch in early 2017.

Photo Credit: Original Image by USDA – Flickr

 

In March 2016, Google’s AlphaGo beat the world champion Lee Sedol at the game of Go, a feat hailed as an important milestone for Artificial Intelligence (AI). It was also a big deal for Deep Learning and Reinforcement Learning. But what was the big deal?

Let’s start with a simple game of Noughts and Crosses, also known as Tic-Tac-Toe: a game played on a 3×3 grid by 2 players placing O (noughts) and X (crosses) in turn, with the objective of getting 3 noughts or crosses in a row.

Source: Wikipedia

Naive counting leads to 19,683 possible board layouts (3^9, since each of the nine spaces can be X, O or blank), and 362,880 (i.e., 9!) possible games (different sequences for placing the Xs and Os on the board). – Wikipedia

Now we (i.e. humans) play the game without enumerating all possible board layouts or exploring all possible games. However, that is how computers are typically programmed to play. After each move, the computer generates a tree of moves, where each branch represents a sequence of moves. It expands the tree to a certain depth, identifies the branch most likely to lead to a victory, and selects that as its next move. The process repeats after the other player moves, until the computer or the human wins. This brute-force search-and-prune strategy is fundamentally how Deep Blue beat Garry Kasparov in 1997, and the game of chess has about 10^43 legal positions.
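Here is a minimal sketch of that “generate the game tree and pick the best branch” idea, applied to noughts and crosses. Real engines such as Deep Blue add depth limits, handcrafted evaluation functions and aggressive pruning on top of this basic recipe.

```python
# Brute-force game-tree search (minimax) for noughts and crosses.
import math
from functools import lru_cache

assert 3 ** 9 == 19_683               # possible board layouts (X, O or blank per square)
assert math.factorial(9) == 362_880   # possible orderings of the nine moves

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board: str):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)  # cache repeated positions so the full tree stays cheap
def minimax(board: str, player: str):
    """Best (score, move) for `player` to move: +1 win, 0 draw, -1 loss."""
    if winner(board):                  # the previous player just completed a line
        return -1, None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None                 # board is full: a draw
    opponent = "O" if player == "X" else "X"
    best_score, best_move = -math.inf, None
    for m in moves:
        child = board[:m] + player + board[m + 1:]
        score = -minimax(child, opponent)[0]   # opponent's best reply is our worst case
        if score > best_score:
            best_score, best_move = score, m
    return best_score, best_move

print(minimax(" " * 9, "X"))  # score 0: perfect play from an empty board is a draw
```

This only works because noughts and crosses is tiny; the numbers below show why the same idea collapses for Go.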

Now let us look at the game of Go.

There are 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible positions—that’s more than the number of atoms in the universe, and more than a googol times larger than chess. – Google Blog

With 10^171 possible positions in Go (I counted the zeros), the brute-force search used by Deep Blue to beat Kasparov in chess just won’t work. So AlphaGo had to use intuition when selecting between moves, a bit like the way humans do.

In this context, intuition means working with limited information and taking a shortcut by selecting only a tiny subset of options to arrive at a good move. It is similar to the notion of “thin-slicing” that Malcolm Gladwell discusses in his book Blink [ kindle ]. Intuition is a function of experience and practice – leading to both desirable efficiencies and undesirable biases in our daily decision making, but that is a different post for another day.

During the match against Fan Hui, AlphaGo evaluated thousands of times fewer positions than Deep Blue did in its chess match against Kasparov; compensating by selecting those positions more intelligently, using the policy network, and evaluating them more precisely, using the value network—an approach that is perhaps closer to how humans play. Furthermore, while Deep Blue relied on a handcrafted evaluation function, the neural networks of AlphaGo are trained directly from gameplay purely through general-purpose supervised and reinforcement learning methods. – Nature Paper

This means that, given data for supervised learning (i.e. deep learning) and time to practice (i.e. reinforcement learning), AlphaGo could continue to get better and better without relying on human experts to come up with heuristics or evaluation functions.
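To make the division of labour between the two networks concrete, here is a deliberately simplified sketch. The `policy_net`, `value_net`, `legal_moves` and `apply_move` names are placeholders I have invented for illustration; the real AlphaGo combines its networks with Monte Carlo tree search rather than the single greedy step shown here.

```python
# Illustrative only: a policy network narrows the search to a few promising
# moves, and a value network scores the resulting positions. All four callables
# are hypothetical stand-ins, not AlphaGo's actual components.
def select_move(position, policy_net, value_net, legal_moves, apply_move, top_k=5):
    # 1. Policy network: a probability for every legal move; keep only the top few.
    probs = policy_net(position)                                  # {move: probability}
    candidates = sorted(legal_moves(position),
                        key=lambda m: probs.get(m, 0.0), reverse=True)[:top_k]
    # 2. Value network: estimated winning chance of the position after each candidate.
    scored = [(value_net(apply_move(position, m)), m) for m in candidates]
    # 3. Play the candidate whose resulting position is judged best.
    return max(scored, key=lambda s: s[0])[1]
```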

This was an extremely promising sign for Artificial Intelligence (AI), taking us one tiny, tiny (did I mention tiny?) step towards general-purpose AI. Deep learning reduces the feature engineering that would typically go into setting up a machine learning algorithm. Reinforcement learning mimics the practice or experimentation that we as humans use when learning how to play tennis, play an instrument or make a drawing.

Learning to mimic intuition with deep learning, without explicit feature engineering, and honing it through practice via reinforcement learning offers a very interesting template for teaching machines how to deal with the kinds of problems that humans excel at with their intuition. It also represents a way for machines to operate on large search-space problems (e.g. travelling salesman problems, scheduling, and optimization) and get reasonably good solutions given a fair trade-off of time and compute resources. This is why AlphaGo is a big deal.


What is Artificial Intelligence?

Artificial Intelligence (AI) refers to a machine or system behaving with some degree of intelligence in its actions. These actions could be as simple as turning on the sprinkler when the soil moisture drops below a certain level, or as complex as recognizing objects or faces in images. AI covers a very broad range of capabilities, and while there is no general-purpose AI to rival human intelligence (yet), AI systems are extremely useful when it comes to assisting humans or automating tasks at scale.

What is Machine Learning?

In AI, knowledge is essential for computers to make intelligent decisions. This knowledge can be explicitly expressed (think business rules or even software programs) by humans. Alternatively, the knowledge can be learnt automatically from data via the process of induction. Machine learning is the set of techniques and algorithms that allow computers to learn (or discover) the underlying knowledge from data.
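To make the contrast concrete, here is a toy sketch (assuming scikit-learn is installed; the “sprinkler” rule and the readings are made up for illustration) of the same piece of knowledge first expressed explicitly by a human and then induced from examples instead:

```python
# Knowledge written down explicitly by a human:
def should_water(moisture: float) -> bool:
    return moisture < 0.3        # hand-written business rule: water below 30% moisture

# The same knowledge induced from examples via machine learning:
from sklearn.tree import DecisionTreeClassifier

moisture_readings = [[0.05], [0.10], [0.20], [0.25], [0.35], [0.50], [0.70], [0.90]]
watered           = [1,      1,      1,      1,      0,      0,      0,      0]

model = DecisionTreeClassifier().fit(moisture_readings, watered)
print(model.predict([[0.15], [0.60]]))   # [1 0] – the learned rule mirrors the hand-written one
```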

Why is Machine Learning relevant?

Machine learning is already a part of our lives as we:

  • shop online – product recommendations and advertisements
  • bank – anomaly detection to prevent fraud
  • buy insurance (health, car, home, travel) – used to determine the premium
  • upload photos to Facebook and it automatically tags our friends – computer vision for facial recognition
  • ask Siri the name of the song on the radio – speech & audio recognition
  • search for anything online.

It is becoming an ever greater part of our lives, and the main reason is the growth in the volume, variety, and velocity of data that our society produces. The rate at which we generate and collect data has been growing dramatically – commonly referred to as Big Data. Meanwhile, the cost of storing and processing data has decreased with Big Data technologies.

There is no point in collecting and storing Big Data if we do not put it to good use. This is where computer scientists and engineers have turned to machine learning. Machine learning can help us find new insights and knowledge from the data we collect to improve existing processes or help us invent new products and services.

How does Machine Learning work?

Typically, machine learning involves 2 phases – Training and Prediction.

During training, algorithms analyze the data and build a mathematical model that describes the patterns (or relationships) within the data. The training can be supervised, where the algorithms look to maximize/minimize a specific goal with labeled data. Alternatively, they can be unsupervised where algorithms focus on just describing the data as succinctly as possible. The goal is to produce the smallest and simplest model possible that describes the patterns in the data, guided by the principle of Occam’s Razor.

Once learnt, the mathematical model can then be used to understand the relationship or make predictions about the unknown (i.e. the future, new product or customer). During the prediction phase, we exploit the discovered knowledge for our goals. The goals might be to recommend products, identify the right combination of drugs suitable to help a patient or prevent the waste of water during a drought.
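As a minimal illustration of the two phases (using scikit-learn and a made-up dataset, both purely assumptions for the sketch):

```python
from sklearn.linear_model import LogisticRegression

# --- Training phase: learn a model from labelled historical data ---
past_orders  = [[1, 20.0], [3, 55.0], [0, 5.0], [5, 90.0]]   # [items, basket value]
was_returned = [0, 1, 0, 1]                                   # label for each past order
model = LogisticRegression().fit(past_orders, was_returned)

# --- Prediction phase: apply the learned model to new, unseen cases ---
new_orders = [[2, 30.0], [4, 80.0]]
print(model.predict(new_orders))         # hard yes/no predictions
print(model.predict_proba(new_orders))   # probabilities, useful for business decisions
```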

Further Viewing & Exploration

Here are some useful links to get you started:

The human visual attention system is fascinating in how it guides or draws the human gaze, whether in a movie scene or a supermarket aisle as we walk through it. To understand why it is so interesting, consider that the human eye has only a tiny region that it sees in really high resolution. This area is called the fovea; it is about 0.3 mm in size and covers a region about the size of a thumbnail held at arm’s length. Everything outside this region is seen in low resolution. Yet when we look around, we seem to see everything in perfect detail. How is this possible?

Well, this is possible because your eyes are constantly moving and filling in the gaps in the information you have about the real world. These movements are called saccades, and they give different parts of the scene or image time under the fovea’s high-precision image processing. Although the fovea covers only about 1% of the retina, it uses more than 50% of the processing power in the human visual cortex.

So the big question in computer vision is: what is the control mechanism behind these movements? If we can model it accurately, we open up fantastic applications in robotics, medicine, design, art and many more. We could optimize the computational resources on a robot to process only the most relevant image data, or use the model to better design ads and product placements in a shop to ensure they catch the shopper’s eye.

There are two main schools of thought behind the control of the human gaze:

  1. There is something salient within the scene that draws the attention. For example, bright colours and yellows tend to draw attention more than dull colours, or motion draws us to pay attention to something that is moving when everything else is still.
  2. The eyes move with the intent to learn more about the scene itself. We don’t know what is there, so the brain needs to fill in that part of its world model by gathering information about it. Here the eye movements are controlled by task and intent.

So I decided to capture data on how my own gaze is drawn to things happening in a scene. To do this I built a rather stylish contraption – a bike helmet with an LCD screen at the front and a webcam pointing at my eye. Okay, not very stylish, but it’s a very cheap prototype.

Homemade Gaze Tracker – LCD screen in the front, small webcam pointing at the eye, mounted on a bike helmet.

That little black clip is holding the CMOS chip of a webcam pointed straight at my left eye. Below is a screenshot of what it sees. Note that it isn’t a very crisp image: there are regions of shadow and light, as well as a reflection of the external light source off the eye itself.

View of the video of the eye used to track the gaze.

I then used the awesome software written by the folks at OpenEyes. They had written the software to work with infrared cameras (which I didn’t have), so I tweaked it to add a pre-processing step that works with natural light rather than infrared. Infrared makes life so much easier.

Then, after some calibration (which maps the position of my eye in the camera image to the spot I was looking at on the LCD screen in front of me), it was time to sit back and watch some videos while recording where my gaze went.
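For the curious, the general idea behind such a calibration can be sketched in a few lines of numpy. This is my own minimal illustration with invented coordinates, not the actual OpenEyes routine: fit a polynomial mapping from pupil position in the eye image to gaze position on the screen, using a handful of known calibration targets.

```python
import numpy as np

def poly_features(pupil_xy: np.ndarray) -> np.ndarray:
    """Second-order polynomial features of the pupil coordinates."""
    x, y = pupil_xy[:, 0], pupil_xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

# Hypothetical calibration data: pupil centre in the eye image while staring
# at known targets on the LCD screen.
pupil_points  = np.array([[120, 80], [160, 82], [200, 85],
                          [118, 110], [162, 112], [202, 115],
                          [115, 140], [158, 143], [198, 146]], dtype=float)
screen_points = np.array([[100, 100], [512, 100], [924, 100],
                          [100, 384], [512, 384], [924, 384],
                          [100, 668], [512, 668], [924, 668]], dtype=float)

# Least-squares fit of one polynomial per screen axis.
A = poly_features(pupil_points)
coeffs, *_ = np.linalg.lstsq(A, screen_points, rcond=None)

def pupil_to_screen(pupil_xy):
    return poly_features(np.atleast_2d(np.asarray(pupil_xy, dtype=float))) @ coeffs

print(pupil_to_screen([160, 112]))  # roughly the centre of the screen
```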

Once I had the data, I looked at whether some simple algorithms could predict the gaze on the acquired data. I came up with 4 very simple algorithms that attempted to predict where the gaze would be at a particular point in the scene. Without going into details, they were:

  1. Maintain – tries to maintain the trajectory of the movement of the gaze across the scene.
  2. Skin – tries to find regions in the scene that are likely to be skin tones (and therefore people), using a pixel-level classifier trained incrementally with Ripple Down Rules.
  3. Motion – tries to find the regions with the highest degree of motion with respect to previous frames (a rough sketch of this idea follows the list).
  4. RDR – a Ripple Down Rules based predictor (i.e. an incrementally trained decision tree) that tries to pick between 1, 2 and 3 depending on the scene and historical properties.
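Since the post deliberately skips the details, here is a rough, guessed sketch of what the “Motion” predictor (3) could look like using plain frame differencing. The function name, block size and grayscale frames are all assumptions for illustration, not the original implementation.

```python
import numpy as np

def predict_gaze_from_motion(prev_frame: np.ndarray, frame: np.ndarray,
                             block: int = 16) -> tuple[int, int]:
    """Return the (x, y) centre of the block with the largest frame difference."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    h, w = diff.shape
    best, best_xy = -1.0, (w // 2, h // 2)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            score = diff[by:by + block, bx:bx + block].sum()
            if score > best:
                best, best_xy = score, (bx + block // 2, by + block // 2)
    return best_xy

# Example with two synthetic grayscale frames standing in for video frames.
rng = np.random.default_rng(0)
f0 = rng.integers(0, 255, size=(240, 320), dtype=np.uint8)
f1 = f0.copy()
f1[100:140, 200:240] = 255                 # simulate motion in one region
print(predict_gaze_from_motion(f0, f1))    # a point inside that region
```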

Below is a screenshot of the system in action. The big yellow plus (+) is where I had actually focused while I was watching the movie. Note that none of this was visible to me while my gaze was being captured. When I replayed the video, I let the 4 algorithms attempt to predict my gaze, marked by the smaller pluses and boxes annotated with the algorithm numbers.

Gaze Tracking – Big yellow + is the actual location of the gaze. The smaller + and pink square are the different algorithms’ prediction of the location of the gaze for that frame.

You can watch the short video for yourself here:

Although no single algorithm was good enough to predict the gaze, there were some intuitive indications that motion and skin tended to do better. Then again, this might be biased because the scene I had used had people standing around and talking, with only occasional movement. The most interesting part was working out how to create a training paradigm for RDR with temporal data. The underlying processing pipeline was also interesting and later gave rise to my ProcessNet work [ book chapter, PKAW paper].