Background

The changing holy grail of software

Wisdom lies in including unknowns, not eliminating them. This isn’t philosophy but the new world order for programmers.


Srikumar Subramanian

Director of technology, Imaginea

Ever wondered how much energy we spend on a Google search? Answer: about as much as a human body consumes in 10 seconds. It takes eight different organisations to create a database containing images of fish, yet it takes relatively little effort and expense to devise a machine learning model and make predictions from it.
That is, the effort expended in collecting data exceeds the effort spent writing code on top of it to gain insights. Code isn’t valuable in and of itself. It’s a commodity. More often than not, the insights that code generates are only as good as the reference dataset it works from. Data is valuable. For most startups, data is the key driver because understanding customers comes first. Explicit coding creates dead applications from day one.
Imagine that your task is to write a function that parses a piece of text and detects any indication of time in it: timestamps, time indicators (today, tomorrow, etc.), numbers, numbers that follow a pattern, and so on.
This is a classic fuzzy problem. We can’t solve it by building a data set of every pattern that could indicate time; that exercise is tiring and can never be exhaustive. Most solutions to such a problem depend on a readily available labelled data set that separates right events from wrong ones. And that’s data, not code. The problem can’t be solved perfectly: an always-ideal output is impossible, and no test case for such a function can ever be called all-encompassing.
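To make the point concrete, here is a minimal sketch of the explicit-coding approach, assuming a handful of hand-written patterns. The pattern list, the function names and the six labelled sentences are all hypothetical and chosen only for illustration. The sketch shows how a small labelled data set exposes what the rules miss.

```python
import re

# Hypothetical, deliberately incomplete rules: the list can grow forever
# and still not cover every way people write about time.
TIME_PATTERNS = [
    re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b"),             # 14:30, 09:15:00
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),                     # 2021-03-04
    re.compile(r"\b(?:today|tomorrow|yesterday|tonight)\b", re.IGNORECASE),
    re.compile(r"\b\d{1,2}\s?(?:am|pm)\b", re.IGNORECASE),    # 5pm, 11 AM
]


def mentions_time(text: str) -> bool:
    """Return True if any hand-written pattern matches the text."""
    return any(pattern.search(text) for pattern in TIME_PATTERNS)


# A tiny, made-up labelled data set: (text, does it really mention time?)
LABELLED_EXAMPLES = [
    ("Let's meet tomorrow", True),
    ("The build finished at 14:30", True),
    ("Call me at 5pm", True),
    ("See you in a fortnight", True),           # no rule covers "fortnight"
    ("Mix the paint in a 10:30 ratio", False),  # looks like a clock time, isn't
    ("I like green tea", False),
]


def misclassified(examples):
    """Return the texts where the rules disagree with the human label."""
    return [text for text, label in examples if mentions_time(text) != label]


if __name__ == "__main__":
    for text in misclassified(LABELLED_EXAMPLES):
        print("wrong:", text)
```

On this toy data the rules get two of the six sentences wrong, one miss and one false alarm. Adding another pattern fixes one sentence and leaves the next one waiting; the labelled examples, not the code, are what tell us how well the function works.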
What does all this mean? Developers need a “we don’t know” perspective. The hacker mindset of diving into code to get a quick prototype is outdated. User research for insights should be table stakes. Developers are data-starved, even though collecting and analyzing data to solve a problem is a necessary skill.

“Collating a data set is a first level issue in solving any complex engineering problem. There is no glory in algorithm. A vast majority need to be working on data.”

Not all problems can be solved by explicit coding alone. The need for data-sensitive programmers is going to increase in the future. The effort to understand the large data sets on which the lines of code work is key. Data being unpredictable isn’t a valid reason to underplay its role. Technology can have unpredictable consequences for society, but we don’t stop ideating about technology, do we? Similarly, thinking about data is essential and, in fact, will be a rallying cry for programmers in the future.