WHAT CAN MIGRATORY BIRDS TELL US ABOUT CREATING BETTER AI-BASED PREDICTIVE ANALYTICAL MODELS?
11 Jan 2021
In July 2018, an ornithologist named Christopher Heckscher tweeted a prediction that the severity of the Atlantic hurricane season would be greater than average. This prediction was quite at odds with the forecasts generated by the most sophisticated computer models using decades of meteorological data. It may be pertinent to point out here that Heckscher is an ornithologist, not a data or computer scientist. His prediction was based on a correlation he had observed over two decades between the timing of the veery's migration from the USA to the Amazon forest in Brazil and the severity of the Atlantic hurricane season.
When the hurricane season ended about five months later, Heckscher's prediction turned out to be spot on.
“The birds were saying bad season, and everyone else was saying below-average season,” says Heckscher, an associate professor at Delaware State University. “In the end, the birds were more accurate.”¹
So what has this story
got to do with AI-based prediction models?
To understand this, we need to look at how AI-based prediction models work. In simple terms, an AI model predicts a specific outcome based on underlying factors (known as features in AI parlance). Using one of several alternative techniques, the model establishes a statistical correlation between the factors and the outcome, learned from a training data set. Once the model has been trained, it is given a set of feature values, from which it predicts the outcome.
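Here is a minimal sketch of that train-then-predict flow, using scikit-learn; the data, outcome, and model choice are illustrative assumptions, not a description of any specific system:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic training data: 500 observations, 10 features each.
rng = np.random.default_rng(0)
X_train = rng.random((500, 10))
y_train = 3 * X_train[:, 0] + X_train[:, 1] + rng.normal(0, 0.1, 500)

# Training establishes the statistical correlation between features and outcome.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Once trained, the model is handed new feature values and returns a prediction.
X_new = rng.random((1, 10))
print(model.predict(X_new))
```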
Now, imagine that you are a domain expert in a specific field and want to create such a prediction model. You have data scientists and AI experts who can create the model for you. However, they look to you to help them identify the relevant factors (or features) that need to be considered while creating the model. After all, you understand your field much better than they do.
In such a scenario, it is natural to assume that you would give them every possible factor that you think might play a role. At times, I have seen data sets with upwards of 1,000 such features for predicting a single outcome. The data scientists on your team would then use this data set to try to create a prediction model. However, since winnowing these features down requires very close coordination between the domain experts and the data scientists (possibly months of painful effort), I have seen this step overlooked in most cases.
This is where most prediction models start to go awry. With many irrelevant or cross-correlated features in the training data set, a lot of so-called noise gets introduced into the model, which reduces its accuracy. To reduce this noise, model creators then turn to hyperparameter tuning: tweaking the learning algorithm's settings across a wide array of possibilities and picking the combination that produces the least noise. It is fundamentally like repairing the car engine when the fault lies with the fuel being injected into it.
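To make "tweaking across a wide array of possibilities" concrete, here is a hedged sketch of a grid search over hyperparameters with scikit-learn; the model type, parameter grid, and synthetic data are my own assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data.
rng = np.random.default_rng(0)
X = rng.random((500, 10))
y = 3 * X[:, 0] + X[:, 1] + rng.normal(0, 0.1, 500)

# Try every combination of these settings and keep the one that
# scores best under 5-fold cross-validation.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
print(search.best_params_)  # the least "noisy" parameter combination found
```

Note that this only adjusts the engine, so to speak; the features being fed in stay exactly the same.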
Now, coming back to the
migratory birds’ story.
Given the limited size and capacity of their brains, it is safe to assume that veeries use very few factors to predict the weather conditions that will prevail a few months later. Yet they gave better predictions because, in their case, they acted as both the domain experts and the data scientists. The birds zero in on a few relevant factors, out of the possibly thousands of factors that might affect the weather, and therefore build a better model. Compare this to the computer models that, one would assume, use hundreds or thousands of such factors and, as a result, introduce noise.
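In software, the birds' approach corresponds to feature selection. A minimal sketch, assuming scikit-learn and synthetic data in which only the first two of 200 candidate features actually drive the outcome:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression

# Synthetic data: 200 candidate features, but only the first two carry signal.
rng = np.random.default_rng(0)
X = rng.random((500, 200))
y = 3 * X[:, 0] + X[:, 1] + rng.normal(0, 0.1, 500)

# Score every feature against the outcome and keep only the top 15,
# discarding the ones that contribute mostly noise.
selector = SelectKBest(score_func=mutual_info_regression, k=15)
X_reduced = selector.fit_transform(X, y)

print("kept feature indices:", selector.get_support(indices=True))
print("shape before/after:", X.shape, "->", X_reduced.shape)
```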
We experienced this at Spalba when we were implementing AI use cases that predict an outcome (I cannot give many details here due to IP-related constraints). The data sets we received from domain experts had hundreds of factors, which were used to train the models.
However, using this data set, our trained models gave predictions that were sometimes off by more than 40%, even with the best of hyperparameter tuning, which led to extensive domain-specific discussions and introspection. After weeks of deliberation and back and forth, we were able to reduce the input data set to about 15 key features. This has given us fantastic results, with the average error reduced to less than 7%.
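For readers who want to try the same before-and-after comparison, here is an illustrative sketch on synthetic data (the data, model, and resulting numbers are stand-ins, not our actual data or results):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: only a few of the 200 features carry signal.
rng = np.random.default_rng(0)
X = rng.random((1000, 200))
y = 3 * X[:, 0] + X[:, 1] + 1 + rng.normal(0, 0.1, 1000)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: train on every feature.
full = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Reduced: keep the 15 highest-scoring features, then retrain.
selector = SelectKBest(score_func=mutual_info_regression, k=15).fit(X_train, y_train)
lean = RandomForestRegressor(random_state=0).fit(selector.transform(X_train), y_train)

# Compare average percentage error on held-out data.
print("all features:", mean_absolute_percentage_error(y_test, full.predict(X_test)))
print("15 features :", mean_absolute_percentage_error(y_test, lean.predict(selector.transform(X_test))))
```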
Needless to say, we are now looking for more such nature-based inspirations to drive our software development efforts. Do share if you have any.