DeepSea publishes AI-model accuracy evaluation research

Greece-based maritime tech company DeepSea has published a new research paper outlining a process for verifying the accuracy of ship models generated by artificial intelligence (AI) systems when they are applied in real-world conditions.

The research was carried out by seven of DeepSea’s thirteen-strong team of research scientists, headed up by Dr Antonis Nikitakis.

“This research is an important step in helping our customers and the wider market to understand the true power, while alleviating the limitations, of an AI-based approach,” said Dr Nikitakis.

“Coupled with the daily real-world impact we’re seeing on fuel consumption and CII ratings, we believe this sort of information is key to popularising this incredible technology throughout the industry.”

The research notes that most current AI models provide an estimation of their accuracy based on testing with data obtained from the same distribution as the data used to train the model (i.e., representative of similar conditions and containing similar biases).

For example, if the model is trained on data from the vessel’s historical behaviour, covering a narrow range of common wind speeds or drafts, it is then also tested on data with those same speeds and drafts. As a result, the tests cannot properly assess whether the model is reproducing biases that exist in the training data, or how it might perform in conditions it has never previously encountered.

As an alternative, the researchers proposed an evaluation methodology built around a specifically designed dataset partitioning scheme to expose a model’s robustness to large distributional shifts. The proposed methodology includes analysis of results through the lens of ‘predictive uncertainty’ to assess the model’s fitness in handling uncertain and noisy regions in the modelled dataset.
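To illustrate the general idea of such a partitioning scheme, the sketch below holds out the extremes of one operating variable as an out-of-domain test set and trains only on the common middle band. It is not the scheme from the DeepSea paper, whose details are not given here: the function name `partition_by_feature`, the quantile thresholds, the synthetic wind and fuel data and the straight-line model are all assumptions made for illustration.

```python
import numpy as np

def partition_by_feature(values, low_q=0.2, high_q=0.8, test_frac=0.25, seed=0):
    """Return (train_idx, in_domain_test_idx, out_of_domain_idx).

    Samples whose value lies between the two quantiles count as in-domain;
    everything outside that band is held out as out-of-domain.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.quantile(values, [low_q, high_q])
    in_mask = (values >= lo) & (values <= hi)
    in_idx = np.flatnonzero(in_mask)
    out_idx = np.flatnonzero(~in_mask)

    # Randomly split the in-domain samples into train and test parts
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(in_idx)
    n_test = int(round(test_frac * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test], out_idx

# Synthetic stand-in data (hypothetical, not taken from the paper):
# fuel use grows with the square of wind speed, plus measurement noise.
rng = np.random.default_rng(42)
wind = rng.uniform(0.0, 30.0, size=2000)
fuel = 0.05 * wind**2 + rng.normal(0.0, 0.5, size=wind.size)

train_idx, test_idx, ood_idx = partition_by_feature(wind)

# Deliberately simple model: a straight line fitted on in-domain data only
coeffs = np.polyfit(wind[train_idx], fuel[train_idx], deg=1)

def mean_abs_error(idx):
    return float(np.mean(np.abs(np.polyval(coeffs, wind[idx]) - fuel[idx])))

mae_in = mean_abs_error(test_idx)   # error on familiar wind speeds
mae_out = mean_abs_error(ood_idx)   # error on unseen low and high wind speeds
print(f"in-domain MAE: {mae_in:.2f}  out-of-domain MAE: {mae_out:.2f}")
```

A conventional random split would only ever report something like the in-domain figure. Holding out a whole region of the operating range is what makes the gap between the two numbers visible.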

In testing the proposed method, the group found that splitting the dataset as described successfully exposed models’ performance drop when moving from in-domain to out-of-domain dataset splits. Predictive uncertainty was also found to correlate well with such drops, making it possible to assess the model’s performance after deployment without access to the true target values.
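The link between predictive uncertainty and error can be sketched in a similar way. The example below uses the spread of a bootstrap ensemble as the uncertainty estimate, which is one common technique and not necessarily the one used in the paper. The data, the wind-speed band and the helper `bootstrap_ensemble_predict` are hypothetical.

```python
import numpy as np

def bootstrap_ensemble_predict(x_train, y_train, x_eval, n_members=30, degree=1, rng=None):
    """Fit polynomial models on bootstrap resamples of the training data.

    Returns the ensemble mean prediction and the standard deviation across
    members, the latter serving as a simple predictive-uncertainty estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    preds = np.empty((n_members, x_eval.size))
    for m in range(n_members):
        # Resample the training set with replacement
        idx = rng.integers(0, x_train.size, size=x_train.size)
        coeffs = np.polyfit(x_train[idx], y_train[idx], deg=degree)
        preds[m] = np.polyval(coeffs, x_eval)
    return preds.mean(axis=0), preds.std(axis=0)

# Synthetic stand-in data (hypothetical, not taken from the paper)
rng = np.random.default_rng(1)
wind = rng.uniform(0.0, 30.0, size=2000)
fuel = 0.05 * wind**2 + rng.normal(0.0, 0.5, size=wind.size)

# Train on most of a band of common wind speeds; evaluate on the rest of
# that band plus all wind speeds outside it.
in_band = (wind >= 8.0) & (wind <= 22.0)
held_out = rng.random(wind.size) < 0.25
train_mask = in_band & ~held_out
eval_mask = ~train_mask

mean_pred, uncertainty = bootstrap_ensemble_predict(
    wind[train_mask], fuel[train_mask], wind[eval_mask], rng=rng
)
abs_error = np.abs(mean_pred - fuel[eval_mask])

# Pearson correlation between the uncertainty estimate and the actual error
corr = float(np.corrcoef(uncertainty, abs_error)[0, 1])
print(f"correlation between uncertainty and absolute error: {corr:.2f}")
```

Computing the uncertainty needs only the model inputs, not the measured fuel values. Where the two quantities correlate, a rise in uncertainty after deployment can serve as a warning that accuracy may be dropping.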

The full research paper is available to download here.


About the Author

Rob O'Dwyer

Rob is Chief Network Officer and one of the founders of Smart Maritime Network. He also serves as Chairman of the Smart Maritime Council. Rob has worked in the maritime technology sector since 2005, managing editorial for a range of leading publications in the transport and logistics sector.
