R&D

Whilst no two projects we work on are alike, we often come up against the same problems in our applied Research and Development work:

  • How to optimally decompose a business problem into AI-solvable sub-problems to reduce development spend and total cost of ownership (TCO)
  • How to compensate for incomplete and poor-quality data, sanity-check the results and neutralise spurious correlations
  • How to maximise the value of unlabelled data, especially when labelling is complex, expensive and time-consuming
  • How to achieve high generalisation performance with only a small labelled dataset
  • How to ensure a high degree of Adversarial Robustness in AI models and increase the reliability and cybersecurity of the data analysis system
  • How to ensure computational efficiency and generalisation capability in data stream mining
  • How to strike an optimal balance between exploration and convergence in search and optimisation algorithms
  • How to enable a learning algorithm to self-organise in the face of the complexity it encounters
  • How to maximise the productivity and stability of a modular multi-task learning system with heterogeneous inputs

Our Approach

To respond to these practical challenges, we use both original algorithms of our own design and some that we have optimised. Regardless of the source, however, they are all based on the same core ideas and concepts:

  • Deep analysis of the task at hand and research to determine the optimal, state-of-the-art solution
  • Adaptation of micro- and macro-architectures of models and protocols to the specifics of the task
  • Synergistic, compensatory approach – applying diversity, modularity and hybridisation principles to compensate for the shortcomings of individual AI models and methods
  • Optimisation of the precision-recall trade-off, taking into account the financial implications of false negatives and false positives (a minimal sketch follows this list)
  • Enriching the datasets with alternative data, finding and integrating external informative data sources with a meaningful causal relationship to the problem
  • Transfer Learning – transfer and adaptation of knowledge from models trained on large volumes of openly available data to solve new problems
  • Meta-learning – algorithms which are “learning to learn” over multiple tasks to become more efficient in acquiring new knowledge
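
To make the cost-aware optimisation of the precision-recall trade-off concrete, below is a minimal sketch that scans candidate decision thresholds and keeps the one with the lowest expected cost. The per-error costs and the function name are illustrative placeholders, not figures from a real engagement.

    import numpy as np

    # Hypothetical per-error costs; in practice these come from the business case.
    COST_FALSE_NEGATIVE = 50.0
    COST_FALSE_POSITIVE = 5.0

    def pick_threshold(y_true, y_score):
        """Scan candidate thresholds and return the one with the lowest expected cost."""
        best_threshold, best_cost = 0.5, float("inf")
        for threshold in np.unique(y_score):
            y_pred = (y_score >= threshold).astype(int)
            false_negatives = np.sum((y_true == 1) & (y_pred == 0))
            false_positives = np.sum((y_true == 0) & (y_pred == 1))
            cost = (COST_FALSE_NEGATIVE * false_negatives
                    + COST_FALSE_POSITIVE * false_positives)
            if cost < best_cost:
                best_threshold, best_cost = threshold, cost
        return best_threshold, best_cost

With false negatives assumed ten times more costly than false positives, the selected threshold will naturally lean towards recall over precision.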

 

Continual Learning

No model stays perfect in the wild forever – the world changes constantly, and our Machine Learning models and algorithms are built to adapt to change and stay robust in new environments. The key concept underpinning our efforts to make our models adaptable is Continual Learning.

The Continual Learning approach allows us to create a constant self-learning loop, letting our models become smarter over time. To achieve this, we make use of several techniques, such as:

  • Managing training regimes to ensure stability and prevent catastrophic forgetting
  • Adaptive regularisation and overparameterisation (a simplified regularisation sketch follows this list)
  • Meta-Learning applied to finding “model agnostic” solutions
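
As a simplified illustration of regularisation against catastrophic forgetting, the sketch below augments the loss on a new task with a penalty that anchors the weights to those learned on the previous task (in the spirit of EWC-style methods, here with a plain L2 anchor rather than Fisher-weighted terms). The function name and regularisation strength are illustrative assumptions, not a fixed recipe.

    import torch

    def anchored_loss(model, task_loss, old_params, reg_strength=1e-2):
        """Add an L2 penalty that discourages drift away from previously learned weights."""
        penalty = 0.0
        for name, param in model.named_parameters():
            if name in old_params:
                penalty = penalty + ((param - old_params[name]) ** 2).sum()
        return task_loss + reg_strength * penalty

    # Sketch of use inside a training loop:
    # old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
    # loss = anchored_loss(model, criterion(model(x), y), old_params)
    # loss.backward(); optimizer.step()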

The ongoing development and scaling-up of AI-based products, together with the continual appearance of new AI algorithms and of hardware for their deployment, necessitate an integrated approach unifying ML system development (Dev) and ML system operations (Ops).

In our work we actively apply DevOps principles to ML-based systems (MLOps) as follows:

  • Integrated cloud services for version control, collective access and effective management of both data and AI models
  • DevOps techniques for automation and monitoring across the full ML system lifecycle, including integration, testing, releasing, deployment and infrastructure management
  • Retraining and knowledge distillation for knowledge transfer between old and new models (a short sketch follows this list)
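
As an example of the distillation step mentioned above, the sketch below blends hard-label cross-entropy with a soft-target term so that a new (student) model inherits behaviour from an old (teacher) model during retraining. The temperature and mixing weight are illustrative defaults, not tuned values.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        """Combine cross-entropy on true labels with KL divergence to the teacher's softened outputs."""
        hard_term = F.cross_entropy(student_logits, labels)
        soft_term = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * (temperature ** 2)
        return alpha * hard_term + (1.0 - alpha) * soft_term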

 

Label Tyranny

As new data comes in for incremental retraining, it needs to be labelled. To reduce the effort required to produce a labelled dataset, we utilise a number of approaches:

  • Active learning, which expedites the acquisition of new labelled data by ranking unlabelled samples by their expected value (a minimal sketch follows this list)
  • Self-taught learning, where the model’s own prediction is used as a label for further training when its uncertainty is below a threshold
  • Weakly-supervised learning, i.e. supervised learning with “noisy” labels
  • Classic methods and Generative Adversarial learning for data augmentation
  • Self-supervised model pre-training on unlabelled data, where the model derives supervisory signals from the data itself
  • Synthetic data, obtained by simulating the process or object in focus
  • Feature extractors based on Siamese networks with contrastive or triplet loss; these require fewer samples for effective training and can work with highly imbalanced data, since combining samples into pairs or triplets enlarges the effective training set
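
To illustrate the active-learning step referenced above, the sketch below ranks an unlabelled pool by predictive entropy and surfaces the most uncertain samples for human labelling. It assumes a classifier exposing a scikit-learn-style predict_proba; the batch size is an arbitrary placeholder.

    import numpy as np

    def select_for_labelling(model, X_unlabelled, batch_size=100):
        """Return indices of the unlabelled samples the model is least certain about."""
        probabilities = model.predict_proba(X_unlabelled)
        entropy = -np.sum(probabilities * np.log(probabilities + 1e-12), axis=1)
        return np.argsort(entropy)[::-1][:batch_size]  # most uncertain first

Entropy is only one possible acquisition score; margin-based or ensemble-disagreement scores drop into the same loop.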