MoleHD: Accelerating Molecule Discovery with Hyperdimensional Computing (HDC)

Jul 9


Introduction

Drug discovery is a multifaceted process that draws on biology, chemistry, and pharmacology to identify effective and safe medications. Traditionally, it involves a laborious, expensive, and time-consuming screening phase in which candidates are manually selected from extensive chemical databases such as ChEMBL and OpenChem to build smaller, more focused in-house databases for further synthesis.


Challenges in Drug Discovery

The traditional discovery process commonly runs into the following challenges:

  • High Costs and Long Timelines: The drug discovery process is extremely expensive and time-consuming, often taking over a decade and billions of dollars to bring a new drug to market.

  • High Failure Rates: The majority of drug candidates fail during clinical trials due to safety concerns, lack of efficacy, or unforeseen side effects, leading to significant financial losses.

  • Regulatory Hurdles: Stringent regulatory requirements and lengthy approval processes can delay the introduction of new drugs, complicating the journey from laboratory to market.

  • Limited Innovation: Novel therapeutic targets are often lacking, and many new drugs are incremental improvements over existing compounds rather than groundbreaking innovations, which limits progress in treating complex diseases.

  • Complexity of Diseases: Many diseases, especially chronic and multifactorial ones like cancer and Alzheimer's, present significant efficacy challenges due to their complex biology, making it difficult to develop effective treatments.


Current Tech Brings Limited Improvement

In recent years, machine learning (ML) and artificial intelligence (AI) algorithms such as random forest, support vector machines, k-nearest neighbors, and gradient boosting have been explored to enhance drug discovery efforts. While these models use molecular representations to predict properties, they often fall short because of their limited ability to capture the complex structural nuances of molecules. Consequently, deep learning models, particularly Graph Neural Networks (GNNs), have gained popularity thanks to their superior performance in learning detailed molecular features.

However, GNNs still require significant pre-processing and computational resources, limiting their efficiency and accessibility.


Introducing MoleHD: A Paradigm Shift in Molecule Discovery

Zscale Labs™ is pleased to introduce MoleHD, an innovative, ultra-low-cost model based on hyperdimensional computing (HDC) that significantly reduces computational demands, pre-processing efforts, and overall development time, enabling faster rollout, quicker return on investment (ROI), and shorter time-to-market (TTM).

Figure 1: Overview of MoleHD. MoleHD has 5 major steps: Tokenization, Encoding, Training, Retraining and Inference.
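
To make the five steps more concrete, here is a minimal, hedged sketch of how a pipeline with the same stage names could look in plain Python. It is not the MoleHD implementation: the character-level tokenizer, the rotation-based position encoding, the `ToyHDClassifier` class, the dimension of 2,000, and the toy labels are all assumptions chosen for illustration.

```python
import math
import random

DIM = 2_000  # assumed hypervector size for this toy example


def tokenize(smiles):
    """Step 1, Tokenization: split a SMILES string into characters (a simplification)."""
    return list(smiles)


def cosine(a, b):
    """Cosine similarity between two integer vectors (0.0 if either is all zeros)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class ToyHDClassifier:  # hypothetical name, not part of any library
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.item_memory = {}  # token -> random bipolar hypervector
        self.class_hvs = {}    # label -> accumulated class hypervector

    def _token_hv(self, token):
        if token not in self.item_memory:
            self.item_memory[token] = [self.rng.choice((-1, 1)) for _ in range(DIM)]
        return self.item_memory[token]

    def encode(self, smiles):
        """Step 2, Encoding: rotate each token vector by its position, then sum."""
        total = [0] * DIM
        for pos, tok in enumerate(tokenize(smiles)):
            hv = self._token_hv(tok)
            shift = pos % DIM
            rotated = hv[-shift:] + hv[:-shift] if shift else hv
            total = [t + r for t, r in zip(total, rotated)]
        return total

    def train(self, data):
        """Step 3, Training: add each encoded sample into its class hypervector."""
        for smiles, label in data:
            acc = self.class_hvs.setdefault(label, [0] * DIM)
            for i, v in enumerate(self.encode(smiles)):
                acc[i] += v

    def predict(self, smiles):
        """Step 5, Inference: return the label whose class vector is most similar."""
        query = self.encode(smiles)
        return max(self.class_hvs, key=lambda lbl: cosine(query, self.class_hvs[lbl]))

    def retrain(self, data, epochs=5):
        """Step 4, Retraining: on a mistake, reinforce the right class and weaken the wrong one."""
        for _ in range(epochs):
            for smiles, label in data:
                guess = self.predict(smiles)
                if guess != label:
                    for i, v in enumerate(self.encode(smiles)):
                        self.class_hvs[label][i] += v
                        self.class_hvs[guess][i] -= v


# Toy data: the 0/1 labels are arbitrary placeholders, not real measurements.
toy_data = [("CCO", 0), ("CCCO", 0), ("c1ccccc1", 1), ("c1ccccc1C", 1)]
clf = ToyHDClassifier()
clf.train(toy_data)
clf.retrain(toy_data)
print(clf.predict("CCO"), clf.predict("c1ccccc1"))
```

The point of the sketch is that training is a single pass of additions and inference is a handful of similarity comparisons, which is where the low computational cost of HDC-style models comes from.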


What is Hyperdimensional Computing (HDC)?


HDC is inspired by brain-like (neural) attributes such as high dimensionality and distributed holographic representation, which allow it to generate, manipulate, and compare symbols represented by high-dimensional vectors. Compared to deep neural networks (DNNs), HDC offers several advantages, including smaller model sizes, reduced computational costs, and the capability for one-shot or few-shot learning.
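
As a rough illustration of what "generate, manipulate, and compare" can mean in practice, the sketch below uses three operations that are common in the HDC literature: random generation, binding (element-wise multiplication), and bundling (element-wise majority vote). The bipolar {-1, +1} encoding, the dimension of 10,000, and all function names are assumptions made for this example and are not taken from MoleHD.

```python
import random

DIM = 10_000  # assumed dimensionality; HDC typically uses thousands of dimensions


def random_hv(rng):
    """Generate: draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Manipulate: bind two hypervectors by element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Manipulate: bundle hypervectors by element-wise majority vote (ties go to +1)."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


def similarity(a, b):
    """Compare: normalized dot product, which equals cosine similarity for bipolar vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
a, b, c = random_hv(rng), random_hv(rng), random_hv(rng)

bundled = bundle([a, b, c])  # stays similar to each of its inputs
bound = bind(a, b)           # looks unrelated to both of its inputs

print("a vs b:       ", similarity(a, b))
print("bundled vs a: ", similarity(bundled, a))
print("bound vs a:   ", similarity(bound, a))
print("unbind works: ", bind(bound, b) == a)
```

Two independently drawn hypervectors are nearly orthogonal at this dimensionality, so their similarity is close to zero. A bundle remains measurably similar to each vector that went into it, while a bound pair is dissimilar to both inputs yet can be unbound exactly by multiplying with one of them again.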

HDC differs from traditional AI in several key ways:


Representation of Data: