
Machine Learning and Image Processing
Machine learning and image processing are closely related fields that focus on enabling computers to interpret, analyze, and make decisions based on visual data. Image processing involves techniques for enhancing, transforming, and extracting information from images, such as filtering, edge detection, and segmentation. Machine learning, particularly through models like neural networks and deep learning, builds on these techniques by allowing systems to automatically learn patterns and features from large datasets of images. Together, they power applications such as facial recognition, medical imaging analysis, autonomous vehicles, and object detection, where machines not only process visual information but also interpret and act on it intelligently.
Featured Projects
At the Repository of Image Databases you can find image databases with augmented and distilled image features and software to create your own augmented or distilled image databases.
Featured Publications
Deep learning models require large training datasets. Incorporating additional data into small training datasets can enhance the model’s performance. However, acquiring additional data may sometimes be challenging or beyond one’s control. In such situations, data augmentation becomes essential to overcome the limited supply of labeled data by generating new data that preserves the essential properties of the original dataset. The primary objective of our research is to develop an iterative data distillation and augmentation (IDDA) method that enlarges a limited image training dataset while preserving its properties. At every iteration, our method distills a set of images from the training set of the previous iteration using the kernel inducing point (KIP) method, and the union of the training and distilled sets forms the new training set. However, our experiments show that IDDA is computationally expensive, increasing processing time by approximately 17%–27% for MNIST and Fashion-MNIST, 31%–39% for CIFAR-10, and 48%–49% for CIFAR-100 compared to state-of-the-art augmentation methods, due to the additional step of applying KIP for image distillation. We have experimentally determined that the classification accuracy increases for a few iterations and then drops. We validate the IDDA capabilities by comparing it with conventional augmentation methods and MixUp on the following publicly available image datasets: MNIST digit, Fashion-MNIST, CIFAR-10, and CIFAR-100. Our approach proves highly effective for very limited datasets, addressing the challenge of database expansion for improved performance of deep learning models.
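The iteration described above (distill from the current training set, then take the union) can be sketched as follows. This is a minimal illustration, not the published implementation: the function names `idda` and `distill_class_means` are hypothetical, and the bootstrap class-mean distiller is only a stand-in for KIP, which the abstract names but does not specify.

```python
import numpy as np

def distill_class_means(x, y, per_class=1, rng=None):
    """Hypothetical stand-in for KIP: one bootstrap mean per class and per draw."""
    rng = np.random.default_rng(0) if rng is None else rng
    xs, ys = [], []
    for label in np.unique(y):
        members = x[y == label]
        for _ in range(per_class):
            # Resample the class with replacement and average the draw.
            idx = rng.integers(0, len(members), size=len(members))
            xs.append(members[idx].mean(axis=0))
            ys.append(label)
    return np.stack(xs), np.array(ys)

def idda(x, y, iterations, distill=distill_class_means):
    """Each iteration distills from the current set and unions the result into it."""
    for _ in range(iterations):
        x_d, y_d = distill(x, y)
        x = np.concatenate([x, x_d])
        y = np.concatenate([y, y_d])
    return x, y

# Usage: 6 flattened "images" in 2 classes, 3 iterations, 1 distilled image per class.
x0 = np.arange(24, dtype=float).reshape(6, 4)
y0 = np.array([0, 0, 0, 1, 1, 1])
x_out, y_out = idda(x0, y0, iterations=3)
```

Because the abstract reports that accuracy rises for a few iterations and then drops, a practical loop would presumably monitor validation accuracy and stop early; that stopping rule is not shown here.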
Data distillation is an emerging research area, attracting the attention of machine learning (ML) and big data scientists and experts. The main goal of a distillation approach is to generate a compact dataset that preserves the essential characteristics of a larger one. In our study, we considered an initial large set of images and developed a novel method for distilling images from the initial set. The method combines the discrete wavelet transform (DWT) and modified principal component analysis (M-PCA). Our method first transforms images into vectors of low-band (LL) wavelet coefficients and then applies M-PCA to modify and reduce the number of vectors rather than their dimensionality. This distinguishes our approach from the traditional PCA method. We implemented the new method in Python 3.10 and validated it on public image databases, including Extended YaleB, digit-MNIST, and ISIC2020. We demonstrated that creating a dictionary from a small set of distilled images and training a sparse representation wavelet-based classifier (SRWC) on it provides higher accuracy than training the SRWC method with the entire initial training set of images.
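The first step of this pipeline, turning each image into a vector of LL wavelet coefficients, can be illustrated with a one-level Haar transform. This is a sketch under stated assumptions: the abstract does not name the wavelet or the decomposition level, so the Haar choice and the helper names `haar_ll` and `images_to_ll_vectors` are illustrative only, and the M-PCA step is not reproduced.

```python
import numpy as np

def haar_ll(image):
    """One-level orthonormal Haar LL sub-band of a 2-D grayscale image."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # drop a trailing odd row/column
    # Low-pass in both directions: sum of each 2x2 block divided by 2.
    return (img[0::2, 0::2] + img[0::2, 1::2]
            + img[1::2, 0::2] + img[1::2, 1::2]) / 2.0

def images_to_ll_vectors(images):
    """Stack the flattened LL sub-bands, one row per image."""
    return np.stack([haar_ll(im).ravel() for im in images])

# Usage: two 4x4 images become two vectors of 4 LL coefficients each.
vectors = images_to_ll_vectors([np.ones((4, 4)), np.zeros((4, 4))])
```

Each LL vector has a quarter of the original pixel count; the reduction in the number of vectors, which the abstract attributes to M-PCA, would operate on the rows of `vectors`.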
Classical methods, including sparse representation classification (SRC) and neural networks (NNs), classify image object(s) using features like intensity, color, texture, and geometry extracted from the image. For the purpose of classification the present study proposes to augment the set of image features with vector field (VF) features, like singular points (SPs) and trajectories. To generate VFs with such features, our approach solves the Poisson Image (PI) equation. Using its solution \( \hat u (x,y) \), our approach defines two functions \( \hat \Phi (x,y) \) and \( \hat \Psi (x,y) \) and develops two Poisson gradient VFs (PGVFs) \( \nabla \hat \Phi (x,y) \) and \( \nabla \hat \Psi (x,y) \), used to embed features into a database of images. The embedding of VF features into image objects constitutes the main novelty of this study. The advantage that comes from the novelty is increased statistics of machine learning (ML) classifiers. To validate the advantage, the PGVFs \( \nabla \hat \Phi (x,y) \) and \( \nabla \hat \Psi (x,y) \) were embedded into the public image databases COIL100, YaleB, ISIC2018, and ISIC2020. The original and the eight VF covered databases were classified with two ML classifiers: sparse representation wavelet classification (SRWC) and SRC in the quaternion wavelets (SRCQW) domain. The results obtained are presented in the paper and confirm the claimed advantage.
The main objective of this paper is to present a repository of image databases whose features are augmented with embedded vector field (VF) features. The repository is designed to provide the user with image databases that enhance machine learning (ML) classification. Also, six VFs are provided, and the user can embed them into her/his own image database with the help of software named ELPAC. Three of the VFs generate real-shaped singular points (SPs): springing, sinking, and saddle. The other three VFs generate seven kinds of SPs, which include the real-shaped SPs and four complex-shaped SPs: repelling and attracting (out and in) spirals and clockwise and counterclockwise orbits (centers). Using the repository, this work defines the locations of the SPs according to the image objects and the mappings between the SPs' shapes if separate VFs are embedded into the same image. Next, this paper produces recommendations for the user on how to select the most appropriate VF to be embedded in an image database so that the augmented SP shapes enhance ML classification. Examples of images with embedded VFs are shown in the text to illustrate, support, and validate the theoretical conclusions. Thus, the contributions of this paper are the derivation of the SP locations in an image; mappings between the SPs of different VFs; and the definition of an imprint of an image and an image database in a VF. The advantage of classifying an image database with an embedded VF is that the new database enhances and improves the ML classification statistics, which motivates the design of the repository so that it contains image features augmented with VF features.
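The seven SP kinds listed above correspond to the standard classification of a planar vector field's critical point by the eigenvalues of its 2×2 Jacobian. The sketch below uses that textbook linearization; it is an assumption that the paper's names map onto these standard types in this way, and `classify_sp` is a hypothetical helper, not part of ELPAC.

```python
import numpy as np

def classify_sp(jacobian, tol=1e-9):
    """Name the singular point of v(x) ~ J x from the 2x2 Jacobian J."""
    j = np.asarray(jacobian, dtype=float)
    trace = j[0, 0] + j[1, 1]
    det = j[0, 0] * j[1, 1] - j[0, 1] * j[1, 0]
    disc = trace * trace - 4.0 * det  # sign decides real vs complex eigenvalues
    if abs(det) <= tol:
        return "degenerate"
    if disc >= 0.0:  # real eigenvalues
        if det < 0.0:
            return "saddle"
        return "springing" if trace > 0.0 else "sinking"
    # Complex eigenvalues: the real part is trace / 2.
    if trace > tol:
        return "repelling spiral"
    if trace < -tol:
        return "attracting spiral"
    # Pure rotation: the sign of J[1, 0] gives the orientation.
    return "counterclockwise center" if j[1, 0] > 0.0 else "clockwise center"

# Usage: v = (-y, x) rotates counterclockwise around the origin.
kind = classify_sp([[0.0, -1.0], [1.0, 0.0]])
```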
The objects' features play a significant role in machine learning classification. The present paper proves and validates that the shapes of vector field (VF) singular points (SPs) embedded into image objects may improve classification accuracy. For this purpose the present paper develops two VFs \( v_{\hat u} \) and \( v_{\hat \Phi} \) with real and complex SPs. The VFs are developed on the solution \( \hat u (x,y) \) of a particular form of the Poisson equation. Further, we define the mappings between the SPs of \( \nabla \hat u (x,y) \) and \( v_{\hat u} \) and \( v_{\hat \Phi} \). Next, we develop the local Polya's model of a VF and prove that the shapes of the SPs are invariant under scaling, translation and weak rotations. This property implies that embedding the shapes of the SPs into the image objects augments the set of object features, which leads to the advantage of increasing the classification statistics. We validate the invariance and the advantage with sets of experiments classifying the public image datasets ISIC2020 and COIL100. For the purpose of classification, we designed a new convolutional neural network optimized to classify SP shapes and image object features. The paper ends with conclusions on the contributions, advantages and bottlenecks of this study.
This work develops a new method for vector data augmentation. The proposed method applies principal component analysis (PCA), determines the eigenvectors of a set of training vectors for a machine learning (ML) method and uses them to generate the distilled vectors. The training and PCA-distilled vectors have the same dimension. The user chooses the number of vectors to be distilled and augmented to the set of training vectors. A statistical approach determines the lowest number of vectors to be distilled such that when augmented to the original vectors, the extended set trains an ML classifier to achieve a required accuracy. Hence, the novelty of this study is the distillation of vectors with the PCA method and their use to augment the original set of vectors. The advantage that comes from the novelty is that it increases the statistics of ML classifiers. To validate the advantage, we conducted experiments with four public databases and applied four classifiers: a neural network, logistic regression and support vector machine with linear and polynomial kernels. For the purpose of augmentation, we conducted several distillations, including nested distillation (double distillation). The latter notion means that new vectors were distilled from already distilled vectors. We trained the classifiers with three sets of vectors: the original vectors, original vectors augmented with vectors distilled by PCA and original vectors augmented with distilled PCA vectors and double distilled by PCA vectors. The experimental results are presented in the paper, and they confirm the advantage of the PCA-distilled vectors increasing the classification statistics of ML methods if the distilled vectors augment the original training vectors.
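One way to read "uses the eigenvectors to generate the distilled vectors" is sketched below. The abstract does not give the generation rule, so the rule used here (the mean plus each leading principal axis scaled by its standard deviation) is purely an assumption for illustration, as is the name `pca_distill`. In a classification setting it would presumably be applied per class so that each distilled vector inherits a label.

```python
import numpy as np

def pca_distill(vectors, k):
    """Return k vectors of the same dimension as the inputs (assumed rule)."""
    x = np.asarray(vectors, dtype=float)
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)        # (d, d) covariance, d >= 2
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    # Assumption: distilled vector = mean + std-dev * principal axis.
    return mean + scales[:, None] * eigvecs[:, order].T

# Usage: distill 3 vectors from 20 training vectors and augment the set.
rng = np.random.default_rng(0)
train = rng.normal(size=(20, 5))
distilled = pca_distill(train, k=3)
augmented = np.vstack([train, distilled])
# Nested (double) distillation: distill again from the distilled vectors.
double = pca_distill(distilled, k=2)
```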
Classification methods use image object features to distinguish between objects and assign them to classes. In the present study we develop a convolutional neural network (CNN) optimized to classify images with embedded vector fields (VFs), generated on the solution \(\hat u(x,y)\) of the Poisson equation, which contains the image function on its right-hand side. The embedded VF features that our CNN extracts are trajectories and singular points (SPs), which augment the image object features. The aim of this paper is to validate that the set of augmented image features increases the separability of the image objects and improves the classification statistics. To reach this aim, we implement our CNN along with four contemporary CNNs to classify two public image databases, COIL100 and ISIC2020, as well as their derivatives with embedded VFs. The obtained results are presented in the paper and confirm that embedding VFs with real and complex SPs increases the classification statistics.
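A vector field of this kind can be illustrated by solving a discrete Poisson equation with the image as the right-hand side and taking the gradient of the solution. The sketch assumes the form \( \Delta u = f \) with zero Dirichlet boundary values and unit grid spacing, solved by plain Jacobi iteration; the paper's exact equation, sign convention, boundary conditions, and solver are not stated in the abstract, and the names `solve_poisson` and `gradient_field` are hypothetical.

```python
import numpy as np

def solve_poisson(f, iterations=2000):
    """Jacobi iteration for Laplacian(u) = f with u = 0 on the image border."""
    f = np.asarray(f, dtype=float)
    u = np.zeros_like(f)
    for _ in range(iterations):
        new = np.zeros_like(u)  # border stays at the Dirichlet value 0
        new[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]
                                  + u[1:-1, 2:] + u[1:-1, :-2]
                                  - f[1:-1, 1:-1])
        u = new
    return u

def gradient_field(u):
    """Return (du/dx, du/dy); axis 0 of the array is y, axis 1 is x."""
    du_dy, du_dx = np.gradient(u)
    return du_dx, du_dy

# Usage: a 9x9 "image" with a single bright pixel in the centre.
image = np.zeros((9, 9))
image[4, 4] = 1.0
u_hat = solve_poisson(image)
vx, vy = gradient_field(u_hat)
```

Jacobi iteration is chosen only for brevity; on full-size images a multigrid or FFT-based solver would converge far faster.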
Classical methods, including sparse representation classification (SRC) and neural networks (NNs), classify image object(s) using features like intensity, color, texture, and geometry extracted from the image. For the purpose of classification this study proposes to augment the set of image features with vector field (VF) features, like singular points (SPs) and trajectories. To generate VFs with such features, our approach solves the Poisson Image (PI) equation. Using its solution \(\hat u(x,y)\), this approach defines two functions \(\hat \phi(x,y)\) and \(\hat \psi(x,y)\) and develops two gradient VFs (GVFs) \(\nabla \hat \phi(x,y)\) and \(\nabla \hat \psi(x,y)\), used to embed features into a database of images. The embedding of VF features into image objects constitutes the main novelty of this study. The advantage that comes from the novelty is increased statistics for machine learning (ML) classifiers. To validate the advantage, the VFs \(\nabla \hat \phi(x,y)\) and \(\nabla \hat \psi(x,y)\) were embedded into the public image databases COIL100, YaleB, ISIC2018, and ISIC2020. The original and the eight VF covered databases were classified with two ML classifiers: sparse representation wavelet classification (SRWC) and SRC in the quaternion wavelets (SRCQW) domain. The results obtained are presented in the paper and confirm the claimed advantage.
Prediction of changes in biomedical signals, such as vital signs, is useful for many clinical applications. Several signal prediction (forecasting) tools were developed, but their evaluation and applicability to a specific clinical use is context dependent. In this work, we propose a novel method for evaluation and comparison of vital sign predictors for intervention based clinical studies. Specifically, we study and compare nine deep learning and statistics based predictive models for multi-step prediction of bradycardia events in preterm infants, but the proposed method could be applied to other biomedical signals. Our results on testing sets with several days of vital sign recordings show that simple statistical predictors could outperform state-of-the-art deep learning architectures for low-dimensional signals.
Precise tracking of a point-target on a nonlinear trajectory is challenging and has applications ranging from traffic analysis to microscopic particle tracking. To solve such a problem, we developed an algorithm which is independent of statistical-probabilistic and mechanical modeling, and free of analytical extrapolation methods. Our main objective was to predict the target's future location from its previous locations using a deep neural network trained on a large data set of linear and nonlinear trajectories. To design our data-driven prediction approach, we developed a freely available database of up to second-order algebraic curves uniformly distributed in a given domain. This database could be used for training and testing point-target tracking algorithms. Simulated noisy test sets of trajectories were produced using Gaussian noise for analyzing the forecasting performance and noise sensitivity of our model. Further, the newly designed long short-term memory-based network, which uses polar coordinates for its training, is capable of predicting the target's future locations on real-world smooth trajectories. We compared the proposed predictor network to classical and state-of-the-art predictors based on average absolute and relative errors. The experimental results demonstrated that our novel predictor achieved up to 47% improvement on test data sets. The observed area under the noise response curve improved by up to 11%.
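The data preparation implied above, expressing trajectory points in polar coordinates and pairing a history of locations with the next location, might look like the following. This is a hedged sketch: the abstract does not say whether positions or displacements are converted, nor what history length is used, so the choices and the helper names (`to_polar`, `to_cartesian`, `sliding_windows`) are assumptions. The network itself is not shown.

```python
import numpy as np

def to_polar(points):
    """Convert (x, y) rows to (r, theta) rows."""
    p = np.asarray(points, dtype=float)
    r = np.hypot(p[:, 0], p[:, 1])
    theta = np.arctan2(p[:, 1], p[:, 0])
    return np.column_stack([r, theta])

def to_cartesian(polar):
    """Convert (r, theta) rows back to (x, y) rows."""
    q = np.asarray(polar, dtype=float)
    return np.column_stack([q[:, 0] * np.cos(q[:, 1]),
                            q[:, 0] * np.sin(q[:, 1])])

def sliding_windows(seq, history):
    """Pair each window of `history` points with the point that follows it."""
    seq = np.asarray(seq)
    inputs = np.stack([seq[i:i + history] for i in range(len(seq) - history)])
    targets = seq[history:]
    return inputs, targets

# Usage: a second-order curve (parabola) sampled at 10 points, history of 4.
t = np.linspace(1.0, 2.0, 10)
trajectory = np.column_stack([t, t ** 2])
polar = to_polar(trajectory)
inputs, targets = sliding_windows(polar, history=4)
```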
In this study, we propose a novel sparse representation learning method in the Quaternion Wavelet (QW) domain for multi-class image classification. The proposed method takes advantage of: i) the QW decomposition, which promotes sparsity and provides structural information about the image data while allowing approximate shift-invariance, to extract meaningful features from low-frequency QW sub-bands, ii) dimensionality reduction using Principal Component Analysis (PCA) to reduce the complexity of the problem, and iii) the sparse representation of the generated QW features to efficiently learn and capture the meaningful and compact information in the data. After the QW decomposition, the features extracted from the low-frequency image sub-bands are projected, by the PCA, into a new feature space with lower dimensionality. The features extracted from the training samples are used to construct a dictionary, while the features of the test samples are sparsely coded for the classification step. The sparse coding problem is formulated as a QW Least Absolute Shrinkage and Selection Operator (QWLasso) model applying quaternion \( l_1 \) minimization. A novel Quaternion Fast Iterative Shrinkage-Thresholding Algorithm (QFISTA) is developed to solve the QWLasso model. The experiments conducted on various public image datasets validated that the proposed method achieves higher accuracy, sparsity, and robustness than several contemporary methods in the field, including neural networks.
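For orientation, the real-valued counterpart of this sparse coding step, a Lasso problem \( \min_x \tfrac12 \|Dx - y\|_2^2 + \lambda \|x\|_1 \) solved by FISTA, can be sketched as below. This is the standard real-valued algorithm, not the quaternion QFISTA developed in the paper; the dictionary `D`, the weight `lam`, and the function names are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(D, y, lam, iterations=200):
    """Real-valued FISTA for 0.5 * ||D x - y||^2 + lam * ||x||_1."""
    lipschitz = np.linalg.norm(D, 2) ** 2  # squared spectral norm of D
    x = np.zeros(D.shape[1])
    z = x.copy()
    t = 1.0
    for _ in range(iterations):
        grad = D.T @ (D @ z - y)                       # gradient of the smooth term
        x_new = soft_threshold(z - grad / lipschitz, lam / lipschitz)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x

# Usage: with an identity dictionary the solution is the soft-thresholded signal.
D = np.eye(3)
y = np.array([3.0, 0.5, -2.0])
code = fista_lasso(D, y, lam=1.0)
```

In a sparse representation classifier, the columns of `D` would hold training features, and a test sample would typically be assigned to the class whose columns give the smallest reconstruction residual.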
Automated melanoma classification remains a challenging task because skin lesion images are prone to low contrast and many kinds of artifacts. To handle these challenges, we introduce a novel and efficient method for skin lesion classification based on the machine learning approach and sparse representation (SR) in the quaternion wavelet (QW) domain. Further, we investigate the application of the SR approach with low, high, and mixed wavelet frequencies. Using QW coefficients, the classification problem is mapped onto the algebra of quaternions. Using the public skin lesion image datasets ISIC2017 and ISIC2019, we experimentally validated that creating a dictionary with quaternions of the low-frequency wavelet sub-band leads to the most accurate classification of skin lesions as melanoma or benign. We compared our approach with contemporary methods including neural networks.
Melanoma is a deadly skin disease. The availability of digital skin lesion datasets eases the exploration of a wide range of classification studies. Both theoretical and heuristic improvements have been achieved thanks to these new datasets. As one of many high-level feature-driven classification methods, support vector machines (SVMs) are widely used in the literature as melanoma classifiers. Almost all of these studies use a limited set of predefined kernels. In this study, we propose a newly developed Clifford kernel for the classification of dermoscopic skin lesions. We develop Clifford-based linear, polynomial, and exponential kernels in the 0-, 2-, and 4-vector subspaces of the Clifford algebra (CA) \( C\ell_{5,0} \). CAs are noncommutative but associative and distributive over addition. We show that the newly developed Clifford kernels can be embedded into SVM classifiers to successfully identify malignant skin lesions in a binary classification setting. The Clifford kernel results are compared with the most commonly used Gaussian and polynomial kernels in real-valued SVM classifiers. The accuracy of all classifiers is assessed with cross-validation using imbalanced and balanced datasets of 112, 162, and 192 lesions. The SVM kernels under comparison are parameterized to scan a wide range of possibilities. We show that Clifford-based polynomial kernels outperform the others on all datasets, balanced and imbalanced, with an average accuracy of 83%. The consistently high accuracies obtained with the Clifford polynomial kernel show that the skin lesion features are logically designed and that the Clifford-based SVM is able to model class separations in the feature space.
This paper develops a new active contour (AC) model capable of segmenting multiple complex objects in the presence of heavy noise. The model segments images in the framework of two types of partial differential equations (PDEs): the Euler–Lagrange and Poisson PDEs. The former is used to build an evolution algorithm, while the Poisson solution gradient vector field (PGVF) directs the evolution toward the boundaries of all image objects. The AC halts on boundaries and PGVF separatrices, splits on the latter, and leaves at least one segment (called a label) on every boundary. Each label tracks its boundary until the corresponding object is enveloped. The advantages of the new method are validated on a number of skin lesion, road, and aircraft images of varying sizes and in the presence of Gaussian noise. The obtained results are compared against results from contemporary and established active contours and neural networks.

