Mammography is a standardized imaging technique crucial for the early detection of breast cancer, primarily aimed at identifying abnormalities, or 'findings', in the breasts that cannot be detected through palpation. This study proposes different models for classifying breast abnormalities, integrating machine learning and deep learning methods to improve classification rates. The proposed methodology involves several key steps: preprocessing of image datasets, training of base classification models, and construction of a meta-classifier. The meta-classifier enhances the performance of the individual classifiers and is benchmarked against various machine learning models. The method is evaluated on the CBIS-DDSM mammography dataset, demonstrating its effectiveness in improving classification accuracy and reliability. The hybrid approach leverages convolutional neural networks (CNNs) such as VGG16, VGG19, and DenseNet121 for feature extraction, followed by machine learning algorithms for final classification. The VGG16 network, combined with machine learning techniques, was aimed at surpassing the results obtained with VGG19. Ensemble methods, particularly Voting and Stacking Classifiers, showed that combining VGG16 and DenseNet121 yielded the highest accuracy of 91.66%. These findings underscore the potential of hybrid models in breast cancer classification, offering significant improvements over single classifiers and providing valuable insights for future research in medical image analysis.
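The stacking stage described above can be sketched with scikit-learn. The synthetic features below are hypothetical stand-ins for CNN embeddings (e.g., VGG16 or DenseNet121 outputs), and the base and meta learners are illustrative choices, not the paper's exact configuration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-ins for extracted CNN features (two classes: benign/malignant)
X, y = make_classification(n_samples=400, n_features=64, n_informative=16,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Base learners operate on the extracted features; a logistic-regression
# meta-classifier combines their predictions.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression())
stack.fit(Xtr, ytr)
print(stack.score(Xte, yte))
```

In the paper's actual pipeline, the feature matrix would come from the penultimate layer of the fine-tuned CNNs rather than from `make_classification`.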
The paper deals with the design of a fast algorithm for computing the hopping discrete cosine transform in equidistant signal windows using a recursive relationship between transform spectra. The discrete cosine transform is widely used in digital signal processing tasks such as image coding, spectral analysis, feature extraction, and filtering. The short-time transform is suitable for adaptive processing and time-frequency analysis of quasi-stationary data. A hopping transform is a transform computed on a fixed-size window that slides over the signal with an integer hop step. The hopping discrete transform can be employed for time-frequency analysis and adaptive processing of quasi-stationary data such as speech, biomedical, radar, and communication signals. The performance of the algorithm with respect to computational cost and execution time is compared with that of conventional sliding and fast algorithms.
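For reference, a naive hopping DCT recomputes the full DCT-II for every window position; the fast algorithm in the paper replaces this per-window matrix product with a recursive update between the spectra of adjacent windows. A minimal sketch of the naive baseline (window and hop sizes are illustrative):

```python
import numpy as np

def hopping_dct(x, window=8, hop=4):
    """Naive hopping DCT-II: recompute the full orthonormal transform
    for each window position (cost O(N^2) per hop)."""
    N = window
    n = np.arange(N)
    k = n.reshape(-1, 1)
    # Orthonormal DCT-II basis matrix
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0, :] *= 1.0 / np.sqrt(2.0)
    starts = range(0, len(x) - N + 1, hop)
    return np.array([C @ x[s:s + N] for s in starts])

x = np.sin(2 * np.pi * 0.1 * np.arange(32))
S = hopping_dct(x, window=8, hop=4)
print(S.shape)  # one spectrum per hop position: (7, 8)
```

Because the basis is orthonormal, each row of `S` preserves the energy of its window, which is a convenient sanity check for any faster recursive variant.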
KEYWORDS: Pathology, Mammography, Feature extraction, Image segmentation, Deep learning, Education and training, Breast, Classification systems, Cancer detection, Breast cancer
In this study, the main goal is to improve the performance of existing computer-aided diagnostic systems by proposing new processing methods. We use the public CBIS-DDSM dataset for training and validation. The dataset consists of screenings with benign and malignant tumors, with all pathologies carefully selected and verified by a radiologist. The dataset also includes ROI masks and pathology bounding boxes, as well as labels corresponding to the diagnosis class of each pathology. To achieve better results on the dataset, we transform the data into a more efficient representation using autoencoders in order to obtain features with low intra-class and high inter-class variance, and apply LDA to the encoded features to classify pathologies. Methods for automated pathology detection are not considered in this article, since it is mainly focused on the classification task itself. The entire pipeline of the system consists of the following steps: first, feature extraction using pathology segmentation; then, division of the data into two clusters; next, feature transformation using linear discriminant analysis to minimize intra-class variance; finally, classification of the pathologies. The results of this study for the classification of pathologies using various deep learning methods are presented and discussed.
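The LDA stage of this pipeline can be sketched with scikit-learn. The Gaussian blobs below are hypothetical stand-ins for autoencoder bottleneck features of the two pathology classes; in the actual system these would come from the trained encoder:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Stand-ins for 16-dimensional encoded features of two classes
# (benign vs. malignant); means are separated, variances equal.
benign = rng.normal(loc=0.0, scale=1.0, size=(100, 16))
malignant = rng.normal(loc=1.5, scale=1.0, size=(100, 16))
X = np.vstack([benign, malignant])
y = np.array([0] * 100 + [1] * 100)

# LDA projects the encoded features onto the direction that minimizes
# intra-class variance relative to inter-class variance, then classifies.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.score(X, y))
```

The better the autoencoder separates the classes in its latent space, the closer this linear step gets to the overall system accuracy.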
Breast cancer is the most common cancer and one of the main causes of death in women. Early diagnosis of breast cancer is essential to ensure a high chance of survival for the affected women. Computer-aided detection (CAD) systems based on convolutional neural networks (CNN) could assist in the classification of abnormalities such as masses and calcifications. In this paper, several convolutional network models for the automatic classification of pathology in mammograms are analyzed. Different preprocessing and tuning techniques, such as data augmentation, hyperparameter tuning, and fine-tuning, are also used to train the models. Finally, these models are validated on various publicly available benchmark datasets.
Breast cancer in women is a worldwide health problem with a high mortality rate. One strategy to reduce breast cancer mortality in women is to implement preventive programs such as mammography screening for early breast cancer diagnosis. In this presentation, a method for automatic detection of breast pathologies using a deep convolutional neural network and a class activation map is proposed. The neural network is pretrained on the regions of interest, with its output layers modified to produce two output classes. The proposed method is compared with different CNN models and applied to classify the public dataset Curated Breast Imaging Subset of DDSM (CBIS-DDSM).
In visual simultaneous localization and mapping (SLAM), odometry estimation and navigation map building are carried out concurrently using only cameras. An important step in the SLAM process is the detection and analysis of keypoints found in the environment. A good correspondence of these points allows us to build an optimal point cloud for maximum localization accuracy of the mobile robot and, therefore, to build a precise map of the environment. In this presentation, we perform an extensive comparison study of the correspondences made by various combinations of detectors/descriptors and contrast the performance of two iterative closest point (ICP) algorithms used in the RGB-D SLAM problem. An adaptive RGB-D SLAM system is proposed, and its performance on the TUM RGB-D dataset is presented and discussed.
Visual SLAM is widely known in robotics for concurrently computing the odometry of a robot and constructing a 3D navigation map with only a camera. In visual SLAM systems, the detection and description of local features are extremely important because they identify unique and invariant points in an observed frame. Although there are various detectors and descriptors, the proper detector/descriptor combination for feature extraction has not yet been generalized for the problem. In this work, a comprehensive performance evaluation of combinations of different feature detectors and descriptors is presented. This evaluation will help determine the best detector/descriptor combination for designing a visual SLAM system based on RGB-D data. The considered methods are evaluated in terms of accuracy and robustness both in a single stage and in the overall visual SLAM system.
Several factors affect the performance of a 3D scene reconstruction system. Among the most important are the choice of feature detectors and descriptors, the number of visual features and the correct matching between them, and the reliable tracking of correspondences along selected keyframes.
In this work, we propose a fast method for generating a 3D map from a time sequence of RGB-D images by selecting the minimum number of keypoints and keyframes that still ensures correct feature correspondences and, as a result, a high-quality 3D map. The performance of the proposed 3D scene reconstruction algorithm is evaluated by computer simulation using real indoor environment data.
It is well known that the accuracy and resolution of depth data decrease as the distance from an RGB-D sensor to a 3D object of interest increases, affecting the performance of 3D scene reconstruction systems based on an ICP algorithm. In this paper, to improve the accuracy of the 3D map obtained by aligning multiple point clouds, we propose: first, to split the depth plane into sub-clouds with similar resolution; then, to select in each sub-cloud a minimum number of keypoints and align the sub-clouds separately with an ICP algorithm; finally, to merge all clouds into a dense 3D map. Computer simulation results show the performance of the proposed 3D scene reconstruction algorithm using real indoor environment data.
In order to design a tracking algorithm invariant to pose, occlusion, clutter, and illumination changes of a scene, non-overlapping signal models for input scenes as well as for objects of interest and the Synthetic Discriminant Function approach are exploited. A set of correlation filters optimal with respect to peak-to-output energy is derived for different target versions in each frame. A prediction method is utilized to locate the target patch in the next frame. The algorithm's performance is tested in terms of recognition and localization errors in real scenarios and compared with that of state-of-the-art tracking algorithms.
With the development of RGB-D sensors, a new alternative for generating 3D maps has appeared. First, features extracted from color and depth images are used to localize them in a 3D scene. Next, the Iterative Closest Point (ICP) algorithm is used to align RGB-D frames. As a result, a new frame is added to the dense 3D model. However, the spatial distribution and resolution of the depth data affect the performance of 3D scene reconstruction systems based on ICP. In this paper we propose to divide the depth data into sub-clouds with similar resolution, to align them separately, and to merge them into the entire point cloud. The presented computer simulation results show an improvement in the accuracy of 3D scene reconstruction using real indoor environment data.
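The sub-cloud idea can be illustrated with a toy rigid-alignment step. The Kabsch SVD solution below stands in for one ICP iteration with known correspondences, and the depth-bin edges are arbitrary; a real system would iterate correspondence search and alignment per sub-cloud:

```python
import numpy as np

def kabsch(P, Q):
    """Rigid alignment (rotation + translation) mapping point set P onto Q
    via SVD; a stand-in for one ICP iteration with known correspondences."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP

def align_by_depth_bins(P, Q, edges):
    """Split the clouds into depth (z) bins, align each sub-cloud
    separately, then merge, mirroring the sub-cloud strategy above."""
    merged = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (P[:, 2] >= lo) & (P[:, 2] < hi)
        if m.sum() >= 3:
            R, t = kabsch(P[m], Q[m])
            merged.append((R @ P[m].T).T + t)
    return np.vstack(merged)

rng = np.random.default_rng(1)
P = rng.uniform(0.0, 5.0, size=(300, 3))
theta = 0.1  # ground-truth rotation about z
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Q = (R_true @ P.T).T + np.array([0.2, -0.1, 0.05])
aligned = align_by_depth_bins(P, Q, edges=[0.0, 2.5, 5.01])
```

With a single global rigid motion, each sub-cloud recovers the same transform; the benefit of binning appears when resolution (and thus noise) varies with depth.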
In this work, we propose a new algorithm for matching incoming video frames in a simultaneous localization and
mapping system based on an RGB-D camera. This system estimates the trajectory of the camera motion in real time
and generates a 3D map of the indoor environment. The proposed algorithm is based on composite correlation filters
with adjustable training sets that depend on the appearance of the indoor environment as well as the relative position
and perspective from the camera to environment components. The algorithm is scale-invariant because it utilizes the
depth information from the RGB-D camera. The performance of the proposed algorithm is evaluated in terms of
accuracy, robustness, and processing time, and is compared with that of common feature-based matching algorithms
based on the SURF descriptor.
This paper considers the face identification task in video sequences where the individual's face presents variations
such as expression, pose, scale, shadow/lighting, and occlusion. The principles of Synthetic Discriminant
Functions (SDF) and k-law filters are used to design an adaptive unconstrained correlation filter (AUNCF). We
developed a face tracking algorithm that, together with a face recognition algorithm, was carefully integrated
into a video-based face identification method. First, a manually selected face in the first video frame is identified.
Then, in order to build an initial correlation filter, the selected face is distorted to generate a training set.
Finally, the face tracking task is performed using the initial correlation filter, which is updated throughout the video
sequence. The efficiency of the proposed method is shown by experiments on video sequences presenting different
facial variations. The proposed method correctly identifies and tracks the face under observation
in the tested video sequences.
Correlation filters have become an important tool for detection, localization, recognition, and tracking of objects in digital
media. Interest in correlation filters has increased thanks to advances in computer processing speed that
enable the implementation of digital correlation filters in real time. This paper compares the performance of three
correlation filters in the task of object recognition, specifically human faces with variations in facial expression, pose,
rotation, partial occlusion, illumination, and additive white Gaussian noise. The analyzed filters are k-law, MACE, and
OTSDF. Simulation results show that the k-law nonlinear composite filter has the best performance in terms of accuracy
and false acceptance rate. Finally, we conclude that a preprocessing algorithm significantly improves the performance of
correlation filters for recognizing objects with variations in illumination and noise.
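The k-law nonlinearity mentioned above can be sketched in a few lines: the Fourier phase is kept and the magnitude is raised to a power k, interpolating between the classical matched filter (k = 1) and the phase-only filter (k = 0). The scene, reference, and parameter values below are illustrative:

```python
import numpy as np

def klaw(F, k=0.3):
    """k-law nonlinearity: keep the Fourier phase, raise the magnitude
    to the power k (k=1: classical matched filter; k=0: phase-only)."""
    return np.abs(F) ** k * np.exp(1j * np.angle(F))

def klaw_correlate(scene, ref, k=0.3):
    """Nonlinear correlation: apply the k-law to both spectra, multiply
    by the conjugate reference spectrum, and invert."""
    S = klaw(np.fft.fft2(scene), k)
    R = klaw(np.fft.fft2(ref, s=scene.shape), k)
    return np.fft.ifft2(S * np.conj(R))

rng = np.random.default_rng(3)
ref = rng.random((64, 64))
scene = np.roll(ref, shift=(10, 20), axis=(0, 1))  # target at offset (10, 20)
corr = np.abs(klaw_correlate(scene, ref))
peak = np.unravel_index(np.argmax(corr), corr.shape)
print(peak)  # (10, 20)
```

Lower values of k sharpen the correlation peak and improve discrimination at the cost of noise tolerance, which matches the trade-off studied in such comparisons.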
Automatic estimation of human activities is a widely studied topic. However, the process becomes difficult when we
want to estimate activities from a video stream, because human activities are dynamic and complex. Furthermore, we
have to take into account the amount of information that images provide, since it makes modelling and estimating
activities hard work. In this paper we propose a method for activity estimation based on object behavior. Objects are
located in a delimited observation area and their handling is recorded with a video camera. Activity estimation can be
done automatically by analyzing the video sequences. The proposed method is called "signature recognition" because it
considers a space-time signature of the behaviour of objects that are used in particular activities (e.g., patients' care in a
healthcare environment for elderly people with restricted mobility). A pulse is produced when an object appears in or
disappears from the observation area, that is, when there is a change from zero to one or vice versa. These changes are
produced by identifying the objects with a bank of nonlinear correlation filters. Each object is processed
independently and produces its own pulses; hence, we are able to recognize several objects with different patterns at the
same time. The method is applied to estimate three healthcare-related activities of elderly people with restricted mobility.
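The pulse mechanism can be sketched as follows; the binary presence signal below is a hypothetical stand-in for the per-frame output of one object's correlation filter:

```python
import numpy as np

def presence_pulses(presence):
    """Turn a per-frame binary presence signal (1 = object detected by its
    correlation filter, 0 = not detected) into appearance and disappearance
    pulses: the frame indices where the signal changes state."""
    d = np.diff(presence.astype(int))
    appears = np.flatnonzero(d == 1) + 1      # 0 -> 1 transitions
    disappears = np.flatnonzero(d == -1) + 1  # 1 -> 0 transitions
    return appears, disappears

# Hypothetical detection signal for one object over 12 frames
sig = np.array([0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0])
a, d = presence_pulses(sig)
print(a, d)  # [2 7] [5 9]
```

Running one such signal per object, independently, yields the space-time signature whose pattern is matched against known activities.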
During a cognitive stimulation session in which elders with cognitive decline perform stimulation activities, such as
solving puzzles, we observed that they require constant supervision and support from their caregivers, and caregivers
must be able to monitor the stimulation activity of more than one patient at a time. In this paper, aiming to provide
support for the caregiver, we developed a vision-based system using a Phase-SDF filter that generates a composite
reference image, which is correlated with a captured wooden-puzzle image. The output correlation value makes it possible to
automatically verify the progress of the puzzle-solving task and to assess its completeness and correctness.
This work presents the development and use of vectorial signature filters obtained by applying properties of scaling and the Fourier transform for image recognition. The filters were applied to different input scenes, which consisted of the 26 letters of the alphabet. Each letter is a 256 × 256 pixel image with a black background and a centered white Arial letter. The image was rotated through 360 degrees in increments of 1° and scaled from 70% to 130% in increments of 0.5%. In order to obtain a new invariant digital correlation system, we computed two one-dimensional vectors after applying different mathematical transformations to the target as well as to the input scene. To recognize a target, the signatures were compared by calculating the Euclidean distance between the target and the input scene; confidence levels are then obtained. The results demonstrate that this system discriminates well between letters.
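The signature-and-distance idea can be sketched with a simple 1-D descriptor. The radial Fourier-magnitude average below is a hypothetical stand-in for the paper's vectorial signatures (the actual construction is not specified in the abstract), chosen because it is exactly invariant to 90-degree rotations:

```python
import numpy as np

def radial_signature(img, n_bins=24):
    """1-D signature: average the Fourier magnitude over rings of
    constant radius around the zero frequency."""
    F = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)
    sums = np.bincount(r.ravel(), weights=F.ravel(), minlength=n_bins)[:n_bins]
    counts = np.bincount(r.ravel(), minlength=n_bins)[:n_bins]
    return sums / np.maximum(counts, 1)

def distance(a, b):
    """Euclidean distance between two signatures; smaller = more alike."""
    return np.linalg.norm(a - b)

letter = np.zeros((64, 64))
letter[16:48, 28:36] = 1.0   # a crude vertical bar, like an 'I'
other = np.eye(64)           # a diagonal stroke, a different "letter"
sig = radial_signature(letter)
sig_rot = radial_signature(np.rot90(letter))  # rotated copy of the same letter
```

A rotated copy of the same letter produces a (near-)zero distance, while a different letter produces a larger one; thresholding these distances yields the confidence levels used for recognition.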
One of the main problems in visual image processing is incomplete information owing to the occlusion of objects by other objects. Since correlation filters mainly use the contour information of objects to carry out pattern recognition, conventional correlation filters without training often perform poorly when recognizing partially occluded objects. Adaptive correlation filters based on synthetic discriminant functions for the recognition of partially occluded objects embedded in a cluttered background are proposed. The designed correlation filters are adaptive to an input test scene, which is constructed from fragments of the target, false objects, and the background to be rejected. These filters are able to suppress sidelobes of the given background as well as false objects. The performance of the adaptive filters in real scenes is compared with that of various correlation filters in terms of discrimination capability and robustness to noise.
New adaptive correlation filters based on a conventional synthetic discriminant function (SDF) for reliable recognition of an object in a cluttered background are proposed. The information about the object to be recognized, false objects, and the background to be rejected is utilized in an iterative training procedure to design a correlation filter with a given value of discrimination capability. Computer simulation results obtained with the proposed adaptive filter in test scenes are discussed and compared with those of various correlation filters in terms of discrimination capability, tolerance to input additive noise, which is always present in image sensors, and tolerance to small geometric image distortions.
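The conventional SDF construction that these adaptive filters build on can be sketched directly: the filter is a linear combination of the training images chosen so that the correlation peak (inner product) takes a prescribed value for each one. The images and peak values below are illustrative; the paper's iterative procedure adds background and false-object information on top of this baseline:

```python
import numpy as np

def sdf_filter(imgs, c):
    """Conventional SDF filter h = X (X^T X)^{-1} c: a linear combination
    of the training images such that <x_i, h> = c_i for each image x_i."""
    X = np.stack([im.ravel() for im in imgs], axis=1)  # one column per image
    a = np.linalg.solve(X.T @ X, c)                    # combination weights
    return (X @ a).reshape(imgs[0].shape)

rng = np.random.default_rng(2)
target = rng.random((16, 16))
false_obj = rng.random((16, 16))
# Prescribe peak 1 for the target class and 0 for the object to reject
h = sdf_filter([target, false_obj], np.array([1.0, 0.0]))
print(np.dot(target.ravel(), h.ravel()),
      np.dot(false_obj.ravel(), h.ravel()))  # ≈ 1.0 and ≈ 0.0
```

The constraints fix only the peak values, not the sidelobes; suppressing sidelobes from clutter is exactly what the iterative adaptive training in the abstract addresses.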
One of the main problems in visual signal processing is incomplete information owing to the occlusion of objects by other objects. It is well known that correlation filters mainly use the contour information of objects to carry out pattern recognition. However, in real applications object contours often disappear. In these cases conventional correlation filters without training yield poor performance. In this paper two novel methods based on correlation filters with training for the recognition of partially occluded objects are proposed. The methods significantly improve the discrimination capability of conventional correlation filters. The first method trains a correlation filter with both a target and objects to be rejected. In the second proposal two different correlation filters are designed; they deal independently with contour and texture information to improve the recognition of partially occluded objects. Computer simulation results for various test images are provided and discussed.