Publication:
On the Suitability of Bagging-Based Ensembles with Borderline Label Noise

dc.contributor.author: Saez, Jose A.
dc.contributor.author: Romero-Bejar, Jose L.
dc.contributor.authoraffiliation: [Saez, Jose A.] Univ Granada, Dept Stat & Operat Res, Fuentenueva S-N, E-18071 Granada, Spain
dc.contributor.authoraffiliation: [Romero-Bejar, Jose L.] Univ Granada, Dept Stat & Operat Res, Fuentenueva S-N, E-18071 Granada, Spain
dc.contributor.authoraffiliation: [Romero-Bejar, Jose L.] Inst Invest Biosanitaria IbsGRANADA, Granada 18012, Spain
dc.contributor.authoraffiliation: [Romero-Bejar, Jose L.] Univ Granada IMAG, Inst Math, Ventanilla 11, Granada 18001, Spain
dc.contributor.funder: MCIU/AEI/ERDF, UE
dc.contributor.funder: ERDF Operational Programme 2014-2020
dc.contributor.funder: Economy and Knowledge Council of the Regional Government of Andalusia, Spain
dc.contributor.funder: MCIN/AEI
dc.date.accessioned: 2023-05-03T14:13:25Z
dc.date.available: 2023-05-03T14:13:25Z
dc.date.issued: 2022-05-26
dc.description.abstract: Real-world classification data usually contain noise, which can affect both the accuracy and the complexity of the models. In this context, an interesting approach to reducing the effects of noise is to build ensembles of classifiers, which have traditionally been credited with the ability to tackle difficult problems. Among the alternatives for building ensembles with noisy data, bagging has shown some potential in the specialized literature. However, existing works in this field are limited and focus only on noise based on random mislabeling, which is unlikely to occur in real-world applications. Recent research shows that other types of noise, such as that occurring at class boundaries, are more common and more challenging for classification algorithms. This paper analyzes the use of bagging techniques in these complex problems, in which noise affects the decision boundaries among classes. To investigate whether bagging is able to reduce the impact of borderline noise, an experimental study is carried out considering a large number of datasets with different noise levels, as well as several noise models and classification algorithms. The results show that bagging achieves better accuracy and robustness than the individual models with this complex type of noise. The highest improvements in average accuracy are around 2-4% and are generally found at medium-high noise levels (from 15-20% onwards). Because each subsample drawn from the original training set in bagging includes only some of the noisy samples, only some parts of the decision boundaries among classes may be impaired when building each model, which can reduce the impact of noise on the global system.
dc.description.version: Yes
dc.identifier.citation: Saez, Jose A., Romero-Bejar, Jose L. On the Suitability of Bagging-Based Ensembles with Borderline Label Noise; Mathematics 2022, 10(11), 1892
dc.identifier.doi: 10.3390/math10111892
dc.identifier.essn: 2227-7390
dc.identifier.unpaywallURL: https://www.mdpi.com/2227-7390/10/11/1892/pdf?version=1654071599
dc.identifier.uri: http://hdl.handle.net/10668/21411
dc.identifier.wosID: 808711300001
dc.issue.number: 11
dc.journal.title: Mathematics
dc.journal.titleabbreviation: Mathematics
dc.language.iso: en
dc.organization: Instituto de Investigación Biosanitaria de Granada (ibs.GRANADA)
dc.publisher: MDPI
dc.relation.publisherversion: mdpi.com/2227-7390/10/11/1892
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: borderline noise
dc.subject: label noise
dc.subject: bagging
dc.subject: ensembles
dc.subject: robust learners
dc.subject: classification
dc.subject: Nonparametric statistical tests
dc.subject.decs: Weights and measures
dc.subject.decs: Classification
dc.subject.decs: Decision trees
dc.subject.decs: Algorithms
dc.subject.mesh: Complexity-measures
dc.subject.mesh: Decision trees
dc.subject.mesh: Classification
dc.subject.mesh: Machine
dc.subject.mesh: Model
dc.subject.mesh: Classifiers
dc.subject.mesh: Ranking
dc.subject.mesh: Robust
dc.subject.mesh: Algorithms
dc.title: On the Suitability of Bagging-Based Ensembles with Borderline Label Noise
dc.type: research article
dc.type.hasVersion: VoR
dc.volume.number: 10
dc.wostype: Article
dspace.entity.type: Publication

Files

Original bundle

Name:
Saez_OnTheSuitability.pdf
Size:
509.49 KB
Format:
Adobe Portable Document Format