Fuzzy Methods for Machine Learning. A Big Data Perspective / Marco Barsacchi. - (2019).
Fuzzy Methods for Machine Learning. A Big Data Perspective
BARSACCHI, MARCO
2019
Abstract
More than fifty years after its introduction, fuzzy set theory is still thriving and continues to play a relevant role in a wide range of scientific applications. Nevertheless, while the enrichments that fuzzy logic and fuzzy set theory can provide are manifold, their recognition within the machine learning community remains rather moderate. In this thesis, we present several approaches aimed at improving machine learning techniques using tools borrowed from fuzzy set theory and fuzzy logic. In particular, we focus on the machine learning perspective, thus inviting machine learning researchers to appreciate the modelling strengths of fuzzy set theory. We begin by presenting FDT-Boost, a boosting approach shaped according to the SAMME-AdaBoost scheme, which leverages fuzzy binary decision trees as base classifiers. We then explore a distributed fuzzy random forest (DFRF), which leverages the Apache Spark framework to generate an efficient and effective classifier for big data. We also propose a novel approach for generating, out of big data, a set of fuzzy rule-based classifiers characterised by different optimal trade-offs between accuracy and interpretability. The approach, dubbed DPAES-FDT-GL, extends a state-of-the-art distributed multi-objective evolutionary learning scheme implemented in the Apache Spark environment. Lastly, we focus on an application, showing how fuzzy systems can support medical decision-making: we propose a novel pipeline for tumour type classification and rule extraction based on somatic CNV data. The pipeline outputs an interpretable Fuzzy Rule-Based Classifier (FRBC). Much work remains to be done, and fuzzy set theory still has a big role to play in machine learning.

File | Access | Type | Licence | Size | Format
---|---|---|---|---|---
PhdThesis_MBarsacchi.pdf | Open Access from 26/02/2022 | Doctoral thesis | Open Access | 2.77 MB | Adobe PDF
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.