Summary: | This thesis presents a series of investigations that introduce novel applications of intelligent algorithms in informatics and analytics. The research aims to develop novel methodologies for dimension reduction and clustering of highly non-linear multidimensional data. Two fundamental approaches are taken to improve on the performance of existing methodologies. The first is a novel structural re-arrangement that hybridises two conventional intelligent algorithms, Auto-Associative Neural Networks (AANN) and Self Organizing Maps (SOM), to improve data clustering. The second is to enhance data clustering and classification performance by introducing novel fundamental algorithmic changes, known as M3-SOM, to the data processing and training procedure of conventional SOM. Both approaches are tested, benchmarked and analysed on three datasets (Iris Flowers, Italian Olive Oils and Wine) through case studies on dimension reduction, clustering and classification of complex, non-linear data. The study of AANN alone shows that this non-linear algorithm can efficiently reduce the dimensions of all three datasets. This paves the way for structurally hybridising AANN, as the dimension reduction method, with SOM, as the clustering method, into AANNSOM for enhanced data clustering. The hybrid AANNSOM is then introduced and applied to cluster the Iris Flowers, Italian Olive Oils and Wine datasets. Compared with SOM, the hybrid methodology improves data clustering accuracy, reduces quantisation errors and decreases computational time in all case studies. However, the topographic errors are inconsistent across the studies, and it remains difficult for both AANNSOM and SOM to provide additional inherent information about the datasets, such as the exact position of a data point within a cluster.
Therefore, M3-SOM, a novel methodology based on the SOM training algorithm, is proposed, developed and studied on the same datasets. M3-SOM improves data clustering and classification accuracy in all three case studies when compared with conventional SOM. It is also able to obtain inherent information about the position of a data point or "sub-cluster" relative to other data points or sub-clusters within the same class in the Iris Flowers and Wine datasets. Nevertheless, it has difficulty achieving the same level of performance when clustering the Italian Olive Oils data because of the high number of data classes. Overall, it can be concluded that both methodologies improve data clustering and classification performance and help to discover inherent information inside multidimensional data.
|