Clustering-Based Self-Imputation of Unlabeled Fault Data in a Fleet of Photovoltaic Generation Systems

This work proposes a fault detection and imputation scheme for a fleet of small-scale photovoltaic (PV) systems, where the captured data includes unlabeled faults. On-site meteorological information, such as solar irradiance, is helpful for monitoring PV systems. However, collecting this type of wea...

Bibliographic Details
Main Authors: Sunme Park, Soyeong Park, Myungsun Kim, Euiseok Hwang
Format: Article
Language: English
Published: MDPI AG 2020-02-01
Series: Energies
Subjects: pv fleet; clustering-based pv fault detection; unsupervised learning; self-imputation
Online Access: https://www.mdpi.com/1996-1073/13/3/737
id doaj-d938f41d2ab84960b59ab07a7e48516a
record_format Article
spelling doaj-d938f41d2ab84960b59ab07a7e48516a
2020-11-25T02:03:34Z
eng
MDPI AG
Energies
1996-1073
2020-02-01
13
3
737
10.3390/en13030737
en13030737
Clustering-Based Self-Imputation of Unlabeled Fault Data in a Fleet of Photovoltaic Generation Systems
Sunme Park 0
Soyeong Park 1
Myungsun Kim 2
Euiseok Hwang 3
School of Mechanical Engineering, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea
School of Mechanical Engineering, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea
School of Mechanical Engineering, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea
School of Mechanical Engineering, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea
This work proposes a fault detection and imputation scheme for a fleet of small-scale photovoltaic (PV) systems whose captured data include unlabeled faults. On-site meteorological information, such as solar irradiance, is helpful for monitoring PV systems. However, collecting this type of weather data at every station is not feasible for a fleet because of installation costs. In this study, to monitor a PV fleet efficiently, neighboring PV generation profiles were used together with solar irradiance for fault detection and imputation. For fault detection from unlabeled raw PV data, K-means clustering was employed to detect abnormal patterns based on customized input features extracted from the fleet PV and weather data. When a profile was determined to have an abnormal pattern, the corresponding data were imputed using the subset of neighboring PV data clustered as normal. For evaluation, the effectiveness of neighboring PV information was investigated using actual rooftop PV power generation data measured at several locations on the Gwangju Institute of Science and Technology (GIST) campus. The results indicate that neighboring PV profiles improve both the fault detection capability and the imputation accuracy. For fault detection, the clustering-based schemes yielded error rates of 0.0126 with neighboring PV data and 0.0223 without, whereas the conventional prediction-based approach showed an error rate of 0.0753. For imputation, leveraging the fault detection labels in the proposed scheme significantly improved estimation accuracy, reducing the normalized root mean square error (NRMSE) by as much as 18.32% compared with the conventional scheme without fault consideration.
https://www.mdpi.com/1996-1073/13/3/737
pv fleet
clustering-based pv fault detection
unsupervised learning
self-imputation
collection DOAJ
language English
format Article
sources DOAJ
author Sunme Park
Soyeong Park
Myungsun Kim
Euiseok Hwang
title Clustering-Based Self-Imputation of Unlabeled Fault Data in a Fleet of Photovoltaic Generation Systems
publisher MDPI AG
series Energies
issn 1996-1073
publishDate 2020-02-01
description This work proposes a fault detection and imputation scheme for a fleet of small-scale photovoltaic (PV) systems whose captured data include unlabeled faults. On-site meteorological information, such as solar irradiance, is helpful for monitoring PV systems. However, collecting this type of weather data at every station is not feasible for a fleet because of installation costs. In this study, to monitor a PV fleet efficiently, neighboring PV generation profiles were used together with solar irradiance for fault detection and imputation. For fault detection from unlabeled raw PV data, K-means clustering was employed to detect abnormal patterns based on customized input features extracted from the fleet PV and weather data. When a profile was determined to have an abnormal pattern, the corresponding data were imputed using the subset of neighboring PV data clustered as normal. For evaluation, the effectiveness of neighboring PV information was investigated using actual rooftop PV power generation data measured at several locations on the Gwangju Institute of Science and Technology (GIST) campus. The results indicate that neighboring PV profiles improve both the fault detection capability and the imputation accuracy. For fault detection, the clustering-based schemes yielded error rates of 0.0126 with neighboring PV data and 0.0223 without, whereas the conventional prediction-based approach showed an error rate of 0.0753. For imputation, leveraging the fault detection labels in the proposed scheme significantly improved estimation accuracy, reducing the normalized root mean square error (NRMSE) by as much as 18.32% compared with the conventional scheme without fault consideration.
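The detect-then-impute idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not specify the customized input features, so the feature used here (each reading divided by the median of the other sites) is a hypothetical stand-in, weather data are omitted, the site names and readings are made up, and the NRMSE is normalized by the range of the true values, which is one common convention that the record does not confirm.

```python
import math
import statistics


def kmeans_1d(values, k=2, iters=50):
    """Plain Lloyd's K-means on scalars; returns (labels, centroids)."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: spread the centroids over the value range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        new = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new.append(sum(members) / len(members) if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels, centroids


def detect_and_impute(fleet):
    """fleet: {site: [capacity-normalized output per time step]}.

    Returns (flags, imputed), where flags[site][t] is True for readings
    that fall in the low-ratio (assumed abnormal) cluster.
    """
    sites = list(fleet)
    n = len(fleet[sites[0]])
    feats, index = [], []
    for s in sites:
        for t in range(n):
            # Hypothetical feature: reading relative to the neighbors' median.
            ref = statistics.median(fleet[o][t] for o in sites if o != s)
            feats.append(fleet[s][t] / ref if ref > 0 else 1.0)
            index.append((s, t))
    labels, centroids = kmeans_1d(feats, k=2)
    abnormal = min(range(2), key=lambda c: centroids[c])
    flags = {s: [False] * n for s in sites}
    for (s, t), lab in zip(index, labels):
        flags[s][t] = lab == abnormal
    imputed = {s: list(v) for s, v in fleet.items()}
    for s in sites:
        for t in range(n):
            if flags[s][t]:
                # Impute only from neighbors clustered as normal at time t.
                normal = [fleet[o][t] for o in sites if o != s and not flags[o][t]]
                if normal:
                    imputed[s][t] = sum(normal) / len(normal)
    return flags, imputed


def nrmse(estimate, truth):
    """RMSE divided by the range of the true values (assumed normalization)."""
    rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))
    return rmse / (max(truth) - min(truth))


# Made-up fleet of four sites; site "C" drops out at time steps 2 and 3.
fleet = {
    "A": [0.20, 0.50, 0.80, 0.90, 0.60, 0.30],
    "B": [0.21, 0.52, 0.78, 0.88, 0.61, 0.29],
    "C": [0.19, 0.49, 0.05, 0.00, 0.59, 0.31],
    "D": [0.20, 0.51, 0.81, 0.91, 0.60, 0.30],
}
flags, imputed = detect_and_impute(fleet)
truth_c = [0.19, 0.49, 0.79, 0.89, 0.59, 0.31]  # hypothetical fault-free values
print(flags["C"])
print(nrmse(fleet["C"], truth_c), nrmse(imputed["C"], truth_c))
```

Because K-means with two clusters always produces a split, this sketch flags the lower cluster even when no fault is present; a practical scheme would need a check on how far apart the two centroids are before treating one cluster as abnormal.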
topic pv fleet
clustering-based pv fault detection
unsupervised learning
self-imputation
url https://www.mdpi.com/1996-1073/13/3/737