Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices
The problem of incomplete data matrices is frequently encountered in large databases, posing a significant obstacle to effective data treatment. This paper examines a self-organizing-map (SOM) based method of data imputation under the concept of distance object per one weight, to predict physicoche...
Main authors: | Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
---|---|
Format: | Full text |
Language: | English |
Published: | Elsevier Science Bv, 2015 |
Subjects: | |
Online access: | https://ri.unsam.edu.ar/handle/123456789/1009 |
id |
ds-123456789-1009 |
---|---|
record_format |
dspace |
institution |
Repositorio Institucional |
collection |
RI |
language |
English |
topic |
CHEMOMETRICS ARTIFICIAL NEURAL NETWORK SELF-ORGANIZING MAPS MISSING DATA IMPUTATION ENVIRONMENTAL DATA SET CHEMICAL SCIENCES EXACT AND NATURAL SCIENCES |
description |
The problem of incomplete data matrices is frequently encountered in large databases, posing a significant obstacle to effective data treatment. This paper examines a self-organizing-map (SOM) based method of data imputation under the concept of distance object per one weight, to predict physicochemical parameters of water samples in a data set where concentrations of different analytes were missing. The method was evaluated under two scenarios: (a) including vectors of samples with and without missing data in the training data set and (b) pre-training a SOM on a data set with no missing values and then making imputations for a second data set (prediction set) of samples with missing values. Evaluations were made using a surface water data set of 270 samples from the Reconquista River, in Buenos Aires Province, Argentina, by artificially setting a range of 17% to 39% of the data to missing. Results were compared to imputations made through professional criteria. SOMs gave reasonable estimates, with no statistically significant differences from estimates made through professional criteria, thus proving to be a suitable time-saving imputation method. |
format |
Full text |
author |
Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
author_sort |
Folguera, Laura |
title |
Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
publisher |
Elsevier Science Bv |
publishDate |
2015 |
url |
https://ri.unsam.edu.ar/handle/123456789/1009 |
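The description above outlines SOM-based imputation only at the level of an abstract. The following is a minimal sketch of the general idea, not the authors' implementation. It assumes one possible reading of "distance per one weight": the squared distance between a sample and a neuron is computed over the observed variables only and divided by their count. All function names, the 3x3 grid, the decay schedules and the toy data are hypothetical choices made for illustration.

```python
import math
import random

def masked_distance(sample, weights):
    """Mean squared difference over the observed (non-None) variables only."""
    pairs = [(x, w) for x, w in zip(sample, weights) if x is not None]
    if not pairs:
        raise ValueError("sample has no observed values")
    return sum((x - w) ** 2 for x, w in pairs) / len(pairs)

def best_matching_unit(sample, som):
    """Return the grid position of the neuron closest to the sample."""
    return min(som, key=lambda pos: masked_distance(sample, som[pos]))

def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rectangular SOM; missing entries (None) are skipped."""
    rng = random.Random(seed)
    n_vars = len(data[0])
    # Initialise each weight uniformly within the observed range of its variable
    # (assumes every variable is observed in at least one sample).
    lo, hi = [], []
    for j in range(n_vars):
        col = [s[j] for s in data if s[j] is not None]
        lo.append(min(col))
        hi.append(max(col))
    som = {(r, c): [rng.uniform(lo[j], hi[j]) for j in range(n_vars)]
           for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac) + 0.01                      # learning rate decays linearly
        radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5   # neighbourhood shrinks
        for sample in rng.sample(data, len(data)):            # shuffled pass over the data
            br, bc = best_matching_unit(sample, som)
            for (r, c), w in som.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
                for j, x in enumerate(sample):
                    if x is not None:                         # only observed variables update weights
                        w[j] += rate * h * (x - w[j])
    return som

def impute(sample, som):
    """Fill missing entries with the weights of the best matching neuron."""
    bmu = som[best_matching_unit(sample, som)]
    return [w if x is None else x for x, w in zip(sample, bmu)]

# Scenario (b): train on complete samples, then impute a sample from a prediction set.
complete = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.2], [0.9, 0.8, 0.9], [0.8, 0.9, 0.8]]
som = train_som(complete)
filled = impute([0.85, None, 0.85], som)
```

Because `train_som` skips `None` entries both when searching for the best matching neuron and when updating weights, the same function can also be called on a data set that mixes complete and incomplete samples, which corresponds to scenario (a). Variables measured on different scales would need to be normalised first, a step this sketch omits.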