Module es.upm.fi.cig.multictbnc
Class AbstractCSVReader
java.lang.Object
es.upm.fi.cig.multictbnc.data.reader.AbstractCSVReader
- All Implemented Interfaces:
DatasetReader
- Direct Known Subclasses:
MultipleCSVReader,SingleCSVReader
Common attributes and methods for dataset readers.
-
Constructor Summary
Constructors
AbstractCSVReader(String datasetFolder) - Receives the path to the dataset folder and initialises the reader as out-of-date.
-
Method Summary
void extractVariableNames(File csvFile) - Extracts the names of the variables given in a CSV file.
getNameClassVariables() - Returns the names of the class variables.
getNameFeatureVariables() - Returns the names of the feature variables.
getNameTimeVariable() - Returns the name of the time variable.
getNameVariables() - Returns the names of all the variables of the dataset, including those that are not used.
boolean isDatasetOutdated() - Indicates if the dataset is out-of-date.
List<String[]> readCSV(String pathFile, List<String> excludeVariables) - Reads a CSV file.
Dataset readDataset(int numFiles) - Creates a dataset using only the specified number of files.
void removeZeroVarianceVariables(boolean removeZeroVarianceVariables) - Defines if the feature variables with no variance should be removed.
void setDatasetAsOutdated(boolean outdated) - Defines a previously read dataset as out-of-date, so it should be reloaded.
void setTimeAndClassVariables(String nameTimeVariable, List<String> nameClassVariables) - Receives the names of the time and class variables of a dataset.
void setTimeAndFeatureVariables(String nameTimeVariable, List<String> nameFeatureVariables) - Receives the names of the time and feature variables of a dataset.
void setTimeVariable(String nameTimeVariable) - Receives the name of the time variable of a dataset.
void setVariables(String nameTimeVariable, List<String> nameClassVariables, List<String> nameFeatureVariables) - Receives the names of the time variable, feature variables and class variables of a dataset.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface es.upm.fi.cig.multictbnc.data.reader.DatasetReader
readDataset
-
Constructor Details
-
AbstractCSVReader
Receives the path to the dataset folder and initialises the reader as out-of-date. In that way, the dataset will be loaded when it is requested.
- Parameters:
datasetFolder - path to the dataset folder
-
-
Method Details
-
extractVariableNames
Extracts the names of the variables given in a CSV file. This method assumes that the names are in the first row.
- Parameters:
csvFile - CSV file
- Throws:
FileNotFoundException - if the CSV file was not found
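The first-row convention described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class `HeaderSketch` and the method `headerNames` are hypothetical names, and the sketch assumes a comma separator and no quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class HeaderSketch {

    // Returns the trimmed entries of the first row, or an empty list if the source is empty.
    // Assumption: fields are separated by commas and contain no quoted commas.
    static List<String> headerNames(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String firstRow = reader.readLine();
            List<String> names = new ArrayList<>();
            if (firstRow == null) {
                return names;
            }
            // The limit -1 keeps trailing empty fields instead of discarding them.
            for (String name : firstRow.split(",", -1)) {
                names.add(name.trim());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

To read from a file, a `java.io.FileReader` built from the `File` could be passed as the source; its constructor throws `FileNotFoundException` when the file is missing, which matches the exception listed for this method.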
-
readCSV
public List<String[]> readCSV(String pathFile, List<String> excludeVariables) throws VariableNotFoundException, FileNotFoundException
Reads a CSV file.
- Parameters:
pathFile - path to the CSV file
excludeVariables - names of variables to ignore when reading the CSV
- Returns:
list with the rows (arrays of strings) of the CSV
- Throws:
VariableNotFoundException - if a specified variable was not found in the provided files
FileNotFoundException - if the CSV file was not found
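As an illustration of reading rows while ignoring some variables, the sketch below drops named columns from in-memory CSV lines. It is written under assumptions that the documentation does not confirm: the header is kept as the first returned row, fields are comma-separated without quoting, and an unknown excluded name is an error. `ExcludeColumnsSketch` and `readRows` are hypothetical names, and `IllegalArgumentException` stands in for the library's `VariableNotFoundException`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class ExcludeColumnsSketch {

    // Splits each line on commas and drops the columns named in excludeVariables.
    // Assumptions: the first line is the header, it is kept as the first returned row,
    // and fields contain no quoted commas.
    static List<String[]> readRows(List<String> lines, List<String> excludeVariables) {
        List<String[]> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        List<String> header = Arrays.asList(lines.get(0).split(",", -1));
        // Stand-in for VariableNotFoundException: every excluded name must exist.
        for (String name : excludeVariables) {
            if (!header.contains(name)) {
                throw new IllegalArgumentException("Variable not found: " + name);
            }
        }
        for (String line : lines) {
            String[] fields = line.split(",", -1);
            List<String> kept = new ArrayList<>();
            for (int i = 0; i < fields.length; i++) {
                // Fields beyond the header length are kept; only named columns are dropped.
                if (i >= header.size() || !excludeVariables.contains(header.get(i))) {
                    kept.add(fields[i]);
                }
            }
            rows.add(kept.toArray(new String[0]));
        }
        return rows;
    }
}
```

Each returned row is a `String[]`, mirroring the `List<String[]>` return type of `readCSV`.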
-
getNameClassVariables
Description copied from interface: DatasetReader
Returns the names of the class variables.
- Specified by:
getNameClassVariables in interface DatasetReader
- Returns:
names of the class variables
-
getNameFeatureVariables
Description copied from interface: DatasetReader
Returns the names of the feature variables.
- Specified by:
getNameFeatureVariables in interface DatasetReader
- Returns:
names of the feature variables
-
getNameTimeVariable
Description copied from interface: DatasetReader
Returns the name of the time variable.
- Specified by:
getNameTimeVariable in interface DatasetReader
- Returns:
name of the time variable
-
getNameVariables
Description copied from interface: DatasetReader
Returns the names of all the variables of the dataset, including those that are not used.
- Specified by:
getNameVariables in interface DatasetReader
- Returns:
names of the variables
-
isDatasetOutdated
public boolean isDatasetOutdated()
Description copied from interface: DatasetReader
Indicates if the dataset is out-of-date.
- Specified by:
isDatasetOutdated in interface DatasetReader
- Returns:
true if the dataset is out-of-date; false otherwise.
-
readDataset
Description copied from interface: DatasetReader
Creates a dataset using only the specified number of files. This method allows reading datasets in batches.
- Specified by:
readDataset in interface DatasetReader
- Parameters:
numFiles - number of files
- Returns:
a Dataset
- Throws:
UnreadDatasetException - thrown if the dataset could not be read
-
removeZeroVarianceVariables
public void removeZeroVarianceVariables(boolean removeZeroVarianceVariables)
Description copied from interface: DatasetReader
Defines if the feature variables with no variance should be removed.
- Specified by:
removeZeroVarianceVariables in interface DatasetReader
- Parameters:
removeZeroVarianceVariables - true to remove zero variance feature variables, false otherwise
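A hedged sketch of what detecting zero-variance variables could look like: a column is treated as having no variance when it takes at most one distinct value over all rows. That criterion, the class `ZeroVarianceSketch` and the method `zeroVarianceColumns` are assumptions made for illustration; the library may apply the check differently, for example across all sequences of a dataset.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class ZeroVarianceSketch {

    // Returns the names of the columns that take at most one distinct value over all rows.
    // Assumption: rows hold only data (no header) and each row has one field per name.
    static List<String> zeroVarianceColumns(List<String> names, List<String[]> rows) {
        List<String> constant = new ArrayList<>();
        for (int col = 0; col < names.size(); col++) {
            Set<String> distinct = new HashSet<>();
            for (String[] row : rows) {
                distinct.add(row[col]);
            }
            if (distinct.size() <= 1) {
                constant.add(names.get(col));
            }
        }
        return constant;
    }
}
```

The returned names could then be passed as columns to ignore when the data is read again.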
-
setDatasetAsOutdated
public void setDatasetAsOutdated(boolean outdated)
Description copied from interface: DatasetReader
Defines a previously read dataset as out-of-date, so it should be reloaded.
- Specified by:
setDatasetAsOutdated in interface DatasetReader
- Parameters:
outdated - true to set the dataset as out-of-date; false otherwise.
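The out-of-date flag suggests a reload-on-demand pattern: the reader starts out-of-date, loads the dataset when it is requested, and reuses it until it is marked out-of-date again. The sketch below shows that pattern in isolation. `ReloadSketch` and its members are hypothetical names, and the caching behaviour is an assumption about how such a flag is typically used, not a description of the library's internals.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class ReloadSketch<T> {
    private final Supplier<T> loader;
    private T cached;
    // Start out-of-date so that the first request triggers a load.
    private boolean outdated = true;

    ReloadSketch(Supplier<T> loader) {
        this.loader = loader;
    }

    boolean isOutdated() {
        return outdated;
    }

    void setOutdated(boolean outdated) {
        this.outdated = outdated;
    }

    // Reloads only when the cached value is marked out-of-date.
    T get() {
        if (outdated) {
            cached = loader.get();
            outdated = false;
        }
        return cached;
    }
}
```

In this sketch, changing which variables are read would be followed by `setOutdated(true)` so that the next `get()` reloads the data.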
-
setTimeAndClassVariables
Description copied from interface: DatasetReader
Receives the names of the time and class variables of a dataset. All the other variables are considered feature variables.
- Specified by:
setTimeAndClassVariables in interface DatasetReader
- Parameters:
nameTimeVariable - name of the time variable
nameClassVariables - names of the class variables
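The rule that all remaining variables are feature variables can be sketched as a simple filter over the variable names. `FeatureNamesSketch` and `featureNames` are hypothetical names used only for illustration; how the library derives the feature variables internally is not shown in this documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of es.upm.fi.cig.multictbnc.
class FeatureNamesSketch {

    // Returns every name that is neither the time variable nor a class variable,
    // preserving the order in which the names appear in allNames.
    static List<String> featureNames(List<String> allNames, String timeName, List<String> classNames) {
        List<String> features = new ArrayList<>();
        for (String name : allNames) {
            if (!name.equals(timeName) && !classNames.contains(name)) {
                features.add(name);
            }
        }
        return features;
    }
}
```

For a header with a time variable, two other columns and one class variable, the call returns the two remaining columns as features.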
-
setTimeAndFeatureVariables
Description copied from interface: DatasetReader
Receives the names of the time and feature variables of a dataset. This method can be used, for example, when reading datasets to be classified.
- Specified by:
setTimeAndFeatureVariables in interface DatasetReader
- Parameters:
nameTimeVariable - name of the time variable
nameFeatureVariables - names of the feature variables
-
setTimeVariable
Description copied from interface: DatasetReader
Receives the name of the time variable of a dataset.
- Specified by:
setTimeVariable in interface DatasetReader
- Parameters:
nameTimeVariable - name of the time variable
-
setVariables
public void setVariables(String nameTimeVariable, List<String> nameClassVariables, List<String> nameFeatureVariables)
Description copied from interface: DatasetReader
Receives the names of the time variable, feature variables and class variables of a dataset. This method can be used, for example, when reading training datasets.
- Specified by:
setVariables in interface DatasetReader
- Parameters:
nameTimeVariable - name of the time variable
nameClassVariables - names of the class variables
nameFeatureVariables - names of the feature variables
-