Class AbstractCSVReader

java.lang.Object
es.upm.fi.cig.multictbnc.data.reader.AbstractCSVReader
All Implemented Interfaces:
DatasetReader
Direct Known Subclasses:
MultipleCSVReader, SingleCSVReader

public abstract class AbstractCSVReader extends Object implements DatasetReader
Common attributes and methods for dataset readers.
  • Constructor Details

    • AbstractCSVReader

      public AbstractCSVReader(String datasetFolder)
      Receives the path to the dataset folder and initialises the reader as out-of-date. This way, the dataset is loaded when it is requested.
      Parameters:
      datasetFolder - path to the dataset folder
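The out-of-date flag described above amounts to a lazy-loading pattern. The following is a minimal sketch of that idea, not the library's implementation: the class `LazyReaderSketch`, its `Supplier`-based loader and the `loadCount` counter are all hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: an "out-of-date" flag that triggers loading on request.
// None of these names belong to the documented library.
final class LazyReaderSketch {
    private final Supplier<List<String>> loader; // stands in for reading the dataset folder
    private List<String> cached;                 // last loaded data, if any
    private boolean outdated = true;             // starts out-of-date, so the first request loads
    private int loadCount = 0;                   // only here to make the behaviour observable

    LazyReaderSketch(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    boolean isOutdated() {
        return outdated;
    }

    void setOutdated(boolean outdated) {
        this.outdated = outdated;
    }

    int getLoadCount() {
        return loadCount;
    }

    // Reloads only when the cached data is marked as out-of-date.
    List<String> get() {
        if (outdated) {
            cached = loader.get();
            loadCount++;
            outdated = false;
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyReaderSketch reader = new LazyReaderSketch(() -> List.of("row1", "row2"));
        reader.get();              // first request triggers a load
        reader.get();              // served from the cache
        reader.setOutdated(true);  // force a reload on the next request
        reader.get();
        System.out.println(reader.getLoadCount()); // prints 2
    }
}
```

Keeping the flag separate from the cached data lets a caller change the reader's configuration and mark the data as stale without paying for a reload until the dataset is actually needed.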
  • Method Details

    • extractVariableNames

      public void extractVariableNames(File csvFile) throws FileNotFoundException
      Extracts the names of the variables from a CSV file. This method assumes that the names are in the first row.
      Parameters:
      csvFile - CSV file
      Throws:
      FileNotFoundException - if the CSV file was not found
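To illustrate the first-row assumption, here is a minimal sketch of pulling variable names out of a header line. It is not the library's code: `CsvHeaderSketch` and `headerNames` are hypothetical, the delimiter is assumed to be a plain comma with no quoted fields, and the lines are passed in memory (in practice they could come from something like `Files.readAllLines`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: variable names are taken from the first line only.
final class CsvHeaderSketch {

    // Splits the first line on commas and trims each name.
    // Assumes no quoted fields or escaped delimiters.
    static List<String> headerNames(List<String> lines) {
        List<String> names = new ArrayList<>();
        if (lines.isEmpty()) {
            return names;
        }
        // The -1 limit keeps trailing empty fields instead of dropping them.
        for (String name : lines.get(0).split(",", -1)) {
            names.add(name.trim());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("time, X1,C1", "0.0,a,b");
        System.out.println(headerNames(lines)); // prints [time, X1, C1]
    }
}
```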
    • readCSV

      public List<String[]> readCSV(String pathFile, List<String> excludeVariables) throws VariableNotFoundException, FileNotFoundException
      Reads a CSV file.
      Parameters:
      pathFile - path to the CSV file
      excludeVariables - names of variables to ignore when reading the CSV
      Returns:
      list with the rows (arrays of strings) of the CSV
      Throws:
      VariableNotFoundException - if a specified variable was not found in the provided files
      FileNotFoundException - if the CSV file was not found
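The effect of `excludeVariables` can be pictured as dropping named columns from every row. The sketch below works under several assumptions: `CsvExcludeSketch` and `dropColumns` are hypothetical names, the first row is assumed to be the header and is kept in the result, and `IllegalArgumentException` stands in for the library's `VariableNotFoundException`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: remove the columns whose header name is excluded.
final class CsvExcludeSketch {

    // rows.get(0) is treated as the header; it is kept (filtered) in the result.
    static List<String[]> dropColumns(List<String[]> rows, List<String> excluded) {
        List<String[]> result = new ArrayList<>();
        if (rows.isEmpty()) {
            return result;
        }
        List<String> header = Arrays.asList(rows.get(0));
        // Fail early if an excluded name does not exist in the header.
        for (String name : excluded) {
            if (!header.contains(name)) {
                throw new IllegalArgumentException("Variable not found: " + name);
            }
        }
        for (String[] row : rows) {
            List<String> kept = new ArrayList<>();
            // Values beyond the header width are ignored in this sketch.
            for (int i = 0; i < row.length && i < header.size(); i++) {
                if (!excluded.contains(header.get(i))) {
                    kept.add(row[i]);
                }
            }
            result.add(kept.toArray(new String[0]));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"time", "X1", "C1"},
                new String[] {"0.0", "a", "b"});
        for (String[] row : dropColumns(rows, List.of("X1"))) {
            System.out.println(Arrays.toString(row));
        }
        // prints [time, C1] and then [0.0, b]
    }
}
```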
    • getNameClassVariables

      public List<String> getNameClassVariables()
      Description copied from interface: DatasetReader
      Returns the names of the class variables.
      Specified by:
      getNameClassVariables in interface DatasetReader
      Returns:
      names of the class variables.
    • getNameFeatureVariables

      public List<String> getNameFeatureVariables()
      Description copied from interface: DatasetReader
      Returns the names of the feature variables.
      Specified by:
      getNameFeatureVariables in interface DatasetReader
      Returns:
      names of the feature variables.
    • getNameTimeVariable

      public String getNameTimeVariable()
      Description copied from interface: DatasetReader
      Returns the name of the time variable.
      Specified by:
      getNameTimeVariable in interface DatasetReader
      Returns:
      name of the time variable.
    • getNameVariables

      public List<String> getNameVariables()
      Description copied from interface: DatasetReader
      Returns the names of all the variables of the dataset, including those that are not used.
      Specified by:
      getNameVariables in interface DatasetReader
      Returns:
      names of the variables
    • isDatasetOutdated

      public boolean isDatasetOutdated()
      Description copied from interface: DatasetReader
      Indicates if the dataset is out-of-date.
      Specified by:
      isDatasetOutdated in interface DatasetReader
      Returns:
      true if dataset is out-of-date; false otherwise.
    • readDataset

      public Dataset readDataset(int numFiles) throws UnreadDatasetException
      Description copied from interface: DatasetReader
      Creates a dataset using only the specified number of files. This method allows reading datasets in batches.
      Specified by:
      readDataset in interface DatasetReader
      Parameters:
      numFiles - number of files
      Returns:
      a Dataset
      Throws:
      UnreadDatasetException - thrown if the dataset could not be read
    • removeZeroVarianceVariables

      public void removeZeroVarianceVariables(boolean removeZeroVarianceVariables)
      Description copied from interface: DatasetReader
      Defines if the feature variables with no variance should be removed.
      Specified by:
      removeZeroVarianceVariables in interface DatasetReader
      Parameters:
      removeZeroVarianceVariables - true to remove zero variance feature variables; false otherwise.
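A zero-variance variable is one that takes a single value across all observations. The following sketch shows one way such columns might be detected. It is an assumption about the general technique, not the library's code: `ZeroVarianceSketch` and `constantColumns` are hypothetical, and values are compared as raw strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: find columns whose data rows all share one value.
final class ZeroVarianceSketch {

    // Assumes every data row has at least header.length entries.
    // With no data rows at all, every column is reported as constant.
    static List<String> constantColumns(String[] header, List<String[]> dataRows) {
        List<String> constant = new ArrayList<>();
        for (int col = 0; col < header.length; col++) {
            Set<String> distinct = new HashSet<>();
            for (String[] row : dataRows) {
                distinct.add(row[col]);
            }
            if (distinct.size() <= 1) {
                constant.add(header[col]);
            }
        }
        return constant;
    }

    public static void main(String[] args) {
        String[] header = {"X1", "X2"};
        List<String[]> data = List.of(
                new String[] {"a", "1"},
                new String[] {"a", "2"});
        System.out.println(constantColumns(header, data)); // prints [X1]
    }
}
```

In this sketch the caller would pass only the feature columns, since the documented option applies to feature variables.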
    • setDatasetAsOutdated

      public void setDatasetAsOutdated(boolean outdated)
      Description copied from interface: DatasetReader
      Defines a previously read dataset as out-of-date, so it should be reloaded.
      Specified by:
      setDatasetAsOutdated in interface DatasetReader
      Parameters:
      outdated - true to set dataset as out-of-date; false otherwise.
    • setTimeAndClassVariables

      public void setTimeAndClassVariables(String nameTimeVariable, List<String> nameClassVariables)
      Description copied from interface: DatasetReader
      Receives the names of the time and class variables of a dataset. All the other variables are considered feature variables.
      Specified by:
      setTimeAndClassVariables in interface DatasetReader
      Parameters:
      nameTimeVariable - name of the time variable
      nameClassVariables - names of the class variables
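The rule that all remaining variables are treated as features can be sketched as a simple filter over the full list of names. `FeatureNamesSketch` and `featureNames` are hypothetical names used for illustration only; how the library stores or derives these lists internally is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: features are whatever is neither the time variable
// nor one of the class variables.
final class FeatureNamesSketch {

    static List<String> featureNames(List<String> allNames, String timeName, List<String> classNames) {
        List<String> features = new ArrayList<>();
        for (String name : allNames) {
            if (!name.equals(timeName) && !classNames.contains(name)) {
                features.add(name);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        List<String> all = List.of("t", "X1", "X2", "C1");
        System.out.println(featureNames(all, "t", List.of("C1"))); // prints [X1, X2]
    }
}
```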
    • setTimeAndFeatureVariables

      public void setTimeAndFeatureVariables(String nameTimeVariable, List<String> nameFeatureVariables)
      Description copied from interface: DatasetReader
      Receives the names of the time and feature variables of a dataset. This method can be used, for example, when reading datasets to be classified.
      Specified by:
      setTimeAndFeatureVariables in interface DatasetReader
      Parameters:
      nameTimeVariable - name of the time variable
      nameFeatureVariables - names of the feature variables
    • setTimeVariable

      public void setTimeVariable(String nameTimeVariable)
      Description copied from interface: DatasetReader
      Receives the name of the time variable of a dataset.
      Specified by:
      setTimeVariable in interface DatasetReader
      Parameters:
      nameTimeVariable - name of the time variable
    • setVariables

      public void setVariables(String nameTimeVariable, List<String> nameClassVariables, List<String> nameFeatureVariables)
      Description copied from interface: DatasetReader
      Receives the names of the time variable, feature variables and class variables of a dataset. This method can be used, for example, when reading training datasets.
      Specified by:
      setVariables in interface DatasetReader
      Parameters:
      nameTimeVariable - name of the time variable
      nameClassVariables - names of the class variables
      nameFeatureVariables - names of the feature variables