Class DataStreamMultipleCSVReader

java.lang.Object
es.upm.fi.cig.multictbnc.data.reader.DataStreamMultipleCSVReader

public class DataStreamMultipleCSVReader extends Object
Reads and processes streaming data from multiple CSV files.
  • Constructor Details

    • DataStreamMultipleCSVReader

      public DataStreamMultipleCSVReader(String datasetFolder, String nameTimeVariable, List<String> nameClassVariables) throws UnreadDatasetException, FileNotFoundException
      Prepares the reader to process CSV files from the specified folder. The CSV files are sorted by name, and the constructor checks that they exist.
      Parameters:
      datasetFolder - path to the folder containing the CSV files
      nameTimeVariable - name of the time variable
      nameClassVariables - list of names of class variables
      Throws:
      UnreadDatasetException - if the datasets could not be read because no CSV files were found in the specified folder
      FileNotFoundException - if a file cannot be read
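The folder scan described for the constructor (collect the CSV files, sort them by name, fail if none exist) can be sketched as below. This is a minimal illustration, not the class's actual implementation: `CsvFolderScan` and `listSortedCsvFiles` are hypothetical names, and `IllegalStateException` stands in for `UnreadDatasetException` so that the snippet compiles without the library.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration only; not part of the documented API.
class CsvFolderScan {

    /** Returns the CSV files found in the folder, sorted by file name. */
    static List<File> listSortedCsvFiles(String datasetFolder) throws FileNotFoundException {
        File folder = new File(datasetFolder);
        // listFiles returns null if the path is not a directory or cannot be read
        File[] files = folder.listFiles((dir, name) -> name.toLowerCase().endsWith(".csv"));
        if (files == null) {
            throw new FileNotFoundException("Folder not found or not readable: " + datasetFolder);
        }
        if (files.length == 0) {
            // Assumption: the documented class reports this case with UnreadDatasetException
            throw new IllegalStateException("No CSV files found in " + datasetFolder);
        }
        List<File> sorted = new ArrayList<>(Arrays.asList(files));
        // Sort by name so that files are consumed in a deterministic order
        sorted.sort(Comparator.comparing(File::getName));
        return sorted;
    }
}
```

Sorting by name gives a reproducible reading order, which matters when the file names encode the order in which the data arrived.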
  • Method Details

    • isDataArriving

      public boolean isDataArriving()
      Checks if there is more data to be read.
      Returns:
      true if there are data arriving from the data stream, false otherwise.
    • readDataset

      public Dataset readDataset(int numFiles) throws UnreadDatasetException, FileNotFoundException
      Reads the specified number of CSV files from the dataset folder and processes them into a dataset. While reading, it also detects any new feature variables.
      Parameters:
      numFiles - number of CSV files to read in the current batch
      Returns:
      dataset containing the data from the read CSV files
      Throws:
      UnreadDatasetException - if an issue occurs while processing the sequences in the files
      FileNotFoundException - if a file cannot be read
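isDataArriving and readDataset are naturally used together in a loop that consumes the folder batch by batch. The sketch below imitates that pattern with a stand-in class so that it runs without the library. `BatchCursor` and `nextBatch` are hypothetical names, and returning a shorter final batch when fewer than numFiles files remain is an assumption, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the reader's batching logic; hypothetical, for illustration only.
class BatchCursor {
    private final List<String> fileNames;
    private int nextIndex = 0;

    BatchCursor(List<String> sortedFileNames) {
        this.fileNames = new ArrayList<>(sortedFileNames);
    }

    /** True while at least one file has not been handed out yet. */
    boolean isDataArriving() {
        return nextIndex < fileNames.size();
    }

    /** Returns the next numFiles file names, or fewer if the stream is nearly exhausted. */
    List<String> nextBatch(int numFiles) {
        if (numFiles <= 0) {
            throw new IllegalArgumentException("numFiles must be positive");
        }
        int end = Math.min(nextIndex + numFiles, fileNames.size());
        List<String> batch = new ArrayList<>(fileNames.subList(nextIndex, end));
        nextIndex = end;
        return batch;
    }

    /** Drains the cursor and returns the size of every batch that was read. */
    static List<Integer> drain(BatchCursor cursor, int numFiles) {
        List<Integer> batchSizes = new ArrayList<>();
        while (cursor.isDataArriving()) {
            batchSizes.add(cursor.nextBatch(numFiles).size());
        }
        return batchSizes;
    }
}
```

With the documented class, the equivalent loop would call `reader.readDataset(numFiles)` while `reader.isDataArriving()` returns true, handling `UnreadDatasetException` and `FileNotFoundException` as declared.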
    • detectNewFeatureVariables

      public List<Map.Entry<String,Integer>> detectNewFeatureVariables(File csvFile) throws FileNotFoundException
      Extracts the names of the variables from a CSV file and reports the new feature variables among them. It is assumed that the names are in the first row.
      Parameters:
      csvFile - CSV file
      Returns:
      list of entries, where each entry contains the name of a new feature variable and its index
      Throws:
      FileNotFoundException - if the CSV file was not found
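One way to picture the header scan is shown below: read the first row, split it into names, and keep each name that is not already known together with its column index. This is a sketch under stated assumptions rather than the library's code: `HeaderScan`, `findNewVariables` and the `knownVariables` parameter are hypothetical, the delimiter is assumed to be a comma, and quoted fields are not handled.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the documented API.
class HeaderScan {

    /** Returns (name, column index) for every header name not in knownVariables. */
    static List<Map.Entry<String, Integer>> findNewVariables(File csvFile, Set<String> knownVariables)
            throws FileNotFoundException {
        List<Map.Entry<String, Integer>> newVariables = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(csvFile))) {
            String header = reader.readLine();
            if (header == null) {
                return newVariables; // empty file: nothing to report
            }
            // Assumption: comma-separated header without quoted fields
            String[] names = header.split(",");
            for (int i = 0; i < names.length; i++) {
                String name = names[i].trim();
                if (!knownVariables.contains(name)) {
                    newVariables.add(new AbstractMap.SimpleEntry<String, Integer>(name, i));
                }
            }
        } catch (FileNotFoundException e) {
            throw e; // keep the documented exception type for a missing file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return newVariables;
    }
}
```

In this sketch the caller would pass the time variable, the class variables and the feature variables seen so far as `knownVariables`, so only unseen columns are returned.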