Module es.upm.fi.cig.multictbnc
Class DataStreamMultipleCSVReader
java.lang.Object
es.upm.fi.cig.multictbnc.data.reader.DataStreamMultipleCSVReader
The class is designed for reading and processing streaming data from multiple CSV files.
-
Constructor Summary
Constructor: DataStreamMultipleCSVReader(String datasetFolder, String nameTimeVariable, List<String> nameClassVariables)
Description: This constructor prepares the reader to process CSV files from the specified folder.
-
Method Summary
Modifier and Type, Method: Description
List<Map.Entry<String,Integer>> detectNewFeatureVariables(File csvFile): Extracts the names of the variables from a CSV file.
boolean isDataArriving(): Checks if there is more data to be read.
readDataset(int numFiles): Reads a specified number of CSV files from the dataset folder and processes them into a dataset.
-
Constructor Details
-
DataStreamMultipleCSVReader
public DataStreamMultipleCSVReader(String datasetFolder, String nameTimeVariable, List<String> nameClassVariables) throws UnreadDatasetException, FileNotFoundException
This constructor prepares the reader to process CSV files from the specified folder. The constructor also sorts the CSV files by name and checks for their existence.
Parameters:
datasetFolder - path to the folder containing the CSV files
nameTimeVariable - name of the time variable
nameClassVariables - list of names of class variables
Throws:
UnreadDatasetException - if the datasets could not be read because no CSV files were found in the specified folder
FileNotFoundException - if a file cannot be read
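The preparation step described for the constructor (collect the CSV files of a folder, sort them by name, fail if none exist) can be sketched as follows. This is a minimal illustration and not the library's implementation: the class CsvFolderListingSketch and the method listCsvFilesSortedByName are hypothetical names, and FileNotFoundException stands in for UnreadDatasetException so that the example compiles without the library.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not part of es.upm.fi.cig.multictbnc.
class CsvFolderListingSketch {

    // Returns the CSV files found in the folder, sorted by file name.
    // Assumption: a missing or empty folder is reported with FileNotFoundException
    // (the real reader is documented to throw UnreadDatasetException instead).
    static List<File> listCsvFilesSortedByName(String datasetFolder) throws FileNotFoundException {
        File folder = new File(datasetFolder);
        // listFiles returns null when the path is not a readable directory
        File[] found = folder.listFiles((dir, name) -> name.toLowerCase().endsWith(".csv"));
        if (found == null || found.length == 0) {
            throw new FileNotFoundException("No CSV files found in " + datasetFolder);
        }
        List<File> files = new ArrayList<>(Arrays.asList(found));
        files.sort(Comparator.comparing(File::getName));
        return files;
    }

    public static void main(String[] args) throws Exception {
        // Prints the sorted CSV file names of the folder given as first argument
        if (args.length > 0) {
            for (File file : listCsvFilesSortedByName(args[0])) {
                System.out.println(file.getName());
            }
        }
    }
}
```

Sorting by name gives a deterministic reading order, which matters when the files are meant to be consumed as a stream of batches.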
-
-
Method Details
-
isDataArriving
public boolean isDataArriving()
Checks if there is more data to be read.
Returns:
true if data are arriving from the data stream, false otherwise.
-
readDataset
Reads a specified number of CSV files from the dataset folder and processes them into a dataset. It detects any new feature variables.
Parameters:
numFiles - number of CSV files to read in the current batch
Returns:
dataset containing the data from the read CSV files
Throws:
UnreadDatasetException - if an issue occurs while processing the sequences in the files
FileNotFoundException - if a file cannot be read
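The way isDataArriving and readDataset are presumably used together (poll for pending data, then read the next batch of files) can be illustrated with a small stand-in. This is only a sketch under stated assumptions: BatchCursorSketch and readBatch are hypothetical names, file names replace the actual parsed datasets, and the internal bookkeeping of the real reader is not documented on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mimics batch-wise consumption of sorted files.
class BatchCursorSketch {
    private final List<String> fileNames; // stand-in for the sorted CSV files
    private int nextIndex = 0;            // position of the first unread file

    BatchCursorSketch(List<String> fileNames) {
        this.fileNames = new ArrayList<>(fileNames);
    }

    // True while at least one file has not been read yet
    boolean isDataArriving() {
        return nextIndex < fileNames.size();
    }

    // Returns up to numFiles unread file names and advances the cursor
    List<String> readBatch(int numFiles) {
        int end = Math.min(nextIndex + Math.max(numFiles, 0), fileNames.size());
        List<String> batch = new ArrayList<>(fileNames.subList(nextIndex, end));
        nextIndex = end;
        return batch;
    }

    public static void main(String[] args) {
        BatchCursorSketch cursor = new BatchCursorSketch(Arrays.asList("s1.csv", "s2.csv", "s3.csv"));
        // Consume the stream in batches of two files
        while (cursor.isDataArriving()) {
            System.out.println(cursor.readBatch(2));
        }
    }
}
```

In this stand-in the last batch may be smaller than numFiles, and a batch size of zero would never advance the cursor, so a caller looping on isDataArriving should pass a positive value.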
-
detectNewFeatureVariables
public List<Map.Entry<String,Integer>> detectNewFeatureVariables(File csvFile) throws FileNotFoundException
Extracts the names of the variables from a CSV file. It is assumed that the names are in the first row.
Parameters:
csvFile - CSV file
Returns:
list of entries, where each entry contains the name of a new feature variable and its index
Throws:
FileNotFoundException - if the CSV file was not found
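One plausible reading of this method is that the header row is compared against the variables already known to the reader, and each unseen name is returned with its column position. The sketch below follows that reading and is not the library's code: HeaderDetectionSketch, detectNewNames and the knownNames parameter are hypothetical, the index is assumed to be the zero-based column position, and the header is split on plain commas without support for quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the real method takes a File and keeps its own state.
class HeaderDetectionSketch {

    // Reads the first row and returns (name, column index) for names not in knownNames
    static List<Map.Entry<String, Integer>> detectNewNames(BufferedReader reader, Set<String> knownNames)
            throws IOException {
        List<Map.Entry<String, Integer>> newVariables = new ArrayList<>();
        String header = reader.readLine();
        if (header == null) {
            return newVariables; // empty input: nothing to detect
        }
        // Assumption: simple comma-separated header without quoted fields
        String[] names = header.split(",");
        for (int i = 0; i < names.length; i++) {
            String name = names[i].trim();
            if (!knownNames.contains(name)) {
                newVariables.add(new AbstractMap.SimpleEntry<>(name, i));
            }
        }
        return newVariables;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("t,X1,C,X2\n0.0,a,b,c\n"));
        Set<String> known = new HashSet<>(Arrays.asList("t", "X1", "C"));
        // Only X2 (column 3) is unknown in this example
        System.out.println(detectNewNames(reader, known));
    }
}
```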
-