Reading multiple ARFF dataset files in Python

I'm new to Python and I'm trying to read more than 20 ARFF dataset files. So far I couldn't find a generic way to extract X and y so that I can call train_test_split and other ML functions. Most solutions I've come across extract X and y from one specific ARFF file whose number of columns is known in advance. Here X : {array-like, sparse matrix}, shape (n_samples, n_features), is the matrix containing the data to be sampled, and y : array-like, shape (n_samples,), holds the corresponding label for each sample in X.


The problems are: 1- The data are not read correctly into (X, y). 2- When calling train_test_split, an error arises because it requires a 2D array and it receives a 1D array.

Here is the method I'm using to read the file and store (X,y) in lists:

_Datasets_=['glass.arff'  , 'mushroom.arff' , 
            'autos.arff' , 'car.arff','iris.arff' ,
            'zoo.arff'] #subset of the benchmarks 

Numfile = len(_Datasets_)

for x in range(0, len(_Datasets_)):
    filename = path + _Datasets_[x]
    data = []
    target = []
    reading_data = False
    with open(filename, 'r') as handle:
        for line in handle:
            line = line.strip()

            # Ignore comments and whitespace
            if line.startswith('%%') or not line:
                continue

            # If we have already reached the @data section, we just read indefinitely
            # If @data doesn't come last, this will not work
            if reading_data:
                data.append(line)
                row = line.split()

                data = (row[:-1])    # <---- this doesn't get all data rows
                target = (row[-1])   # <---- this sometimes gets entire rows, not
                                     #       only the last column (the target column)
                continue

            # Otherwise, try parsing the file
            if line.startswith('@attribute'):
                key, value = line.split(' ', 2)[1:]
                attributes[key] = value
                # print(attributes[key])
            elif line.startswith('@data'):
                reading_data = True
            else:
                # raise ValueError('Cannot parse line {!r}'.format(line))
                pass

3- When trying to convert the (data, target) lists to an array and a sparse matrix with:

     a = np.asarray(data)
     X=sparse.csr_matrix(a)
     y = np.asarray(target) 

I get this error.

TypeError: Singleton array array('%',
          dtype='|S1') cannot be considered a valid collection.
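
The message is consistent with target ending up as a single string rather than a list of labels: np.asarray of a plain string gives a 0-dimensional array, which sklearn appears to reject as a "singleton array". The sketch below reproduces that effect without NumPy. The three sample lines are made up for illustration, and the last one stands in for a comment line that slips past a '%%' check.

```python
# Hypothetical lines from an @data section; the last one is a stray comment.
lines = ["1.0,2.0,a", "3.0,4.0,b", "%"]

# Rebinding inside the loop: every pass throws away the previous value.
target = []
for line in lines:
    row = line.split()        # splits on whitespace, so a comma-separated line stays one token
    target = row[-1]
rebinding_result = target     # a single string, not one label per row

# Appending instead keeps one label per data row and skips comment lines.
labels = []
for line in lines:
    if line.startswith('%'):
        continue
    labels.append(line.split(',')[-1])

print(repr(rebinding_result))
print(labels)
```

Because split() without an argument never splits on commas, row[-1] is the whole line, which would also explain why the target sometimes contains entire rows.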

I'd appreciate any help and guidance on how to extract the right data for (X, y), in the right format, to call functions from sklearn and imblearn.under_sampling.
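
For reference, here is a minimal sketch of what a column-count-agnostic reader could look like. It rests on several assumptions that may not hold for every benchmark file: rows are dense and comma-separated, no value contains a quoted comma, and the class label is the last column. parse_arff and the toy file content are hypothetical names and data used only for illustration.

```python
import io


def parse_arff(handle):
    """Return (attribute_names, X, y) from an open ARFF text stream.

    Assumes dense, comma-separated rows with the label in the last column.
    """
    names, X, y = [], [], []
    reading_data = False
    for line in handle:
        line = line.strip()
        # ARFF comments start with a single '%'
        if not line or line.startswith('%'):
            continue
        if reading_data:
            row = [value.strip() for value in line.split(',')]
            X.append(row[:-1])   # append, so earlier rows are kept
            y.append(row[-1])
            continue
        lower = line.lower()     # treat the header keywords case-insensitively
        if lower.startswith('@attribute'):
            names.append(line.split(None, 2)[1])
        elif lower.startswith('@data'):
            reading_data = True
    return names, X, y


# Toy stand-in for one of the dataset files
sample = io.StringIO("""% toy file
@relation toy
@attribute a numeric
@attribute b numeric
@attribute class {yes,no}
@data
1.0,2.0,yes
% a comment inside the data section
3.0,4.0,no
""")
names, X, y = parse_arff(sample)
print(names, X, y)
```

Since X is a list of equal-length rows, np.asarray(X) would be 2D with shape (n_samples, n_features), and y would be 1D. All values are still strings at this point, so numeric columns need converting and nominal columns need encoding before they go to sklearn. scipy.io.arff.loadarff is an existing reader that may remove the need for hand-parsing, depending on which attribute types the files use.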