Read multiple *.txt files into a pandas DataFrame with the filename as the first column

Issue

This content is from Stack Overflow. The question was asked by Simon.

Currently I have this code, which only reads one specific txt file and splits it into different columns. Each txt file is stored in the same directory and looks like this:
2a.txt

# Read a single label file with Spark and convert it to a pandas DataFrame
df_2a = spark.read.format('csv').options(header='false').load("/mnt/datasets/model1/train/labels/2a.txt").toPandas()
df_2a.columns = ['Value']
# Split the single space-separated column into five columns
df_2a_split = df_2a['Value'].str.split(' ', n=0, expand=True)
df_2a_split.columns = ['class','c1','c2','c3','c4']
display(df_2a_split)

And the output looks like this:
[Current result]

I want to ingest all .txt files in a directory, including the filename as the first column of the pandas DataFrame. The expected result looks like this:

file_name  class  c1        c2       c3        c4
2a.txt     0      0.712518  0.61525  0.43918   0.2065
2a.txt     1      0.635078  0.81175  0.292786  0.0925
2b.txt     2      0.551273  0.5705   0.30198   0.0922
2b.txt     0      0.550212  0.31125  0.486563  0.2455



Solution

This question is not yet answered; be the first to answer using the comments. Once confirmed, an answer will be published as the solution.
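Until a confirmed answer is posted, here is a minimal sketch of one possible approach, untested against the asker's files: skip Spark and glob the directory with plain pandas, prepending each file's name as the first column. The directory path is copied from the question and the column names mirror the expected output, so both may need adjusting.

import glob
import os

import pandas as pd

# Path taken from the question; adjust to the real mount point if needed
label_dir = "/mnt/datasets/model1/train/labels"

frames = []
for path in sorted(glob.glob(os.path.join(label_dir, "*.txt"))):
    # Each line in a label file is space separated: class c1 c2 c3 c4
    df = pd.read_csv(path, sep=" ", header=None,
                     names=["class", "c1", "c2", "c3", "c4"])
    # Prepend the bare file name as the first column
    df.insert(0, "file_name", os.path.basename(path))
    frames.append(df)

all_labels = pd.concat(frames, ignore_index=True)
display(all_labels)  # use print(all_labels) outside Databricks

If the files have to go through Spark first (as in the original snippet), pyspark.sql.functions.input_file_name() can add the full source path as a column before converting with .toPandas(); the bare file name can then be extracted from that path with pandas string methods.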

This question and answer were collected from Stack Overflow and tested by the JTuto community, and are licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, or CC BY-SA 4.0.
