Python: find element from array in pandas DataFrame
I have a CSV file (the prefix can be any string):
prefix,path
pref1,path1
pref2,path2
and these files:
pref1_file.txt
pref2_file.txt
pref3_file.txt
I want to get the path of each file based on its prefix.
Expected result for this example:
pref1_file.txt : path1
pref2_file.txt : path2
pref3_file.txt : path_not_found
Here is my code :
import os
import pandas as pd

dirName = 'C:\\Users\\TEST\\Desktop\\Test'
# get all files in all folders
listOfFiles = list()
for (dirpath, dirnames, filenames) in os.walk(dirName):
    listOfFiles += [os.path.join(dirpath, file) for file in filenames]
# note: dir_path is not defined in this snippet
df = pd.read_csv(dir_path + 'file.csv')
for elem in listOfFiles:
    file_name = os.path.basename(elem)
    for index, row in df.iterrows():
        if file_name.startswith(row['prefix']):
            # the CSV shown above has the columns prefix and path
            print(file_name + ":" + row['path'])
        else:
            print(file_name + ":" + "path_not_found")
It works, but only without the else condition (I need to display "path_not_found" if the prefix is not found in the CSV file).
Thanks
3 answers
-
answered 2022-01-19 16:46
Vivek Kalyanarangan
Use:
dict(zip(files, pd.Series(files).str.split('_').str[0].map(df1.set_index('prefix')['path']).fillna('path_not_found')))
Output
{'pref1_file.txt': 'path1', 'pref2_file.txt': 'path2', 'pref3_file.txt': 'path_not_found'}
Here, files is listOfFiles in your data.
Explanation
- Convert files to pd.Series
- Split by _ and take the first part
- Use pandas map to get the path
- Convert to dict
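The steps above can be sketched one at a time. This is only an illustration under stated assumptions: the in-memory CSV stands in for file.csv, df1 is taken to be the DataFrame read from it, files holds bare file names rather than full paths, and the intermediate names (names, prefixes, lookup, paths) are made up for readability.

```python
import io
import pandas as pd

# Sample data from the question; the in-memory CSV stands in for file.csv (assumption).
csv_text = "prefix,path\npref1,path1\npref2,path2\n"
df1 = pd.read_csv(io.StringIO(csv_text))
files = ["pref1_file.txt", "pref2_file.txt", "pref3_file.txt"]

# Step 1: wrap the file names in a Series so the .str accessor is available.
names = pd.Series(files)

# Step 2: keep the text before the first underscore as the candidate prefix.
prefixes = names.str.split("_").str[0]

# Step 3: build a prefix -> path lookup Series and map each prefix through it.
lookup = df1.set_index("prefix")["path"]
paths = prefixes.map(lookup)  # prefixes missing from the lookup become NaN

# Step 4: replace the NaN entries and pair each file name with its path.
result = dict(zip(files, paths.fillna("path_not_found")))
print(result)
```

Two caveats seem worth keeping in mind: splitting on _ only works if the prefixes themselves contain no underscore, and if listOfFiles holds full paths, the base names would need to be extracted first (for example with os.path.basename).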
-
answered 2022-01-19 17:06
Henry
Try this:
dirName = 'C:\\Users\\TEST\\Desktop\\Test'
# get all files in all folders
listOfFiles = list()
for (dirpath, dirnames, filenames) in os.walk(dirName):
    listOfFiles += [os.path.join(dirpath, file) for file in filenames]
df = pd.read_csv(dir_path + 'file.csv')
for elem in listOfFiles:
    file_name = os.path.basename(elem)
    df_prefix = df[df['prefix'].apply(lambda x: file_name.startswith(x))]
    if df_prefix.size > 0:
        print(df_prefix['prefix'].loc[0] + ":" + file_name)
    else:
        print(file_name + ": Not found")
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#selection-by-callable
-
answered 2022-01-20 08:30
Med_siraj
To complete @Henry's solution:
df_prefix = df[df['prefix'].apply(lambda x: file_name.startswith(x))]
if df_prefix.size > 0:
    print(file_name + " : " + df_prefix['path'].iloc[0])
else:
    print(file_name + ": path_not_found")
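For reference, the same lookup can be wrapped in a small function so it can be tried without a directory walk. This is a rough sketch, not a definitive version: find_path and its default parameter are hypothetical names, and the in-memory CSV stands in for file.csv.

```python
import io
import pandas as pd

def find_path(file_name, df, default="path_not_found"):
    """Return the path of the first row whose prefix starts file_name."""
    # Boolean mask: True for every row whose prefix matches the start of the name.
    mask = df["prefix"].apply(lambda prefix: file_name.startswith(prefix))
    matches = df.loc[mask, "path"]
    # .iloc[0] picks the first match by position, whatever its index label is.
    return matches.iloc[0] if not matches.empty else default

# Sample data from the question (assumption: same columns as file.csv).
df = pd.read_csv(io.StringIO("prefix,path\npref1,path1\npref2,path2\n"))
for file_name in ["pref1_file.txt", "pref2_file.txt", "pref3_file.txt"]:
    print(file_name + " : " + find_path(file_name, df))
```

Using .iloc[0] rather than .loc[0] matters here: the filtered frame keeps its original index labels, so a match on the second CSV row has label 1 and a lookup by label 0 would raise a KeyError.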