[SOLVED] Splitting a list of strings based on substring with variable character

Issue

This content is from Stack Overflow. Question asked by Woodywoodleg.

I have the following list of strings:

my_list = ['2022-09-18 1234 name O0A raw.txt',
'2022-09-18 1234 name O0P raw.txt',
'2022-09-18 1234 name O1A raw.txt',
'2022-09-18 1234 name O1P raw.txt',
'2022-09-18 1234 name O2A raw.txt',
'2022-09-18 1234 name O2P raw.txt',
'2022-09-18 1234 name O3A raw.txt',
'2022-09-18 1234 name O3P raw.txt',
'2022-09-18 1234 name O4A raw.txt',
'2022-09-18 1234 name O4P raw.txt',
'2022-09-18 1234 name O5A raw.txt',
'2022-09-18 1234 name O5P raw.txt',
'2022-09-18 1234 name M0A raw.txt',
'2022-09-18 1234 name M0P raw.txt',
...
'2022-09-18 1234 name M5P raw.txt']

I want to split this into a new list containing let’s say all “O?A”, so

my_list_split = ['2022-09-18 1234 name O0A raw.txt',
'2022-09-18 1234 name O1A raw.txt',
'2022-09-18 1234 name O2A raw.txt',
'2022-09-18 1234 name O3A raw.txt',
'2022-09-18 1234 name O4A raw.txt',
'2022-09-18 1234 name O5A raw.txt',]

Based on previous posts on string list substring splitting, it seems the fastest way to do this is by

[s for s in my_list if ' O?A raw' in s]

but this returns an empty list. I guess there is some syntax that I am missing?

Thank you.



Solution

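The in operator only tests for a literal substring, so ‘ O?A raw’ matches only strings that contain an actual question mark, which is why you get an empty list. What you want is a regular expression matching ‘ O.A raw’, where ‘.’ stands for any single character: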

import re

# ... the lists ...

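# '.' matches any single character, so this keeps O0A, O1A, ..., O5A.
# re.search scans the whole string, so no leading or trailing ".+" is needed.
lst = [s for s in my_list if re.search(r" O.A raw", s)]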
print(lst)
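
For reference, here is a self-contained sketch of the same idea. The sample list is rebuilt from the naming scheme shown in the question (an assumption, since the original list is partly elided), and the names pattern and glob_split are only illustrative. It uses \d instead of . on the assumption that the variable character is always a digit, and it also shows fnmatchcase from the standard library, where ? really does mean "any single character", which may be closer to what the question's ' O?A raw' was meant to express.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical sample data following the naming scheme in the question:
# O0A, O0P, O1A, ..., O5P, M0A, ..., M5P.
my_list = [
    f"2022-09-18 1234 name {group}{n}{kind} raw.txt"
    for group in "OM"
    for n in range(6)
    for kind in "AP"
]

# Regular expression: \d matches exactly one digit between "O" and "A".
pattern = re.compile(r" O\dA raw")
my_list_split = [s for s in my_list if pattern.search(s)]

# Shell-style wildcards: "?" is one character, "*" is any run of characters.
# fnmatchcase matches the whole string, hence the leading and trailing "*".
glob_split = [s for s in my_list if fnmatchcase(s, "* O?A raw*")]

print(my_list_split)
```

Both approaches select the same six "O?A" entries here; the regular expression is the stricter of the two because it only accepts a digit in the variable position.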


This question was asked on Stack Overflow by Woodywoodleg and answered by Michael M. It is licensed under the terms of CC BY-SA 2.5 - CC BY-SA 3.0 - CC BY-SA 4.0.
