Using regex in a Python if statement

I am trying to figure out how to incorporate a regex into a Python if statement. I have a pandas DataFrame whose rows I am iterating over, and I want to perform an action every time a row contains a specific pattern of text. The regex should match any 7-character string that begins with a capital letter followed by 6 digits (e.g. R142389).

        for index, row in df1.iterrows():
             if row[4] == REGEX HERE:
                  Perform Action

Am I going about this the right way? Any help would be greatly appreciated!

2 answers

  • answered 2018-01-11 21:18 user3483203

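    Yes, you can do this. Use match, which only matches at the beginning of the string it is applied to; use search if you need to find the pattern anywhere in the string. Because the pattern below is anchored with ^ and $, the whole string has to fit the pattern, not just its beginning.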

    A bit of explanation about the regex:

    ^ asserts position at start of the string

    [A-Z] matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)

    \d{6} matches a digit exactly 6 times (\d is like [0-9], although for str patterns in Python 3 it also accepts other Unicode decimal digits)

    $ asserts position at the end of the string, or before the line terminator right at the end of the string

    import re
    
    # Raw string so that \d reaches the regex engine as a regex escape, not a string escape
    regex = re.compile(r'^[A-Z]\d{6}$')
    
    possibles = ['R142389', 'hello', 'J123456']
    
    for line in possibles:
        if regex.match(line):
            print(line)
    

    Output:

    R142389
    J123456
    
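Applied to the loop in the question, the same check can be wrapped in a small helper. This is only a minimal sketch, not the answerer's code: the names `CODE_RE` and `is_code` and the sample rows are hypothetical stand-ins for the values that `df1.iterrows()` would yield. It uses `fullmatch`, so the `^` and `$` anchors can be dropped and a value with a trailing newline is rejected, and `[0-9]` instead of `\d` so that only ASCII digits are accepted.

```python
import re

# One capital letter followed by exactly six ASCII digits.
# fullmatch requires the whole string to match, so ^ and $ are not needed.
CODE_RE = re.compile(r'[A-Z][0-9]{6}')

def is_code(value):
    """Return True if value is a string such as 'R142389'."""
    # Cells can hold non-strings (None, NaN, ints); treat those as no match.
    return isinstance(value, str) and CODE_RE.fullmatch(value) is not None

# Hypothetical rows standing in for the rows of the DataFrame.
rows = [
    ['a', 'b', 'c', 'd', 'R142389'],
    ['a', 'b', 'c', 'd', 'hello'],
    ['a', 'b', 'c', 'd', 'R142389\n'],
    ['a', 'b', 'c', 'd', None],
]

# Keep only the rows whose fifth value passes the check.
matched = [row[4] for row in rows if is_code(row[4])]
print(matched)
```

Inside the original loop the condition would then read `if is_code(row[4]):`, with the action in the body of the if statement.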

  • answered 2018-01-11 21:23 Marcelo Villa

    I would use the re module

    import re
    
    re.search(pattern, string, flags=0)
    

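    where pattern is the regular expression to be matched, string is the string to be searched, and flags are optional modifiers. This function returns a match object when the pattern is found and None when there is no match, so its result can be used directly as the condition of an if statement.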
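As a rough illustration of that return value, the sketch below uses `re.search` as an if condition and contrasts it with `re.match`. The sample strings and variable names are made up for the example.

```python
import re

pattern = r'[A-Z]\d{6}'
text = 'order R142389 shipped'

# search scans the whole string and returns a match object or None.
found = re.search(pattern, text)
if found:
    print(found.group())

# match only tries at position 0, so it returns None for this text.
print(re.match(pattern, text))

# No capital letter followed by six digits anywhere: search returns None.
print(re.search(pattern, 'no code here'))
```

Note that an unanchored pattern like this one also finds the code inside a longer string. For the question's requirement that the whole value be exactly 7 characters, the pattern needs the `^` and `$` anchors, or `re.fullmatch` can be used instead.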

    Here is the re documentation: https://docs.python.org/2/library/re.html

    And here is an example of the implementation: https://www.tutorialspoint.com/python/python_reg_expressions.htm