Regex to find strings containing substring, but not ending on same substring

I'm trying to write a regex that checks if a string contains the substring "ing", but it must not end with "ing".

So the word sing should not match, but singer should.

I think I have figured out how to make sure that the string does not end with ing. For that I'm using:

(!<?(ing))$

But I can't seem to get it to work when I want the word to contain "ing" as well. I was thinking something like

(\w+(ing))(!<?(ing))$

But that does not work. Every solution of mine that sort of works also takes in more than one word. Given the input singer crafting, it should match singer but not crafting.

3 answers

  • answered 2020-09-24 15:46 Tim Biegeleisen

    You may use the pattern:

    ing(?=\w)
    

    This matches only where ing is followed by at least one more word character. Here is an example:

    import re

    inp = 'singer'
    if re.search(r'ing(?=\w)', inp):
        print('singer is a MATCH')
    
    inp = 'sing'
    if re.search(r'ing(?=\w)', inp):
        print('sing is a MATCH')
    

    This prints:

    singer is a MATCH
    

    Edit:

    To match entire words containing a non-terminal ing, I suggest using re.findall:

    inp = "Madonna is a singer who likes to sing."
    matches = re.findall(r'\b\w*ing\w+\b', inp)
    print(matches)    # prints ['singer']
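
    As a quick check against the inputs from the question, here is a minimal sketch that runs the same pattern on "singer crafting". The second call is an assumption about what the asker wants: a word such as "singing" has an inner ing followed by more characters, so this pattern still matches it even though the word ends with ing.

```python
import re

# Same pattern as above: a word containing "ing" followed by at least one more word character.
pattern = re.compile(r'\b\w*ing\w+\b')

# The asker's example: only "singer" is picked out, "crafting" is skipped.
print(pattern.findall("singer crafting"))   # ['singer']

# Edge case: "singing" ends with "ing" but is still matched,
# because its first "ing" is followed by more word characters.
print(pattern.findall("singing"))           # ['singing']
```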
    

  • answered 2020-09-24 15:50 The fourth bird

    If the word can not end with ing but must contain ing:

    \b\w*ing(?!\w*ing\b)\w+
    

    Explanation

    • \b A word boundary
    • \w* Match 0+ word characters
    • ing Match the required ing
    • (?!\w*ing\b) Negative lookahead, assert the ing is not at the end of the word
    • \w+ Match 1+ word chars so that there must be at least a single char following
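
    To see what the lookahead contributes, the sketch below compares the pattern with and without it. The names with_lookahead and without_lookahead are only for illustration.

```python
import re

# The pattern explained above, and the same pattern with the lookahead removed.
with_lookahead = re.compile(r'\b\w*ing(?!\w*ing\b)\w+')
without_lookahead = re.compile(r'\b\w*ing\w+')

for word in ["singer", "singing", "sing"]:
    # Print the word, then whether each variant finds a match in it.
    print(word,
          bool(with_lookahead.search(word)),
          bool(without_lookahead.search(word)))

# singer True True
# singing False True
# sing False False
```

    Only "singing" differs: without the lookahead, its first ing followed by more characters is enough to match, even though the word ends with ing.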

    For example

    import re
    
    items = ["singer","singing","ing","This is a ing testing singalong"]
    pattern = r"\b\w*ing(?!\w*ing\b)\w+\b"
    
    for item in items:
        result = re.findall(pattern, item)
        if result:
            print(result)
    

    Output

    ['singer']
    ['singalong']
    

  • answered 2020-09-24 15:54 windstorm

    You can use this pattern:

    import re
    
    pattern = re.compile(r'\w*ing\w+')
    print(pattern.match('sing'))  # No match
    print(pattern.match('singer')) # Match
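
    Note that match() only tries the pattern at the start of the string. If the word may appear later in the input, as in the asker's multi-word case, search() or findall() is probably the better fit. A small sketch, with the input 'a singer' chosen only for illustration:

```python
import re

pattern = re.compile(r'\w*ing\w+')

# match() only attempts a match at position 0, so a later word is missed.
print(pattern.match('a singer'))            # None

# search() scans the whole string for the first match.
print(pattern.search('a singer').group())   # singer
```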