Perl regex: how to use a local file (.txt) for matching

I'm working on a project that requires adding some password authentication using Perl regular expressions.

The only thing I have is a single-line text box; imagine something like this: https://rubular.com/r/9OZvpmtUpP. Because of that I cannot put program code there, only regular expressions (for example: ^abc$).

Also, I can use only one text box, so I have to combine all the expressions into one line.

So far I have completed 2 requirements:

  1. match if at least 3 types of characters are included

(((?=.{4,})((?=.*\d)(?=.*[a-z])(?=.*[A-Z])|(?=.*\d)(?=.*[a-zA-Z])(?=.*[\W_])|(?=.*[a-z])(?=.*[A-Z])(?=.*[\W_])).*))

  2. match if no character occurs too often (4 times max)

(?=^(?:(.)(?!(?:.*?\1){4}))*$)

and I combined them like this:

(((?=.{4,})((?=.*\d)(?=.*[a-z])(?=.*[A-Z])|(?=.*\d)(?=.*[a-zA-Z])(?=.*[\W_])|(?=.*[a-z])(?=.*[A-Z])(?=.*[\W_])).*))(?=^(?:(.)(?!(?:.*?\1){4}))*$)
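
One thing to check when the two patterns are joined: the first pattern opens three capture groups, so in the combined pattern (.) becomes group 4 and \1 refers to the first pattern's outer group instead. Below is a minimal sketch of the effect, assuming the target tool supports Perl's relative backreference \g{-1}; the variable names are hypothetical and only the patterns come from the question.

```perl
use strict;
use warnings;

# Requirement 1 from the question: at least 3 types of characters.
my $three_types = '(((?=.{4,})((?=.*\d)(?=.*[a-z])(?=.*[A-Z])'
                . '|(?=.*\d)(?=.*[a-zA-Z])(?=.*[\W_])'
                . '|(?=.*[a-z])(?=.*[A-Z])(?=.*[\W_])).*))';

# Requirement 2 as posted (\1) and with a relative backreference,
# which always points at the nearest preceding capture group, (.).
my $max_four_posted   = '(?=^(?:(.)(?!(?:.*?\1){4}))*$)';
my $max_four_relative = '(?=^(?:(.)(?!(?:.*?\g{-1}){4}))*$)';

my $posted   = qr/$three_types$max_four_posted/;
my $relative = qr/$three_types$max_four_relative/;

for my $candidate ('Abc1', 'Aaaaaa1b') {
    printf "%-9s posted: %-3s relative: %s\n", $candidate,
        $candidate =~ $posted   ? 'yes' : 'no',
        $candidate =~ $relative ? 'yes' : 'no';
}
```

With the relative backreference, 'Abc1' matches and 'Aaaaaa1b' (five times 'a') does not. With \1 the joined pattern does not match 'Abc1', because \1 is then the empty text captured by the outer group.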

Now I have 2 requirements to go. The first is to read a blacklist of words from a local file (.txt, one word per line) and not to match a password that contains any of them.

For example, path/myText.txt has these 3 words:

    BadWord
    TestingWord
    TestingBlacklist

These words must not be included in a password.

The second requirement is that the password, in addition to the 3 requirements above, must not have a character repetition more than 2 times.

     for example :    Z@2gmacaiooi*77    Match    - a char repeated 2 times
                      982iuionjna%$sd    Match    - a char repeated 0 times
                      88asf$$1233ada4    No match - a char repeated 3 times

It's important that the regex is in this format so I can join everything into one regex line. Thank you.

2 answers

  • answered 2019-04-18 17:28 Abigail

    match if at least 3 types of chars included

    You don't say what the types of characters are. So, I will assume digits, letters, and anything which isn't a digit or letter.

    Matching a digit is easy: /\p{Nd}/ (I will assume Unicode since we're living in 2019).

    Matching a letter is also easy: /\pL/.

    Matching anything which isn't a letter or digit: /[^\pL\p{Nd}]/

    containing many(4 max) of the same characters

    I guess that's a negative: it should not contain the same character 4 times. You can match that as: !/(.)(?:.*\g{1}){3}/

    To combine that, you'd write something like:

    use experimental 'signatures';
    sub is_valid_password ($password) {
        $password =~ /\p{Nd}/       &&
        $password =~ /\pL/          &&
        $password =~ /[^\p{Nd}\pL]/ &&
        $password !~ /(.)(?:.*\g{1}){3}/;
    }
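
As a quick check of how the four conditions interact, here is a minimal sketch that runs the same checks on a few sample strings. It is written without the signatures feature so it is self-contained; the sample passwords are made up for illustration.

```perl
use strict;
use warnings;

# Same four checks as in the answer, with a classic argument list.
sub is_valid_password {
    my ($password) = @_;
    return $password =~ /\p{Nd}/              # at least one digit
        && $password =~ /\pL/                 # at least one letter
        && $password =~ /[^\p{Nd}\pL]/        # at least one other character
        && $password !~ /(.)(?:.*\g{1}){3}/;  # no character occurs 4 times
}

# Hypothetical sample inputs, not taken from the question.
for my $candidate ('abc123!', 'abcdefg', 'a1!a2!a3!a4') {
    printf "%-12s %s\n", $candidate,
        is_valid_password($candidate) ? 'ok' : 'rejected';
}
```

Here 'abc123!' passes, 'abcdefg' fails for lack of a digit, and 'a1!a2!a3!a4' fails because 'a' occurs four times.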
    

    The first one is to read a blacklist from a local file (.txt) words that included [one row one word] and not to match with them

    For that, I'd read in the file, put them in a hash, and simply check against the hash. I only want to read in the file once, even if I call is_valid_password multiple times, so I use a state variable. Now my sub becomes:

    use feature 'state';              # needed for the state variable
    use experimental 'signatures';
    my $file = "path/myText.txt";
    sub is_valid_password ($password) {
        # Read the blacklist once; later calls reuse the hash.
        state $badwords = do {
            open my $fh, "<", $file or die "$file: $!";
            chomp (my @words = <$fh>);
            +{map {$_ => 1} @words}
        };
        $password =~ /\p{Nd}/             &&
        $password =~ /\pL/                &&
        $password =~ /[^\p{Nd}\pL]/       &&
        $password !~ /(.)(?:.*\g{1}){3}/  &&
        !$badwords->{$password};          # exact match against the blacklist
    }
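
The hash lookup rejects a password only when it equals a blacklisted word. If the words must not occur anywhere inside the password, and everything has to end up in one pattern, one option is to generate a negative lookahead from the word list and paste the result into the text box. The following is a rough sketch under those assumptions; the words are hard-coded here instead of being read from the file, matching is assumed to be case-sensitive, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# The three sample words from the question; in practice they would be
# read from the blacklist file, one word per line.
my @badwords = ('BadWord', 'TestingWord', 'TestingBlacklist');

# quotemeta escapes any regex metacharacters a word might contain.
my $alternation = join '|', map { quotemeta } @badwords;

# Negative lookahead: fail if any blacklisted word occurs further on.
my $no_badwords = "(?!.*(?:$alternation))";
print "$no_badwords\n";

# Anchored at the start so the lookahead scans the whole password.
my $re = qr/^$no_badwords/;

for my $candidate ('xxBadWordxx', 'Z@2gmacaiooi*77') {
    printf "%-16s %s\n", $candidate, $candidate =~ $re ? 'ok' : 'rejected';
}
```

For the three sample words the generated fragment is (?!.*(?:BadWord|TestingWord|TestingBlacklist)). The regex itself cannot read a file, so the fragment has to be regenerated whenever the blacklist changes.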
    

    The second requirement is that the password except these 3 req above must not have over 2 times a char repetition

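    This confuses me. How is that different from the earlier restriction on repetitions (other than the amount)? Anyway, changing the {3} in one of the clauses above should do: {2} rejects a character that occurs three times, and removing the quantifier altogether rejects any character that occurs twice.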
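
The requirement can be read in at least two ways, and both can be tried against the three sample passwords from the question. The sketch below is only an illustration: the first pattern follows the quantifier approach used in this answer, while the second reading (counting adjacent doubled characters) is an assumption based on the counts 2, 0 and 3 given in the question. The helper name adjacent_pairs is hypothetical.

```perl
use strict;
use warnings;

# Reading 1: some character occurs 3 or more times in total.
my $occurs_three_times = qr/(.)(?:.*\g{1}){2}/;

# Reading 2 (assumption): count adjacent doubled characters
# such as "oo" or "77".
sub adjacent_pairs {
    my ($string) = @_;
    my $count = () = $string =~ /(.)\g{1}/g;
    return $count;
}

for my $candidate ('Z@2gmacaiooi*77', '982iuionjna%$sd', '88asf$$1233ada4') {
    printf "%-16s occurs>=3: %-3s adjacent pairs: %d\n",
        $candidate,
        $candidate =~ $occurs_three_times ? 'yes' : 'no',
        adjacent_pairs($candidate);
}
```

Both readings accept the first two samples and reject the third, so the samples alone do not settle which one is meant. Under the second reading, a lookahead such as ^(?!(?:.*(.)\g{-1}){3}) would reject three or more doubled characters.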

  • answered 2019-04-19 06:24 abdan

    (((?=.{4,})((?=.*\d)(?=.*[a-z])(?=.*[A-Z])|(?=.*\d)(?=.*[a-zA-Z])(?=.*[\W_])|(?=.*[a-z])(?=.*[A-Z])(?=.*[\W_])).*))
    

    I agree with Abigail: the question simply doesn't make sense as written. I would appreciate it if you rearranged and rewrote your problem neatly.

    ... (?=.*[a-z])(?=.*[A-Z]) ... is that possible? Isn't matching case-sensitive by default, i.e. (?-i) rather than (?i)?