How do I quickly check whether a string contains concatenated English words?
Is there a fast way to check whether a string without spaces contains a valid (English) sentence?
An example
In the example below I've encrypted the sentence "thisisavalidenglishsentence" using a simple shift cipher with an unknown key/shift. The resulting ciphertext is "aopzpzhchspklunspzozlualujl". Decrypting it with every possible key gives:
key = 0, plaintext = aopzpzhchspklunspzozlualujl
key = 1, plaintext = znoyoygbgrojktmroynyktzktik
key = 2, plaintext = ymnxnxfafqnijslqnxmxjsyjshj
key = 3, plaintext = xlmwmwezepmhirkpmwlwirxirgi
key = 4, plaintext = wklvlvdydolghqjolvkvhqwhqfh
key = 5, plaintext = vjkukucxcnkfgpinkujugpvgpeg
key = 6, plaintext = uijtjtbwbmjefohmjtitfoufodf
key = 7, plaintext = thisisavalidenglishsentence <- !!
key = 8, plaintext = sghrhrzuzkhcdmfkhrgrdmsdmbd
key = 9, plaintext = rfgqgqytyjgbclejgqfqclrclac
key = 10, plaintext = qefpfpxsxifabkdifpepbkqbkzb
key = 11, plaintext = pdeoeowrwhezajcheodoajpajya
key = 12, plaintext = ocdndnvqvgdyzibgdncnziozixz
key = 13, plaintext = nbcmcmupufcxyhafcmbmyhnyhwy
key = 14, plaintext = mablbltotebwxgzeblalxgmxgvx
key = 15, plaintext = lzakaksnsdavwfydakzkwflwfuw
key = 16, plaintext = kyzjzjrmrczuvexczjyjvekvetv
key = 17, plaintext = jxyiyiqlqbytudwbyixiudjudsu
key = 18, plaintext = iwxhxhpkpaxstcvaxhwhtcitcrt
key = 19, plaintext = hvwgwgojozwrsbuzwgvgsbhsbqs
key = 20, plaintext = guvfvfninyvqratyvfufragrapr
key = 21, plaintext = ftueuemhmxupqzsxueteqzfqzoq
key = 22, plaintext = estdtdlglwtopyrwtdsdpyepynp
key = 23, plaintext = drscsckfkvsnoxqvscrcoxdoxmo
key = 24, plaintext = cqrbrbjejurmnwpurbqbnwcnwln
key = 25, plaintext = bpqaqaiditqlmvotqapamvbmvkm
If I brute-force the decryption of this ciphertext, I can still pick out the correct plaintext by going through all the candidates manually (in this case, key = 7). Usually, however, the number of candidate decryptions is much, much larger than the 26 in this example.
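For reference, this is roughly how I generate the candidates. It is a minimal sketch that assumes the text contains only lowercase a-z; the names shift_decrypt and all_candidates are just my own helpers.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_decrypt(ciphertext: str, key: int) -> str:
    """Shift every letter back by `key` positions, wrapping around at 'a'.

    Assumes `ciphertext` contains only lowercase a-z.
    """
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - key) % 26] for ch in ciphertext
    )


def all_candidates(ciphertext: str) -> list[tuple[int, str]]:
    """Return (key, plaintext) for each of the 26 possible shifts."""
    return [(key, shift_decrypt(ciphertext, key)) for key in range(26)]


if __name__ == "__main__":
    for key, plaintext in all_candidates("aopzpzhchspklunspzozlualujl"):
        print(f"key = {key}, plaintext = {plaintext}")
```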
Possible methods
The ways I know of to detect a valid sentence are to check whether the letter frequency matches that of English, or to split the string into n-grams and compare every possible split against a dictionary of known English words.
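To make the letter-frequency idea concrete, here is a minimal sketch of what I mean. It scores each candidate with a chi-squared statistic against approximate English letter frequencies (the percentages in the table are rounded, commonly quoted values, not something I measured), and it assumes lowercase a-z only. The function names are my own.

```python
from collections import Counter

# Approximate relative frequencies (percent) of letters in English text.
ENGLISH_FREQ = {
    "a": 8.17, "b": 1.49, "c": 2.78, "d": 4.25, "e": 12.70, "f": 2.23,
    "g": 2.02, "h": 6.09, "i": 6.97, "j": 0.15, "k": 0.77, "l": 4.03,
    "m": 2.41, "n": 6.75, "o": 7.51, "p": 1.93, "q": 0.10, "r": 5.99,
    "s": 6.33, "t": 9.06, "u": 2.76, "v": 0.98, "w": 2.36, "x": 0.15,
    "y": 1.97, "z": 0.07,
}


def shift_decrypt(ciphertext, key):
    """Shift every lowercase letter back by `key` positions (mod 26)."""
    return "".join(
        chr((ord(ch) - ord("a") - key) % 26 + ord("a")) for ch in ciphertext
    )


def chi_squared(text):
    """Lower score = letter counts closer to what English would predict."""
    counts = Counter(text)
    total = len(text)
    score = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = total * percent / 100
        score += (counts[letter] - expected) ** 2 / expected
    return score


def rank_keys(ciphertext):
    """Return all 26 keys, best (lowest chi-squared) first."""
    return sorted(
        range(26), key=lambda k: chi_squared(shift_decrypt(ciphertext, k))
    )
```

On the example ciphertext rank_keys puts key 7 first, but with only 27 letters a single rare letter can dominate the score, so I would not expect this to hold up in general.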
The problem with the first approach is that letter frequency isn't reliable for short sentences. It might be fine with only 26 possibilities, but with thousands of candidates it becomes less helpful. The second approach, splitting the text into n-grams and comparing them against a dictionary, takes a very long time when there are many possible decryptions.
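The dictionary comparison I have in mind looks roughly like the sketch below: every substring of a candidate is looked up in a set of words. The word list here is a tiny hypothetical stand-in for a real dictionary, and dictionary_hits is my own helper name. The nested loop is quadratic in the length of the string, and it has to run once per candidate, which is where the time goes.

```python
# Hypothetical, tiny word list for illustration only;
# a real run would load a full English dictionary into the set.
WORDS = {"this", "is", "a", "valid", "english", "sentence"}


def dictionary_hits(text, words):
    """Return (start_index, word) for every substring of `text` found in `words`.

    Checks all O(n^2) substrings, so it gets slow when it has to be
    repeated for thousands of candidate plaintexts.
    """
    hits = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in words:
                hits.append((start, piece))
    return hits


if __name__ == "__main__":
    for candidate in ("thisisavalidenglishsentence",
                      "znoyoygbgrojktmroynyktzktik"):
        print(candidate, len(dictionary_hits(candidate, WORDS)))
```

Counting hits also rewards accidental matches (for example the "is" inside "english"), so the raw count is only a rough signal.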
Question
Is there an alternative approach to quickly find whether a string contains valid (English) words? Or a fast way to compare a sentence of concatenated words against an (English) dictionary?