regex - Regular expression to match a line that doesn't contain a word?
I know it's possible to match a word and then reverse the matches using other tools (e.g. grep -v). However, I'd like to know if it's possible to match lines that don't contain a specific word (e.g. hede) using a regular expression.
Input:

    hoho
    hihi
    haha
    hede

Code:

    grep "<regex for 'doesn't contain hede'>" input

Desired output:

    hoho
    hihi
    haha
The notion that regex doesn't support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:

    ^((?!hede).)*$

The regex above will match any string, or line without a line break, not containing the (sub)string 'hede'. As mentioned, this is not something regex is "good" at (or should do), but still, it is possible.
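As a quick check of the pattern above, here is a minimal sketch using Python's re module. Python is my choice for illustration (the question itself uses grep), the sample lines are the ones from the question, and the helper name lines_without is hypothetical.

```python
import re

# The pattern from the answer: at every position, assert that "hede"
# does not start here, then consume one character.
NO_HEDE = re.compile(r"^((?!hede).)*$")


def lines_without(text):
    """Return the lines of `text` that the pattern matches in full."""
    return [line for line in text.splitlines() if NO_HEDE.match(line)]


sample = "hoho\nhihi\nhaha\nhede"
print(lines_without(sample))  # ['hoho', 'hihi', 'haha']
```

The text is split into lines first, so each line is tested on its own, which mirrors how grep feeds lines to the pattern.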
And if you need to match line break chars as well, use the dot-all modifier (the trailing s in the following pattern):

    /^((?!hede).)*$/s

or use it inline:

    /(?s)^((?!hede).)*$/

(where the /.../ are the regex delimiters, i.e., not part of the pattern)
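To see what the dot-all modifier changes, here is a small sketch in Python, where the modifier is spelled re.DOTALL or the inline (?s). The two sample strings and the helper name matches are made up for illustration.

```python
import re

plain = re.compile(r"^((?!hede).)*$")                    # "." stops at line breaks
dotall_flag = re.compile(r"^((?!hede).)*$", re.DOTALL)   # flag form
dotall_inline = re.compile(r"(?s)^((?!hede).)*$")        # inline form


def matches(pattern, text):
    """True if `pattern` matches `text` from the start."""
    return pattern.match(text) is not None


clean = "hoho\nhihi"      # two lines, no "hede"
tainted = "hoho\nhede"    # "hede" on the second line

print(matches(plain, clean))            # False: "." cannot cross the line break
print(matches(dotall_flag, clean))      # True
print(matches(dotall_inline, clean))    # True
print(matches(dotall_flag, tainted))    # False: "hede" is found ahead
```

With dot-all enabled the pattern tests the whole multi-line string at once, so a single "hede" anywhere rejects the entire input rather than just one line.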
If the dot-all modifier is not available, you can mimic the same behavior with the character class [\s\S]:

    /^((?!hede)[\s\S])*$/

Explanation
A string is just a list of n characters. Before, and after each character, there's an empty string. So a list of n characters will have n+1 empty strings. Consider the string "abhedecd":

    ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
s = │e1│ a │e2│ b │e3│ h │e4│ e │e5│ d │e6│ e │e7│ c │e8│ d │e9│
    └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘

index    0      1      2      3      4      5      6      7

where the e's are the empty strings. The regex (?!hede). looks ahead to see if there's no substring "hede" to be seen, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width-assertions because they don't consume any characters. They only assert/validate something.

So, in my example, every empty string is first validated to see if there's no "hede" up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group, and repeated zero or more times: ((?!hede).)*. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: ^((?!hede).)*$
As you can see, the input "abhedecd" will fail because on e3, the regex (?!hede) fails (there is "hede" up ahead!).
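The walk over the empty strings can be replayed in code. The sketch below, again in Python and with a hypothetical helper name failing_positions, tests the bare look-ahead at every position of the string and reports the ones where it fails, labelled e1 to e9 as in the diagram.

```python
import re


def failing_positions(s, word="hede"):
    """Labels (e1, e2, ...) of the positions where (?!word) fails in `s`."""
    lookahead = re.compile(f"(?!{re.escape(word)})")
    return [
        f"e{pos + 1}"
        for pos in range(len(s) + 1)          # n characters, n+1 empty strings
        if lookahead.match(s, pos) is None    # zero-width test at this position
    ]


print(failing_positions("abhedecd"))  # ['e3']
print(failing_positions("abcd"))      # []

# The group can only repeat up to e3, so the anchored pattern cannot
# reach the end of the input and the whole match fails.
print(re.match(r"^((?!hede).)*$", "abhedecd"))  # None
```

A single failing position is enough: the repetition stops there, and the end anchor then has nothing to match against.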