Multiline Vim Regexps

12 AM August 19, 2004

My current (paid) project involves interpreting messages that have been formatted for humans to read. Of course the format is completely undocumented and needs to be reverse engineered by inspecting a large number of sample messages. I've got all the sample messages sitting in a large text file, which I manipulate with Vim.

This morning, I found myself searching through the file with this command:

/^\nF\d\_.\{-}\_^\n\zs.*/+

It means "Find a blank line followed by a line that starts with F and a digit, then scan forward to the next blank line and select the line after that." This is how it works:

^\n Matches the start of a line, followed by a newline, i.e. a blank line
F\d The next line starts with an F followed by a digit
\_.\{-} '\_.' is like '.', but also matches newline. '\{-}' matches as few as possible of the preceding '\_.'. (If I were to use '*' instead of '\{-}', it would match to near the end of the file.)
\_^\n Matches a blank line. '\_^' is like '^', but works anywhere in a pattern. A plain '^' only means start-of-line at the start of a regular expression (or right after '\|', '\(', '\%(' or '\n'); anywhere else it matches a literal '^'.
\zs When the match is finished, set the start of match to this point. I use this because I don't want the preceding text to be highlighted.
.* Matches the whole line.
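
For anyone more at home with Perl-style regexps, the pieces above translate roughly into Python's re module. This is only a sketch under a few assumptions: the sample text is invented, '[\s\S]*?' stands in for '\_.\{-}', and a capture group stands in for '\zs'. It also runs the greedy variant, to show why '\{-}' matters.

```python
import re

# Hypothetical sample text: two message blocks, each followed by a blank line.
SAMPLE = (
    "header\n"
    "\n"
    "F1 first block\n"
    "detail line\n"
    "\n"
    "line after first block\n"
    "\n"
    "F2 second block\n"
    "\n"
    "line after second block\n"
)

# Rough translation of the Vim pattern, piece by piece:
#   ^\n      -> ^\n        (re.MULTILINE lets ^ match at every line start)
#   F\d      -> F\d
#   \_.\{-}  -> [\s\S]*?   (any character including newline, as few as possible)
#   \_^\n    -> ^\n
#   \zs.*    -> (.*)       (a capture group plays the role of \zs)
LAZY = re.compile(r"^\nF\d[\s\S]*?^\n(.*)", re.MULTILINE)

# Same pattern with a greedy '*' in place of the lazy quantifier.
GREEDY = re.compile(r"^\nF\d[\s\S]*^\n(.*)", re.MULTILINE)

lazy_hits = [m.group(1) for m in LAZY.finditer(SAMPLE)]
greedy_hits = [m.group(1) for m in GREEDY.finditer(SAMPLE)]

print(lazy_hits)    # the line after each block
print(greedy_hits)  # only the line after the last blank line
```

On this sample the lazy pattern picks out the line after each block, while the greedy one swallows everything up to the last blank line and finds only the final line.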

The '+' after the closing '/' is a search offset, short for '+1': it tells Vim to put the cursor on the line after the match. This gets the cursor out of the way so that I can see the highlighted match easily.

By alang
(Posted to Software Development and javablogs)

Comments

At 07:39, 19 Aug 2004 Darren wrote:

Clever stuff, I didn't know about the + at the end of the regex at all, that's well handy.

At 10:53, 25 Apr 2005 swang wrote:

Great stuff and very helpful. Thanks.
But could you tell me where to find the definition of the "\_" underscore thing? I am having difficulty finding it in the manual.

At 00:59, 05 May 2005 Alan Green wrote:

"\_" doesn't mean anything by itself, it's just the start of the "\_." and "\_^" escapes.

© 2003-2006 Alan Green