Advanced Search and Replace in Dreamweaver, Vim, Sed and BBedit using regular expressions

April 21, 12:12 PM · Open Source Examiner · Andrey Samode

Find and replace complex patterns on Linux, Mac or Windows using regex. We'll look at some examples, from very easy to quite complex. We'll use Dreamweaver, BBedit, Vim and Sed.

Back in my innocent early days as a web developer, when I was just starting to dabble in PHP, I had no idea about regular expressions. Then one function changed everything: preg_replace(). I was so excited when I discovered it. I used it for everything: pulling blogs from other sites, reformatting HTML to add links automatically, removing unwanted tags. In this article, I want to introduce you to regular expressions and show you some useful examples of using them in different programs.

Example #1 (Dreamweaver)

"I've got this drop-down list of names, but it's so long I really don't want to go and change it entry by entry. Can I do a global search and replace?"

Here's what the list looks like:

<option value="http://www.wanttoknow.info/#Baer">Baer, Robert - Case Officer, 21 Years in CIA, Career Intelligence Medal</option>
<option value="http://www.wanttoknow.info/#Bowman">Bowman, Col. Robert - Director of Advanced Space Programs Development under Ford and Carter</option>
<option value="http://www.wanttoknow.info/#Burks">Burks, Fred - State Department Interpreter for Presidents George W. Bush and Bill Clinton</option>
<option value="http://www.wanttoknow.info/#Christison">Christison, William - Director of the CIA's Office of Regional and Political Analysis</option>
<option value="http://www.wanttoknow.info/#Cleland">Cleland, Senator Max - U.S. Senator from Georgia. Member of 9/11 Commission</option>

We want to extract the parts in between the <option> tags. The challenge here is that the <option...> tag is different every time. This is easily accomplished through regular expression search and replace.

Let's use Dreamweaver this time.

  • Paste the list into a new document
  • Open Find and Replace
  • Check the "Use regular expression" box, and search for the following pattern:

<option[^>]*>

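This pattern looks for "<option", then any number of characters that are not ">", followed by a ">". Leave the Replace field empty and choose Replace All to delete every match. Run the search once more with </option> as the search text to clear the closing tags.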

Note that even though we used Dreamweaver here, you can use BBedit or Vim to accomplish the same thing.
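
For readers who want to try the pattern outside an editor, here is a minimal Python sketch of the same replacement. Python's re module, the shortened sample entries and the function name are my own choices for illustration, not part of the Dreamweaver workflow. The sketch adds an optional slash to the pattern so that one pass removes the closing tags as well.

```python
import re

# Two entries from the list above, shortened for the example.
html = """<option value="http://www.wanttoknow.info/#Baer">Baer, Robert - Case Officer</option>
<option value="http://www.wanttoknow.info/#Burks">Burks, Fred - State Department Interpreter</option>"""

# "<option", then any characters that are not ">", then ">".
# The optional "/" (an addition for this sketch) also catches </option>.
OPTION_TAG = re.compile(r"</?option[^>]*>")


def strip_option_tags(text):
    """Replace every <option ...> and </option> tag with nothing."""
    return OPTION_TAG.sub("", text)


print(strip_option_tags(html))
```

Because [^>]* stops at the first ">", each match covers exactly one tag and the text between the tags is left alone.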

Example #2 (BBedit)

Now we have a nice list:

Baer, Robert - Case Officer, 21 Years in CIA, Career Intelligence Medal
Bowman, Col. Robert - Director of Advanced Space Programs Development under Ford and Carter
Burks, Fred - State Department Interpreter for Presidents George W. Bush and Bill Clinton
Christison, William - Director of the CIA's Office of Regional and Political Analysis
Cleland, Senator Max - U.S. Senator from Georgia. Member of 9/11 Commission

However, the data is in the reverse of the order we want: we'd like to swap the names with their descriptions. Luckily for us, there is a dash (-) between the two, so we can use it to tell them apart. Doing this by hand for a 2000-entry list would take forever, but thanks to regular expressions and back-references, it will be a breeze.

Let's use BBedit for this.

  • Paste the text into a new BBedit document
  • Go to Search > Find
  • Check "Grep"
  • Enter the following for Find:

(.*) - (.*)

  • And the following for Replace:

\2 - \1

... and voila! Our list looks like this:

Case Officer, 21 Years in CIA, Career Intelligence Medal - Baer, Robert
Director of Advanced Space Programs Development under Ford and Carter - Bowman, Col. Robert
State Department Interpreter for Presidents George W. Bush and Bill Clinton - Burks, Fred
Director of the CIA's Office of Regional and Political Analysis - Christison, William
U.S. Senator from Georgia. Member of 9/11 Commission - Cleland, Senator Max

In this case, we enclose the two patterns in parentheses "()", separated by a dash (-) with a space on either side. Note that if the program you're using doesn't like literal spaces, the regex code \s matches a space (as well as other whitespace such as tabs). The dot (.) stands for any one character, and the asterisk (*) following the dot tells it that the character can be repeated 0 or more times.

Under Replace, we use what is called a back-reference. The contents of the first set of parentheses can be called back with \1, and the contents of the second with \2. The space-dash-space separator in the replacement can be anything you like.
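
The same find and replace pair can be sketched in Python, whose re.sub() accepts \1 and \2 in the replacement string. This is only an illustration under my own assumptions (Python, the function name, a two-line sample), not something BBedit requires.

```python
import re

lines = [
    "Baer, Robert - Case Officer, 21 Years in CIA, Career Intelligence Medal",
    "Burks, Fred - State Department Interpreter for Presidents George W. Bush and Bill Clinton",
]


def swap_around_dash(line):
    """Swap the text before and after the ' - ' separator."""
    # Group 1 is the name, group 2 is the description;
    # the replacement writes them back in the opposite order.
    return re.sub(r"(.*) - (.*)", r"\2 - \1", line)


swapped = [swap_around_dash(line) for line in lines]
for line in swapped:
    print(line)
```

One thing to watch: .* is greedy, so a line containing more than one " - " is split at the last one. None of the sample entries has that problem, but it is worth checking a long list before running Replace All.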

Example #3 (Vim and Sed)

Some projects that I work on have Drupal development servers. It's a good practice not to use the live server for development and experimentation. Sometimes it's great to be able to just run a script, which copies the database and files from the live server onto the mirror development server, thus creating a mirror on which I can play around and try all kinds of experiments. (I won't include the complete script here, but I can post it if there's interest.)

As a part of the process I need to do a database replacement of the URLs from www.site.com to dev.site.com, otherwise all the links will just point to the live site. This is easily accomplished using regex. Since we'll be working in a Linux (bash) environment, we'll take a look at two ways of doing it: using Vim and Sed.

First let's use Vim:

:%s/www\.site\.com/dev.site.com/g

This is what everything stands for:

  • % - the range; % means every line in the file
  • s - the substitute command, which performs the search and replace
  • / - the forward slashes separate the four parts of the operation: command/pattern/replacement/flags
  • \. - since dot (.) is a special character, to actually match the dot as a non-special character it needs to be escaped by a backslash (\)
  • dev.site.com - most special characters don't need to be escaped in the replacement
  • g - the flag "g" stands for "global", which means that every occurrence on a line is replaced, not just the first one
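
To see why the escaped dot and the "g" flag matter, here is a small Python sketch. It is not Vim syntax: Python's re.sub() and the made-up look-alike string "wwwXsiteYcom" are assumptions chosen only to make the difference visible.

```python
import re

# A single line with two real matches and one look-alike in between.
text = "www.site.com and wwwXsiteYcom and www.site.com"

# Unescaped dots match any character, so the look-alike is replaced too.
loose = re.sub(r"www.site.com", "dev.site.com", text)

# Escaped dots match only a literal ".".
strict = re.sub(r"www\.site\.com", "dev.site.com", text)

# count=1 mimics leaving off the "g" flag: only the first match changes.
first_only = re.sub(r"www\.site\.com", "dev.site.com", text, count=1)

print(loose)
print(strict)
print(first_only)
```

Only the second call behaves like the Vim command above: it leaves the look-alike alone and still replaces both real occurrences.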

This is a very powerful feature of Vim, an already amazing cross-platform (Linux, Mac and Windows) editor.

Now let's take a look at Sed.

Sed is amazing in that it can be used as a part of a bash script. In our backup/mirror solution it's obvious that we want as little human intervention as possible. Whereas Vim is great for custom searches, once we know the set pattern that needs to be replaced every time, Sed can be integrated into the script and do it automatically.

For this example, let's assume that we did a mysql dump of a database into a file called db.sql. We'll now use Sed to automatically replace all the occurrences of "www.site.com" with "dev.site.com".

sed -r "s/www\.site\.com/dev.site.com/g" db.sql > db-dev.sql

 Where:

  • -r enables extended regular expressions (GNU sed); this simple pattern would also work without it
  • quotes "" enclose the command
  • s stands for substitute
  • \. is an escaped period
  • g is the global replace flag
  • db.sql is the input file
  • db-dev.sql is the output file

I use this as a part of my development server migration bash script, and it works wonderfully.
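
If Sed isn't available, the same file-to-file replacement can be approximated in a few lines of Python. This is a rough sketch and not the script I actually use: the function name rewrite_dump, its parameters and the sample SQL row (including the table name links) are hypothetical. It also reads the whole file into memory, whereas Sed streams it line by line, so it is better suited to small dumps.

```python
import re
import tempfile
from pathlib import Path


def rewrite_dump(src, dst, old_host="www.site.com", new_host="dev.site.com"):
    """Copy src to dst, replacing every occurrence of old_host with new_host."""
    # re.escape() adds the backslashes in front of the dots for us.
    pattern = re.compile(re.escape(old_host))
    text = Path(src).read_text(encoding="utf-8")
    # A function replacement inserts new_host verbatim, with no backslash processing.
    Path(dst).write_text(pattern.sub(lambda match: new_host, text), encoding="utf-8")


# Try it on a throwaway file instead of a real dump.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "db.sql"
    dst = Path(tmp) / "db-dev.sql"
    src.write_text("INSERT INTO links VALUES ('http://www.site.com/node/1');\n", encoding="utf-8")
    rewrite_dump(src, dst)
    print(dst.read_text(encoding="utf-8"))
```

As with the Sed command, the input file is left untouched and the result goes to a separate output file.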

Besides these examples, regular expressions (regex or reg-ex) can be found in PHP and almost every other computer language.

Enjoy!

Was this article helpful? Do you have thoughts or questions? Feel free to ask or comment below.

 
