Over the past few weeks I've been helping a friend investigate the contents of a PC we're working on, in order to troubleshoot some weird problems. It's actually quite fun to do some in-depth examinations of the contents of a hard disk, but you soon discover that such tasks require some specific software tools.
A couple of days ago, for example, I needed to look through the contents of some binary files which mostly contained garbage characters, but in which there were also occasional occurrences of readable words and sentences. The files were actually the hibernation files that Windows uses when it goes to sleep. Before nodding off, the PC dumps the contents of its memory to a hibernation file, and I needed to know what was in that file.
To separate the garbage characters from the readable ones, I needed to delete every character in the file that wasn't a letter or a digit. With 26 letters in 2 cases, plus 10 digits, that's 62 characters to look for. Or to put it another way, 62 characters not to delete from the file, out of a repertoire of 250-odd. This wasn't something I intended to do with a standard search and replace facility.
Luckily, my mind was cast back to a really neat text editor that I actually wrote about in this column back in August. EmEditor (http://www.techsupportalert.com/content/fastest-text-editor-ive-ever-see...) supports a feature in its search/replace facility called Regular Expressions. Such things, commonly known as regex, are a widely used standard within the IT industry for specifying patterns of characters in order to do searches or filters. For example, when you type a credit card number or an email address into a web form and the submission is rejected because what you typed doesn't look like a credit card number or email address, the form is typically using a regex to check whether what you typed conforms to the expected pattern. In the case of an email address, for example, that pattern would be two runs of characters, without spaces, separated by an "@" symbol.
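To make the email example concrete, here is a minimal sketch in Python of the kind of check a form might run. The pattern and the function name looks_like_email are my own illustration, not what any particular website uses, and the pattern is deliberately loose: it only checks for "something, then @, then something", with no spaces anywhere.

```python
import re

# Hypothetical, deliberately loose pattern: one or more characters that are
# neither "@" nor whitespace, then "@", then the same again.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+")

def looks_like_email(text: str) -> bool:
    """Return True if the whole string has the rough shape of an email address."""
    return EMAIL_SHAPE.fullmatch(text) is not None

print(looks_like_email("alice@example.com"))
print(looks_like_email("not an email"))
```

Real-world forms usually apply stricter rules than this, but the principle is the same: the text either fits the pattern or it gets rejected.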
The syntax of regex is horribly complicated. If you want to learn it, just type "regex" into Google and follow the endless tutorials. But as an example, the expression [^a-zA-Z] means "any character which isn't alphabetic" (the caret symbol at the start of the brackets means "not"), and [^a-zA-Z0-9] extends that to "any character which isn't a letter or a digit". So if, say, you search for such an expression within EmEditor, tick the "use regular expressions" box, and then choose to replace all matches with a space or even nothing at all, you quickly end up with a file that contains no garbage characters at all. Which is just what I wanted.
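If you don't have EmEditor to hand, the same clean-up can be sketched in a few lines of Python. This is only an illustration of the idea, not what EmEditor does internally; the function name strip_garbage and the sample bytes are made up for the example. It works on raw bytes, since a hibernation file is binary rather than text, and it collapses each run of unwanted bytes into a single replacement.

```python
import re

# Any run of bytes that are NOT ASCII letters or digits.
# The caret inside the brackets means "not"; the trailing + means "one or more".
NON_ALNUM = re.compile(rb"[^a-zA-Z0-9]+")

def strip_garbage(data: bytes, replacement: bytes = b" ") -> bytes:
    """Replace every run of non-alphanumeric bytes with `replacement`."""
    return NON_ALNUM.sub(replacement, data)

# Made-up sample standing in for a slice of a binary file.
sample = b"\x00\x01Hello\xff\xfeWorld\x00 42\x7f"
print(strip_garbage(sample))  # b' Hello World 42 '
```

Passing b"" as the replacement deletes the garbage outright instead of leaving a space behind. For a real file you would read the bytes in with open(path, "rb"), ideally in chunks, since hibernation files can be several gigabytes.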
If a search and replace feature with regex support isn't something that you need right now, keep it in mind. One day, you might just need it, and knowing about such things can save you a load of time.