Getting Started
Basic Operation
    Selecting Text
    Finding and Replacing Text
    Cut and Paste
    Using the Mouse
    Keyboard Shortcuts
    Shifting and Filling
    Tabbed Editing
    File Format
Features for Programming
    Programming with NEdit
    Tabs/Emulated Tabs
    Auto/Smart Indent
    Syntax Highlighting
    Finding Declarations (ctags)
    Calltips
Regular Expressions
    Basic Regular Expression Syntax
    Metacharacters
    Parenthetical Constructs
    Advanced Topics
    Example Regular Expressions
Macro/Shell Extensions
    Shell Commands and Filters
    Learn/Replay
    Macro Language
    Macro Subroutines
    Highlighting Information
    Range Sets
    Action Routines
Customizing
    Customizing NEdit
    Preferences
    X Resources
    Key Binding
    Highlighting Patterns
    Smart Indent Macros
NEdit Command Line
Client/Server Mode
Crash Recovery
Version
GNU General Public License
Mailing Lists
Problems/Defects
Welcome to NEdit!
NEdit is a standard GUI (Graphical User Interface) style text editor for programs and plain-text files. Users of Macintosh and MS Windows based text editors should find NEdit a familiar and comfortable environment. NEdit provides all of the standard menu, dialog, editing, and mouse support, as well as all of the standard shortcuts to which the users of modern GUI based environments are accustomed. For users of older style Unix editors, welcome to the world of mouse-based editing!
Help sections of interest to new users are listed under the "Basic Operation" heading in the top-level Help menu:
Selecting Text
Finding and Replacing Text
Cut and Paste
Using the Mouse
Keyboard Shortcuts
Shifting and Filling
Programmers should also read the introductory section under the "Features for Programming" section:
Programming with NEdit
If you get into trouble, the Undo command in the Edit menu can reverse any modifications that you make. NEdit does not change the file you are editing until you tell it to Save.
To open an existing file, choose Open... from the File menu. Select the file that you want to open in the pop-up dialog that appears and click on OK. You may open any number of files at the same time. Depending on your settings (cf. "Tabbed Editing") each file can appear in its own editor window, or it can appear under a tab in the same editor window. Using Open..., rather than re-typing the NEdit command and running additional copies of NEdit, gives you quick access to all of the files you have open via the Windows menu, and ensures that you don't accidentally open the same file twice. NEdit has no "main" window. It remains running as long as at least one editor window is open.
If you already have an empty (Untitled) window displayed, just begin typing in the window. To create a new Untitled window, choose New Window or New Tab from the File menu. To give the file a name and save its contents to the disk, choose Save or Save As... from the File menu.
NEdit maintains periodic backups of the file you are editing so that you can recover the file in the event of a problem such as a system crash, network failure, or X server crash. These files are saved under the name `~filename` (on Unix) or `_filename` (on VMS), where filename is the name of the file you were editing. If an NEdit process is killed, some of these backup files may remain in your directory. (To remove one of these files on Unix, you may have to prefix the `~' (tilde) character with a `\' (backslash) to prevent the shell from interpreting it as a special character.)
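The naming rule above can be sketched in a few lines. This is only an illustration of the rule as stated in this section, not NEdit's own code; the function name `backup_name` and the `vms` flag are hypothetical, and the path handling assumes Unix-style paths.

```python
import os

def backup_name(path, vms=False):
    """Return the backup file name described above for `path`.

    Hypothetical helper: prefixes the file name (not the directory)
    with `~` on Unix or `_` on VMS.
    """
    directory, filename = os.path.split(path)
    prefix = "_" if vms else "~"
    return os.path.join(directory, prefix + filename)
```

For a file edited as `/home/me/notes.txt`, this sketch gives `/home/me/~notes.txt`, which is the leftover file to look for after a crash.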
As you become more familiar with NEdit, substitute the control and function keys shown on the right side of the menus for pulling down menus with the mouse.
Dialogs are also streamlined so you can enter information quickly and without using the mouse*. To move the keyboard focus around a dialog, use the tab and arrow keys. One of the buttons in a dialog is usually drawn with a thick, indented outline. This button can be activated by pressing Return or Enter. The Cancel or Dismiss button can be activated by pressing Escape. For example, to replace the string "thing" with "things" type:
<ctrl-r>thing<tab>things<return>
To open a file named "whole_earth.c", type:
<ctrl-o>who<return>
(how much of the filename you need to type depends on the other files in the directory). See the section called "Keyboard Shortcuts" for more details.
* Users who have set their keyboard focus mode to "pointer" should set "Popups Under Pointer" in the Default Settings menu to avoid the additional step of moving the mouse into the dialog.
NEdit has two general types of selections, primary (highlighted text), and secondary (underlined text). Selections can cover either a simple range of text between two points in the file, or they can cover a rectangular area of the file. Rectangular selections are only useful with non-proportional (fixed spacing) fonts.
To select text for copying, deleting, or replacing, press the left mouse button with the pointer at one end of the text you want to select, and drag it to the other end. The text will become highlighted. To select a whole word, double click (click twice quickly in succession). Double clicking and then dragging the mouse will select a number of words. Similarly, you can select a whole line or a number of lines by triple clicking or triple clicking and dragging. Quadruple clicking selects the whole file. After releasing the mouse button, you can still adjust a selection by holding down the shift key and dragging on either end of the selection. To delete the selected text, press delete or backspace. To replace it, begin typing.
To select a rectangle or column of text, hold the Ctrl key while dragging the mouse. Rectangular selections can be used in any context that normal selections can be used, including cutting and pasting, filling, shifting, dragging, and searching. Operations on rectangular selections automatically fill in tabs and spaces to maintain alignment of text within and to the right of the selection. Note that the interpretation of rectangular selections by Fill Paragraph is slightly different from that of other commands; the section titled "Shifting and Filling" has details.
The middle mouse button can be used to make an additional selection (called the secondary selection). As soon as the button is released, the contents of this selection will be copied to the insert position of the window where the mouse was last clicked (the destination window). This position is marked by a caret shaped cursor when the mouse is outside of the destination window. If there is a (primary) selection adjacent to the cursor in the window, the new text will replace the selected text. Holding the shift key while making the secondary selection will move the text, deleting it at the site of the secondary selection, rather than copying it.
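To make the idea of a rectangular selection concrete, here is a minimal sketch of which characters such a selection covers. It is not how NEdit implements it: the function name and arguments are hypothetical, lines and columns are counted from zero, and the text is assumed to contain no tab characters (so one character is one column, as with a fixed-spacing font).

```python
def rectangular_selection(text, first_line, last_line, left_col, right_col):
    """Return the pieces of each line that fall inside the rectangle.

    Lines shorter than left_col contribute an empty string, which is
    why operations on rectangles may need to pad with spaces.
    """
    lines = text.split("\n")
    return [line[left_col:right_col]
            for line in lines[first_line:last_line + 1]]
```

Selecting columns 2 to 4 of the first two lines of "abcdef", "ghijkl", "mnopqr" yields "cd" and "ij"; a simple (non-rectangular) selection between the same two points would instead include the rest of the first line and the start of the second.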
Selected text can also be dragged to a new location in the file using the middle mouse button. Holding the shift key while dragging the text will copy the selected text, leaving the original text in place. Holding the control key will drag the text in overlay mode.
Normally, dragging moves text by removing it from the selected position at the start of the drag, and inserting it at a new position relative to the mouse. Dragging a block of text over existing characters displaces the characters to the end of the selection. In overlay mode, characters which are occluded by blocks of text being dragged are simply removed. When dragging non-rectangular selections, overlay mode also converts the selection to rectangular form, allowing it to be dragged outside of the bounds of the existing text.
The section "Using the Mouse" summarizes the mouse commands for making primary and secondary selections. Primary selections can also be made via keyboard commands; see "Keyboard Shortcuts".
The Search menu contains a number of commands for finding and replacing text.
The Find... and Replace... commands present dialogs for entering text for searching and replacing. These dialogs also allow you to choose whether you want the search to be sensitive to upper and lower case, or whether to use the standard Unix pattern matching characters (regular expressions). Searches begin at the current text insertion position.
Find Again and Replace Again repeat the last find or replace command without prompting for search strings. To selectively replace text, use the two commands in combination: Find Again, then Replace Again if the highlighted string should be replaced, or Find Again again to go to the next string.
Find Selection searches for the text contained in the current primary selection (see Selecting Text). The selected text does not have to be in the current editor window; it may even be in another program. For example, if the word dog appears somewhere in a window on your screen, and you want to find it in the file you are editing, select the word dog by dragging the mouse across it, switch to your NEdit window and choose Find Selection from the Search menu.
Find Incremental, which opens the interactive search bar, is yet another variation on searching, where every character typed triggers a new search. After you've completed the search string, the next occurrence in the buffer is found by hitting the Return key, or by clicking on the icon to the left (magnifying glass). Holding a Shift key down finds the previous occurrence. To the right there is a clear button with an icon resembling "|<". Clicking on it empties the search text widget without disturbing selections. A middle click on the clear button copies the content of any existing selection into the search text widget and triggers a new search.
Holding down the shift key while choosing any of the search or replace commands from the menu (or using the keyboard shortcut) will search in the reverse direction. Users who have set the search direction using the buttons in the search dialog may find it a bit confusing that Find Again and Replace Again don't continue in the same direction as the original search (for experienced users, consistency of the direction implied by the shift key is more important).
To replace only some occurrences of a string within a file, choose Replace... from the Search menu, enter the string to search for and the string to substitute, and finish by pressing the Find button. When the first occurrence is highlighted, use either Replace Again (^T) to replace it, or Find Again (^G) to move to the next occurrence without replacing it, and continue in such a manner through all occurrences of interest.
To replace all occurrences of a string within some range of text, select the range (see Selecting Text), choose Replace... from the Search menu, type the string to search for and the string to substitute, and press the "R. in Selection" button in the dialog. Note that selecting text in the Replace... dialog will unselect the text in the window.
The easiest way to copy and move text around in your file or between windows is to use the clipboard, an imaginary area that temporarily stores text and data. The Cut command removes the selected text (see Selecting Text) from your file and places it in the clipboard. Once text is in the clipboard, the Paste command will copy it to the insert position in the current window. For example, to move some text from one place to another, select it by dragging the mouse over it, choose Cut to remove it, click the pointer to move the insert point where you want the text inserted, then choose Paste to insert it. Copy copies text to the clipboard without deleting it from your file. You can also use the clipboard to transfer text to and from other Motif programs and X programs which make proper use of the clipboard.
There are many other methods for copying and moving text within NEdit windows and between NEdit and other programs. The most common such method is clicking the middle mouse button to copy the primary selection (to the clicked position). Copying the selection by clicking the middle mouse button in many cases is the only way to transfer data to and from many X programs. Holding the Shift key while clicking the middle mouse button moves the text, deleting it from its original position, rather than copying it. Other methods for transferring text include secondary selections, primary selection dragging, keyboard-based selection copying, and drag and drop. These are described in detail in the sections: "Selecting Text", "Using the Mouse", and "Keyboard Shortcuts".
Mouse-based editing is what NEdit is all about, and learning to use the more advanced features like secondary selections and primary selection dragging will be well worth your while.
If you don't have time to learn everything, you can get by adequately with just the left mouse button: Clicking the left button moves the cursor. Dragging with the left button makes a selection. Holding the shift key while clicking extends the existing selection, or begins a selection between the cursor and the mouse. Double or triple clicking selects a whole word or a whole line.
This section will make more sense if you also read the section called "Selecting Text", which explains the terminology of selections, that is, what is meant by primary, secondary, rectangular, etc.
General meaning of mouse buttons and modifier keys:
  Button 1 (left)    Cursor position and primary selection
  Button 2 (middle)  Secondary selections, and dragging and
                     copying the primary selection
  Button 3 (right)   Quick-access programmable menu and pan
                     scrolling
  Shift              On primary selections (left mouse button):
                       Extends selection to the mouse pointer
                     On secondary and copy operations (middle):
                       Toggles between move and copy
  Ctrl               Makes selection rectangular or insertion
                     columnar
  Alt*               (on release) Exchange primary and secondary
                     selections.
The left mouse button is used to position the cursor and to make primary selections.
  Click              Moves the cursor
  Double Click       Selects a whole word
  Triple Click       Selects a whole line
  Quad Click         Selects the whole file
  Shift Click        Adjusts (extends or shrinks) the
                     selection, or if there is no existing
                     selection, begins a new selection
                     between the cursor and the mouse.
  Ctrl+Shift+Click   Adjusts (extends or shrinks) the
                     selection rectangularly.
  Drag               Selects text between where the mouse
                     was pressed and where it was released.
  Ctrl+Drag          Selects rectangle between where the
                     mouse was pressed and where it was
                     released.
The right mouse button posts a programmable menu for frequently used commands.
  Click/Drag         Pops up the background menu (programmed
                     from Preferences -> Default Settings ->
                     Customize Menus -> Window Background).
  Ctrl+Drag          Pan scrolling. Scrolls the window
                     both vertically and horizontally, as if
                     you had grabbed it with your mouse.
The middle mouse button is for making secondary selections, and copying and dragging the primary selection.
  Click              Copies the primary selection to the
                     clicked position.
  Shift+Click        Moves the primary selection to the
                     clicked position, deleting it from its
                     original position.
  Drag               1) Outside of the primary selection:
                        Begins a secondary selection.
                     2) Inside of the primary selection:
                        Moves the selection by dragging.
  Ctrl+Drag          1) Outside of the primary selection:
                        Begins a rectangular secondary
                        selection.
                     2) Inside of the primary selection:
                        Drags the selection in overlay
                        mode (see below).
When the mouse button is released after creating a secondary selection:
  No Modifiers       If there is a primary selection,
                     replaces it with the secondary
                     selection. Otherwise, inserts the
                     secondary selection at the cursor
                     position.
  Shift              Move the secondary selection, deleting
                     it from its original position. If
                     there is a primary selection, the move
                     will replace the primary selection
                     with the secondary selection.
                     Otherwise, moves the secondary
                     selection to the cursor position.
  Alt*               Exchange the primary and secondary
                     selections.
While moving the primary selection by dragging with the middle mouse button:
  Shift              Leaves a copy of the original
                     selection in place rather than
                     removing it or blanking the area.
  Ctrl               Changes from insert mode to overlay
                     mode (see below).
  Escape             Cancels drag in progress.
Overlay Mode: Normally, dragging moves text by removing it from the selected position at the start of the drag, and inserting it at a new position relative to the mouse. When you drag a block of text over existing characters, the existing characters are displaced to the end of the selection. In overlay mode, characters which are occluded by blocks of text being dragged are simply removed. When dragging non-rectangular selections, overlay mode also converts the selection to rectangular form, allowing it to be dragged outside of the bounds of the existing text.
Mouse buttons 4 and 5 are usually represented by a mouse wheel nowadays. They are used to scroll up or down in the text window.
* The Alt key may be labeled Meta or Compose-Character on some keyboards. Some window managers, including default configurations of mwm, bind combinations of the Alt key and mouse buttons to window manager operations. In NEdit, Alt is only used on button release, so regardless of the window manager bindings for Alt-modified mouse buttons, you can still do the corresponding NEdit operation by using the Alt key AFTER the initial mouse press, so that Alt is held while you release the mouse button. If you find this difficult or annoying, you can re-configure most window managers to skip this binding, or you can re-configure NEdit to use a different key combination.
Most of the keyboard shortcuts in NEdit are shown on the right hand sides of the pull-down menus. However, there are more which are not as obvious. These include dialog button shortcuts; menu and dialog mnemonics; labeled keyboard keys, such as the arrows, page-up, page-down, and home; and optional Shift modifiers on accelerator keys, like [Shift]Ctrl+F.
Pressing the key combinations shown on the right of the menu items is a shortcut for selecting the menu item with the mouse. Some items have the shift key enclosed in brackets, such as [Shift]Ctrl+F. This indicates that the shift key is optional. In search commands, including the shift key reverses the direction of the search. In Shift commands, it makes the command shift the selected text by a whole tab stop rather than by single characters.
Pressing the Alt key in combination with one of the underlined characters in the menu bar pulls down that menu. Once the menu is pulled down, typing the underlined characters in a menu item (without the Alt key) activates that item. With a menu pulled down, you can also use the arrow keys to select menu items, and the Space or Enter keys to activate them.
One button in a dialog is usually marked with a thick indented outline. Pressing the Return or Enter key activates this button.
All dialogs have either a Cancel or Dismiss button. This button can be activated by pressing the Escape (or Esc) key.
Pressing the tab key moves the keyboard focus to the next item in a dialog. Within an associated group of buttons, the arrow keys move the focus among the buttons. Shift+Tab moves backward through the items.
Most items in dialogs have an underline under one character in their name. Pressing the Alt key along with this character activates a button as if you had pressed it with the mouse, or moves the keyboard focus to the associated text field or list.
You can select items from a list by using the arrow keys to move the selection and space to select.
In file selection dialogs, you can type the beginning characters of the file name or directory in the list to select files.
The labeled function keys on standard workstation and PC keyboards, like the arrows, and page-up and page-down, are active in NEdit, though not shown in the pull-down menus.
Holding down the control key while pressing a named key extends the scope of the action that it performs. For example, Home normally moves the insert cursor to the beginning of a line. Ctrl+Home moves it to the beginning of the file. Backspace deletes one character, Ctrl+Backspace deletes one word.
Holding down the shift key while pressing a named key begins or extends a selection. Combining the shift and control keys combines their actions. For example, to select a word without using the mouse, position the cursor at the beginning of the word and press Ctrl+Shift+RightArrow. The Alt key modifies selection commands to make the selection rectangular.
Under X and Motif, there are several levels of translation between keyboard keys and the actions they perform in a program. The "Customizing NEdit" and "X Resources" sections of the Help menu have more information on this subject. Because of all of this configurability, and since keyboards and standards for the meaning of some keys vary from machine to machine, the mappings may be changed from the defaults listed below.
  Ctrl      Extends the scope of the action that the key
            would otherwise perform. For example, Home
            normally moves the insert cursor to the beginning
            of a line. Ctrl+Home moves it to the beginning of
            the file. Backspace deletes one character,
            Ctrl+Backspace deletes one word.
  Shift     Extends the selection to the cursor position. If
            there's no selection, begins one between the old
            and new cursor positions.
  Alt       When modifying a selection, makes the selection
            rectangular.
(For the effects of modifier keys on mouse button presses, see the section titled "Using the Mouse")
  Escape              Cancels operation in progress: menu
                      selection, drag, selection, etc. Also
                      equivalent to cancel button in dialogs.
  Backspace           Delete the character before the cursor
  Ctrl+BS             Delete the word before the cursor
  Arrows --
    Left              Move the cursor to the left one character
    Ctrl+Left         Move the cursor backward one word
                      (Word delimiters are settable, see
                      "Customizing NEdit", and "X Resources")
    Right             Move the cursor to the right one character
    Ctrl+Right        Move the cursor forward one word
    Up                Move the cursor up one line
    Ctrl+Up           Move the cursor up one paragraph.
                      (Paragraphs are delimited by blank lines)
    Down              Move the cursor down one line.
    Ctrl+Down         Move the cursor down one paragraph.
  Ctrl+Return         Return with automatic indent, regardless
                      of the setting of Auto Indent.
  Shift+Return        Return without automatic indent,
                      regardless of the setting of Auto Indent.
  Ctrl+Tab            Insert an ASCII tab character, without
                      processing emulated tabs.
  Alt+Ctrl+<c>        Insert the control-code equivalent of
                      a key <c>
  Ctrl+/              Select everything (same as Select
                      All menu item or ^A)
  Ctrl+\              Unselect
  Ctrl+U              Delete to start of line
  Ctrl+Insert         Copy the primary selection to the
                      clipboard (same as Copy menu item or ^C)
                      for compatibility with Motif standard key
                      binding
  Shift+Ctrl+Insert   Copy the primary selection to the cursor
                      location.
  Delete              Delete the character before the cursor.
                      (Can be configured to delete the character
                      after the cursor, see "Customizing NEdit",
                      and "X Resources")
  Ctrl+Delete         Delete to end of line.
  Shift+Delete        Cut, remove the currently selected text
                      and place it in the clipboard. (same as
                      Cut menu item or ^X) for compatibility
                      with Motif standard key binding
  Shift+Ctrl+Delete   Cut the primary selection to the cursor
                      location.
  Home                Move the cursor to the beginning of the
                      line
  Ctrl+Home           Move the cursor to the beginning of the
                      file
  End                 Move the cursor to the end of the line
  Ctrl+End            Move the cursor to the end of the file
  PageUp              Scroll and move the cursor up by one page.
  PageDown            Scroll and move the cursor down by one
                      page.
  F10                 Make the menu bar active for keyboard
                      input (Arrow Keys, Return, Escape,
                      and the Space Bar)
  Alt+Home            Switch to the previously active document.
  Ctrl+PageUp         Switch to the previous document.
  Ctrl+PageDown       Switch to the next document.
On machines with different styles of keyboards, generally, text editing actions are properly matched to the labeled keys, such as Remove, Next-screen, etc. If you prefer different key bindings, see the section titled "Key Binding" under the Customizing heading in the Help menu.
While shifting blocks of text is most important for programmers (See Features for Programming), it is also useful for other tasks, such as creating indented paragraphs.
To shift a block of text one tab stop to the right, select the text, then choose Shift Right from the Edit menu. Note that the accelerator keys for Shift Left and Shift Right are Ctrl+9 and Ctrl+0, which correspond to the left and right parentheses on most keyboards. Remember them as adjusting the text in the direction pointed to by the parenthesis character. Holding the Shift key while selecting either Shift Left or Shift Right will shift the text by one character.
It is also possible to shift blocks of text by selecting the text rectangularly, and dragging it left or right (and up or down as well). Using a rectangular selection also causes tabs within the selection to be recalculated and substituted, such that the non-whitespace characters remain stationary with respect to the selection.
Text filling using the Fill Paragraph command in the Edit menu is one of the most important concepts in NEdit, and it will be well worth your while to understand how to use it properly.
In plain text files, unlike word-processor files, there is no way to tell which lines are continuations of other lines, and which lines are meant to be separate, because there is no distinction in meaning between newline characters which separate lines in a paragraph, and ones which separate paragraphs from other text. This makes it impossible for a text editor like NEdit to tell parts of the text which belong together as a paragraph from carefully arranged individual lines.
In continuous wrap mode (Preferences -> Wrap -> Continuous), lines automatically wrap and unwrap themselves to line up properly at the right margin. In this mode, you simply omit the newlines within paragraphs and let NEdit make the line breaks as needed. Unfortunately, continuous wrap mode is not appropriate in the majority of situations, because files with extremely long lines are not common under Unix and may not be compatible with all tools, and because you can't achieve effects like indented sections, columns, or program comments, and still take advantage of the automatic wrapping.
Without continuous wrapping, paragraph filling is not entirely automatic. Auto-Newline wrapping keeps paragraphs lined up as you type, but once entered, NEdit can no longer distinguish newlines which join wrapped text, and newlines which must be preserved. Therefore, editing in the middle of a paragraph will often leave the right margin messy and uneven.
Since NEdit can't act automatically to keep your text lined up, you need to tell it explicitly where to operate, and that is what Fill Paragraph is for. It arranges lines to fill the space between two margins, wrapping the lines neatly at word boundaries. Normally, the left margin for filling is inferred from the text being filled. The first line of each paragraph is considered special, and its left indentation is maintained separately from the remaining lines (for leading indents, bullet points, numbered paragraphs, etc.). Otherwise, the left margin is determined by the furthest left non-whitespace character. The right margin is either the Wrap Margin, set in the preferences menu (by default, the right edge of the window), or can also be chosen on the fly by using a rectangular selection (see below).
There are three ways to use Fill Paragraph. The simplest is, while you are typing text, and there is no selection, simply select Fill Paragraph (or type Ctrl+J), and NEdit will arrange the text in the paragraph adjacent to the cursor. A paragraph, in this case, means an area of text delimited by blank lines.
The second way to use Fill Paragraph is with a selection. If you select a range of text and then choose Fill Paragraph, all of the text in the selection will be filled. Again, continuous text between blank lines is interpreted as paragraphs and filled individually, respecting leading indents and blank lines.
The third, and most versatile, way to use Fill Paragraph is with a rectangular selection. Fill Paragraph treats rectangular selections differently from other commands. Instead of simply filling the text inside the rectangular selection, NEdit interprets the right edge of the selection as the requested wrap margin. Text to the left of the selection is not disturbed (the usual interpretation of a rectangular selection), but text to the right of the selection is included in the operation and is pulled into the selected region. This method enables you to fill text to an arbitrary right margin, without going back and forth to the wrap-margin dialog, as well as to exclude text to the left of the selection such as comment bars or other text columns.
NEdit is able to display files in distinct editor windows, or to display files under tabs in the same editor window. The options for controlling the tabbed interface are found under Preferences -> Default Settings -> Tabbed Editing (cf. "Preferences", also "NEdit Command Line").
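The filling rules described so far (paragraphs delimited by blank lines, the first line's indent kept separately, the remaining lines aligned to their leftmost non-whitespace column, wrapping at word boundaries) can be approximated with a short sketch. This is an illustration built on Python's standard textwrap module, not NEdit's algorithm; the function names and the `wrap_margin` parameter are hypothetical, and indentation is assumed to consist of spaces only.

```python
import textwrap

def leading_indent(line):
    """Return the leading whitespace of a line."""
    return line[:len(line) - len(line.lstrip())]

def fill_paragraph(lines, wrap_margin=80):
    """Refill one paragraph, given as a list of non-blank lines."""
    first_indent = leading_indent(lines[0])
    rest = lines[1:] or lines[:1]
    # Left margin of the remaining lines: leftmost non-whitespace column.
    rest_indent = min((leading_indent(l) for l in rest), key=len)
    words = " ".join(l.strip() for l in lines)
    filled = textwrap.fill(words, width=wrap_margin,
                           initial_indent=first_indent,
                           subsequent_indent=rest_indent,
                           break_long_words=False,
                           break_on_hyphens=False)
    return filled.split("\n")

def fill_text(text, wrap_margin=80):
    """Fill every paragraph in text; blank lines are kept as they are."""
    out, paragraph = [], []
    for line in text.split("\n"):
        if line.strip():
            paragraph.append(line)
        else:
            if paragraph:
                out.extend(fill_paragraph(paragraph, wrap_margin))
                paragraph = []
            out.append(line)
    if paragraph:
        out.extend(fill_paragraph(paragraph, wrap_margin))
    return "\n".join(out)
```

Calling `fill_text` on a whole selection corresponds roughly to the second way of using Fill Paragraph; calling `fill_paragraph` on the lines around the cursor corresponds to the first.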
Notice that you can re-group tabs at any time by detaching and attaching them, or moving them, to other windows. This can be done using the Windows menu, or using the context menu, which pops up when right clicking on a tab.
You can switch to a tab by simply clicking on it, or you can use the keyboard. The default keybindings to switch tabs (which are Ctrl+PageUp/-Down and Alt+Home, see "Keyboard Shortcuts") can be changed using the actions previous_document(), next_document() and last_document().
While plain-text is probably the simplest and most interchangeable file format in the computer world, there is still variation in what plain-text means from system to system. Plain-text files can differ in character set, line termination, and wrapping.
While character set differences are the most obvious and pose the most challenge to portability, they affect NEdit only indirectly via the same font and localization mechanisms common to all X applications. If your system is set up properly, you will probably never see character-set related problems in NEdit. NEdit cannot display Unicode text files, or any multi-byte character set.
The primary difference between an MS DOS format file and a Unix format file is how the lines are terminated. Unix uses a single newline character. MS DOS uses a carriage-return and a newline. NEdit can read and write both file formats, but internally, it uses the single character Unix standard. NEdit auto-detects MS DOS format files based on the line termination at the start of the file. Files are judged to be DOS format if all of the first five line terminators, within a maximum range, are DOS-style. To change the format in which NEdit writes a file from DOS to Unix or vice versa, use the Save As... command and check or un-check the MS DOS Format button.
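The detection rule can be sketched as follows. This is a reading of the rule as stated here, not NEdit's source: the function names are hypothetical, the `max_scan` default is an arbitrary stand-in for the unspecified "maximum range", and treating a file with no line terminator in range as Unix format is an assumption.

```python
def looks_like_dos_format(data, terminators_to_check=5, max_scan=2000):
    """Guess whether `data` (bytes) uses DOS-style line termination.

    Checks up to `terminators_to_check` newlines within the first
    `max_scan` bytes; every one of them must be preceded by a
    carriage return. `max_scan` is an assumed value.
    """
    checked = 0
    for i in range(min(len(data), max_scan)):
        if data[i:i + 1] == b"\n":
            if i == 0 or data[i - 1:i] != b"\r":
                return False          # a Unix-style terminator was found
            checked += 1
            if checked == terminators_to_check:
                break
    return checked > 0                # assumption: no terminators means Unix

def to_internal(data):
    """Convert DOS line endings to the single-newline internal form."""
    return data.replace(b"\r\n", b"\n")
```

Because only the start of the file is examined, a file that switches to Unix-style terminators after the first five lines would still be classified as DOS format by this sketch.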
Wrapping within text files can vary among individual users, as well as from system to system. Both Windows and MacOS make frequent use of plain text files with no implicit right margin. In these files, wrapping is determined by the tool which displays them. Files of this style also exist on Unix systems, despite the fact that they are not supported by all Unix utilities. To display this kind of file properly in NEdit, you have to select the wrap style called Continuous. Wrapping modes are discussed in the sections: Customizing -> Preferences, and Basic Operation -> Shifting and Filling.
The last and most minute of format differences is the terminating newline. Some Unix compilers and utilities require a final terminating newline on all files they read and fail in various ways on files which do not have it. Vi and approximately half of Unix editors enforce the terminating newline on all files that they write; Emacs does not enforce this rule. Users are divided on which is best. NEdit makes the final terminating newline optional (Preferences -> Default Settings -> Terminate with Line Break on Save).
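As a minimal sketch of what the optional terminating newline means at save time (the function and parameter names are hypothetical, and leaving an empty file empty is an assumption rather than documented behavior):

```python
def contents_to_save(text, terminate_with_line_break=True):
    """Return the text to write, adding a final newline if requested."""
    if terminate_with_line_break and text and not text.endswith("\n"):
        return text + "\n"
    return text
```

With the option off, the file is written exactly as it appears in the buffer.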
Though general in appearance, NEdit has many features intended specifically for programmers. Major programming-related topics are listed in separate sections under the heading: "Features for Programming": Syntax Highlighting, Tabs/Emulated Tabs, Finding Declarations (ctags), Calltips, and Auto/Smart Indent. Minor topics related to programming are discussed below:
When NEdit initially reads a file, it attempts to determine whether the file is in one of the computer languages that it knows about. Knowing what language a file is written in allows NEdit to assign highlight patterns and smart indent macros, and to set language specific preferences like word delimiters, tab emulation, and auto-indent. Language mode can be recognized from both the file name and from the first 200 characters of content. Language mode recognition and language-specific preferences are configured in: Preferences -> Default Settings -> Language Modes....
You can set the language mode manually for a window, by selecting it from the menu: Preferences -> Language Modes.
NEdit can be made to set the background color of particular classes of characters to allow easy identification of those characters. This is particularly useful if you need to be able to distinguish between tabs and spaces in a file where the difference is important. The colors used for backlighting are specified by a resource, "nedit*backlightCharTypes". You can turn backlighting on and off through the Preferences -> Apply Backlighting menu entry.
If you prefer to have backlighting turned on for all new windows, use the Preferences -> Default Settings -> Apply Backlighting menu entry. This setting can be saved along with other preferences using Preferences -> Save Defaults.
Important: In future versions of NEdit, the backlighting feature will be extended and reworked such that it becomes easier to configure. The current way of controlling it through a resource is generally considered to be below NEdit's usability standards. These future changes are likely to be incompatible with the current format of the "nedit*backlightCharTypes" resource, though. Therefore, it is expected that there will be no automatic migration path for users who customize the resource.
To find a particular line in a source file by line number, choose Goto Line #... from the Search menu. You can also directly select the line number text in the compiler message in the terminal emulator window (xterm, decterm, winterm, etc.) where you ran the compiler, and choose Goto Selected from the Search menu.
To find out the line number of a particular line in your file, turn on Statistics Line in the Preferences menu and position the insertion point anywhere on the line. The statistics line continuously updates the line number of the line containing the cursor.
To go to a specific column on a given line, choose Goto Line #... from the Search menu and enter a line number and a column number separated by a comma. (e.g. Enter "100,12" for line 100 column 12.) If you want to go to a column on the current line just leave out the line number. (e.g. Enter ",45" to go to column 45 on the current line.)
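The "line,column" input format can be illustrated with a small parser. This is a sketch of the format only; the function name and the use of None for an omitted part are assumptions, not NEdit internals.

```python
from typing import Optional, Tuple

def parse_goto(text: str) -> Tuple[Optional[int], Optional[int]]:
    """Split 'line,column' input; None means that part was left out."""
    line_part, _, column_part = text.partition(",")
    line = int(line_part) if line_part.strip() else None
    column = int(column_part) if column_part.strip() else None
    return line, column
```

Here "100,12" yields line 100 and column 12, while ",45" yields no line (stay on the current one) and column 45.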
To help you inspect nested parentheses, brackets, braces, quotes, and other characters, NEdit has both an automatic parenthesis matching mode, and a Goto Matching command. Automatic parenthesis matching is activated when you type, or move the insertion cursor after a parenthesis, bracket, or brace. It momentarily highlights either the opposite character ('Delimiter') or the entire expression ('Range') when the opposite character is visible in the window. To find a matching character anywhere in the file, select it or position the cursor after it, and choose Goto Matching from the Search menu. If the character matches itself, such as a quote or slash, select the first character of the pair. NEdit will match {, (, [, <, ", ', `, /, and \. Holding the Shift key while typing the accelerator key (Shift+Ctrl+M, by default), will select all of the text between the matching characters.
When syntax highlighting is enabled, the matching routines can optionally make use of the syntax information for improved accuracy. In that case, a brace inside a highlighted string will not match a brace inside a comment, for instance.
The Open Selected command in the File menu understands the C preprocessor's #include syntax, so selecting an #include line and invoking Open Selected will generally find the file referred to, unless doing so depends on the settings of compiler switches or other information not available to NEdit.
Integrated software development environments such as SGI's CaseVision and Centerline Software's Code Center can be interfaced directly with NEdit via the client server interface. These tools allow you to click directly on compiler and runtime error messages and request NEdit to open files and select lines of interest. The easiest method is usually to use the tool's interface for character-based editors like vi to invoke nc, but programmatic interfaces can also be derived using the source code for nc.
There are also some simple compile/review, grep, ctree, and ctags browsers available in the NEdit contrib directory on ftp.nedit.org.
Tabs are important for programming in languages which use indentation to show nesting, as short-hand for producing white-space for leading indents. As a programmer, you have to decide how to use indentation, and how or whether tab characters map to your indentation scheme.
Ideally, tab characters map directly to the amount of indent that you use to distinguish nesting levels in your code. Unfortunately, the Unix standard for interpretation of tab characters is eight characters (probably dating back to mechanical capabilities of the original teletype), which is usually too coarse for a single indent.
Most text editors, NEdit included, allow you to change the interpretation of the tab character, and many programmers take advantage of this, and set their tabs to 3 or 4 characters to match their programming style. In NEdit you set the hardware tab distance in Preferences -> Tabs... for the current window, or Preferences -> Default Settings -> Tabs... (general), or Preferences -> Default Settings -> Language Modes... (language-specific) to change the defaults for future windows.
Changing the meaning of the tab character makes programming much easier while you're in the editor, but can cause you headaches outside of the editor, because there is no way to pass along the tab setting as part of a plain-text file. All of the other tools which display, print, and otherwise process your source code have to be made aware of how the tabs are set, and must be able to handle the change. Non-standard tabs can also confuse other programmers, or make editing your code difficult for them if their text editors don't support changes in tab distance.
An alternative to changing the interpretation of the tab character is tab emulation. In the Tabs... dialog(s), turning on Emulated Tabs causes the Tab key to insert the correct number of spaces and/or tabs to bring the cursor to the next emulated tab stop, as if tabs were set at the emulated tab distance rather than the hardware tab distance. Backspacing immediately after entering an emulated tab will delete the fictitious tab as a unit, but as soon as you move the cursor away from the spot, NEdit will forget that the collection of spaces and tabs is a tab, and will treat it as separate characters. To enter a real tab character with "Emulate Tabs" turned on, use Ctrl+Tab.
It is also possible to tell NEdit not to insert ANY tab characters at all in the course of processing emulated tabs, and in shifting and rectangular insertion/deletion operations, for programmers who worry about the misinterpretation of tab characters on other systems.
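As a rough illustration of tab emulation, the sketch below computes which characters would move the cursor from a given column to the next emulated tab stop. It is an assumption about how such a fill could be computed, not NEdit's implementation; the function name, the 0-based columns, and the default distances of 4 and 8 are chosen for the example.

```python
def emulated_tab_fill(column: int, emulated_dist: int = 4,
                      hardware_dist: int = 8, use_tabs: bool = True) -> str:
    """Characters that move the cursor from column to the next emulated stop."""
    target = (column // emulated_dist + 1) * emulated_dist
    fill = ""
    if use_tabs:
        # Use a hardware tab whenever its stop does not overshoot the target.
        next_hw = (column // hardware_dist + 1) * hardware_dist
        while next_hw <= target:
            fill += "\t"
            column = next_hw
            next_hw += hardware_dist
    # Pad the remaining distance with spaces.
    return fill + " " * (target - column)
```

Passing use_tabs=False corresponds to the option of never inserting tab characters, so the fill consists of spaces only.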
Programmers who use structured languages usually require some form of automatic indent, so that they don't have to continually re-type the sequences of tabs and/or spaces needed to maintain lengthy running indents. NEdit therefore offers "smart" indent, in addition to the traditional automatic indent which simply lines up the cursor position with the previous line.
Smart indent macros are only available by default for C and C++, and while these can easily be configured for different default indentation distances, they may not conform to everyone's exact C programming style. Smart indent is programmed in terms of macros in the NEdit macro language which can be entered in: Preferences -> Default Settings -> Indent -> Program Smart Indent. Hooks are provided for intervening at the point that a newline is entered, either via the user pressing the Enter key, or through auto-wrapping; and for arbitrary type-in to act on specific characters typed.
To type a newline character without invoking smart-indent when operating in smart-indent mode, hold the Shift key while pressing the Return or Enter key.
With Indent set to Auto (the default), NEdit keeps a running indent. When you press the Return or Enter key, spaces and tabs are inserted to line up the insert point under the start of the previous line.
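The running indent amounts to copying the leading white-space of the previous line. A minimal sketch, with a hypothetical function name:

```python
def auto_indent(previous_line: str) -> str:
    """Return the leading spaces and tabs of previous_line."""
    stripped = previous_line.lstrip(" \t")
    return previous_line[:len(previous_line) - len(stripped)]
```

The returned string is what would be inserted after the newline so that the insert point lines up under the start of the previous line.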
Regardless of indent-mode, Ctrl+Return always does the automatic indent; Shift+Return always does a return without indent.
The Shift Left and Shift Right commands as well as rectangular dragging can be used to adjust the indentation for several lines at once. To shift a block of text one character to the right, select the text, then choose Shift Right from the Edit menu. Note that the accelerator keys for these menu items are Ctrl+9 and Ctrl+0, which correspond to the left and right parenthesis on most keyboards. Remember them as adjusting the text in the direction pointed to by the parenthesis character. Holding the Shift key while selecting either Shift Left or Shift Right will shift the text by one tab stop (or by one emulated tab stop if tab emulation is turned on). The help section "Shifting and Filling" under "Basic Operation" has details.
Syntax Highlighting means using colors and fonts to help distinguish language elements in programming languages and other types of structured files. Programmers use syntax highlighting to understand code faster and better, and to spot many kinds of syntax errors more quickly.
To use syntax highlighting in NEdit, select Highlight Syntax in the Preferences menu. If NEdit recognizes the computer language that you are using, and highlighting rules (patterns) are available for that language, it will highlight your text, and maintain the highlighting, automatically, as you type.
If NEdit doesn't correctly recognize the type of the file you are editing, you can manually select a language mode from Language Modes in the Preferences menu. You can also program the method that NEdit uses to recognize language modes in Preferences -> Default Settings -> Language Modes....
If no highlighting patterns are available for the language that you want to use, you can create new patterns relatively quickly. The Help section "Highlighting Patterns" under "Customizing", has details.
If you are satisfied with what NEdit is highlighting, but would like it to use different colors or fonts, you can change these by selecting Preferences -> Default Settings -> Syntax Highlighting -> Text Drawing Styles. Highlighting patterns are connected with font and color information through a common set of styles so that colorings defined for one language will be similar across others, and patterns within the same language which are meant to appear identical can be changed in the same place. To understand which styles are used to highlight the language you are interested in, you may need to look at the "Highlighting Patterns" section as well.
Syntax highlighting is CPU intensive, and under some circumstances can affect NEdit's responsiveness. If you have a particularly slow system, or work with very large files, you may not want to use it all of the time. Syntax highlighting introduces two kinds of delays. The first is an initial parsing delay, proportional to the size of the file. This delay is also incurred when pasting large sections of text, filtering text through shell commands, and other circumstances involving changes to large amounts of text. The second kind of delay happens when text which has not previously been visible is scrolled in to view. Depending on your system, and the highlight patterns you are using, this may or may not be noticeable. A typing delay is also possible, but unlikely if you are only using the built-in patterns.
NEdit can process tags files generated using the Unix ctags command or the Exuberant Ctags program. Ctags creates index files correlating names of functions and declarations with their locations in C, Fortran, or Pascal source code files. (See the ctags manual page for more information). Ctags produces a file called "tags" which can be loaded by NEdit. NEdit can manage any number of tags files simultaneously. Tag collisions are handled with a popup menu to let the user decide which tag to use. In 'Smart' mode NEdit will automatically choose the desired tag based on the scope of the file or module. Once loaded, the information in the tags file enables NEdit to go directly to the declaration of a highlighted function or data structure name with a single command. To load a tags file, select "Load Tags File" from the File menu and choose a tags file to load, or specify the name of the tags file on the NEdit command line:
nedit -tags tags
NEdit can also be set to load a tags file automatically when it starts up. Setting the X resource nedit.tagFile to the name of a tag file tells NEdit to look for that file at startup time (see "Customizing NEdit"). The file name can be either a complete path name, in which case NEdit will always load the same tags file, or a file name without a path or with a relative path, in which case NEdit will load it starting from the current directory. The second option allows you to have different tags files for different projects, each automatically loaded depending on the directory you're in when you start NEdit. Setting the name to "tags" is an obvious choice since this is the name that ctags uses. NEdit normally evaluates relative path tag file specifications every time a file is opened. All accessible tag files are loaded at this time. To disable the automatic loading of tag files specified as relative paths, set the X resource nedit.alwaysCheckRelativeTagsSpecs to False.
To unload a tags file, select "Un-load Tags File" from the File menu and choose from the list of tags files. NEdit will keep track of tags file updates by checking the timestamp on the files, and automatically update the tags cache.
To find the definition of a function or data structure once a tags file is loaded, select the name anywhere it appears in your program (see "Selecting Text") and choose "Find Definition" from the Search menu.
Calltips are little yellow boxes that pop up to remind you what the arguments and return type of a function are. More generally, they're a UI mechanism to present a small amount of crucial information in a prominent location. To display a calltip, select some text and choose "Show Calltip" from the Search menu. To kill a displayed calltip, hit Esc.
Calltips get their information from one of two places -- either a tags file (see "Finding Declarations (ctags)") or a calltips file. First, any loaded calltips files are searched for a definition, and if nothing is found then the tags database is searched. If a tag is found that matches the highlighted text then a calltip is displayed with the first few lines of the definition -- usually enough to show you what the arguments of a function are.
You can load a calltips file by choosing "Load Calltips File" from the File menu. You can unload a calltips file by selecting it from the "Unload Calltips File" submenu of the File menu. You can also choose one or more default calltips files to be loaded for each language mode using the "Default calltips file(s)" field of the Language Modes dialog.
The calltips file format is very simple. Calltips files are organized in blocks separated by blank lines. The first line of the block is the key, which is the word that is matched when a calltip is requested. The rest of the block is displayed as the calltip.
Almost any text at all can appear in a calltip key or a calltip. There are no special characters that need to be escaped. The only issues to note are that trailing whitespace is ignored, and you cannot have a blank line inside a calltip. (Use a single period instead -- it'll be nearly invisible.) You should also avoid calltip keys that begin and end with '*' characters, since those are used to mark special blocks.
There are five special block types--comment, include, language, alias, and version--which are distinguished by their first lines, "* comment *", "* include *", "* language *", "* alias *", and "* version *" respectively (without quotes).
Comment blocks are ignored when reading calltips files.
Include blocks specify additional calltips files to load, one per line. The ~ character can be used for your $HOME directory, but other shell shortcuts like * and ? can't be used. Include blocks allow you to make a calltips file for your project that includes, say, the calltips files for C, Motif, and Xt.
Language blocks specify which language mode the calltips should be used with. When a calltip is requested it won't match tips from languages other than the current language mode. Language blocks only affect the tips listed after the block.
Alias blocks allow a calltip to have multiple keys. The first line of the block is the key for the calltip to be displayed, and the rest of the lines are additional keys, one per line, that should also show the calltip.
Version blocks are ignored for the time being.
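To make the block structure concrete, here is a sketch that reads ordinary calltip blocks from a string. The sample content and the function name are made up for illustration. The sketch skips every special block (any block whose first line begins and ends with '*') rather than implementing include, language, or alias handling.

```python
SAMPLE = """\
* comment *
Hypothetical calltips for a small project.

strlen
size_t strlen(const char *s)
Returns the length of s.

memcpy
void *memcpy(void *dest, const void *src, size_t n)
"""

def parse_calltips(text: str) -> dict:
    """Map each key to its calltip text, skipping special blocks."""
    tips = {}
    block = []
    for raw in text.splitlines() + [""]:   # the extra "" flushes the last block
        line = raw.rstrip()                # trailing whitespace is ignored
        if line:
            block.append(line)
            continue
        if block:
            key = block[0]
            special = len(key) > 1 and key.startswith("*") and key.endswith("*")
            if not special:
                tips[key] = "\n".join(block[1:])
            block = []
    return tips

tips = parse_calltips(SAMPLE)
```

With the sample above, requesting a calltip for strlen would show the two lines that follow the key.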
You can use calltips in your own macros using the calltip() and kill_calltip() macro subroutines and the $calltip_ID macro variable. See the Macro Subroutines section for details.
Regular expressions (regex's) are useful as a way to match inexact sequences of characters. They can be used in the `Find...' and `Replace...' search dialogs and are at the core of Color Syntax Highlighting patterns. To specify a regular expression in a search dialog, simply click on the `Regular Expression' radio button in the dialog.
A regex is a specification of a pattern to be matched in the searched text. This pattern consists of a sequence of tokens, each being able to match a single character or a sequence of characters in the text, or assert that a specific position within the text has been reached (the latter is called an anchor.) Tokens (also called atoms) can be modified by adding one of a number of special quantifier tokens immediately after the token. A quantifier token specifies how many times the previous token must be matched (see below.)
Tokens can be grouped together using one of a number of grouping constructs, the most common being plain parentheses. Tokens that are grouped in this way are also collectively considered to be a regex atom, since this new larger atom may also be modified by a quantifier.
A regex can also be organized into a list of alternatives by separating each alternative with pipe characters, `|'. This is called alternation. A match will be attempted for each alternative listed, in the order specified, until a match results or the list of alternatives is exhausted (see Alternation section below.)
If a dot (`.') appears in a regex, it means to match any character exactly once. By default, dot will not match a newline character, but this behavior can be changed (see help topic Parenthetical Constructs, under the heading, Matching Newlines).
A character class, or range, matches exactly one character of text, but the candidates for matching are limited to those specified by the class. Classes come in two flavors as described below:
[...] Regular class, match only characters listed.
[^...] Negated class, match only characters NOT listed.
As with the dot token, by default negated character classes do not match newline, but can be made to do so.
The characters that are considered special within a class specification are different than the rest of regex syntax as follows. If the first character in a class is the `]' character (second character if the first character is `^') it is a literal character and part of the class character set. This also applies if the first or last character is `-'. Outside of these rules, two characters separated by `-' form a character range which includes all the characters between the two characters as well. For example, `[^f-j]' is the same as `[^fghij]' and means to match any character that is not `f', `g', `h', `i', or `j'.
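The range rule can be checked with any regex engine that shares this class syntax. The sketch below uses Python's `re` module as a stand-in for NEdit's engine, which is an assumption for illustration; the two engines differ in other respects, such as how negated classes treat newlines.

```python
import re
import string

negated_range = re.compile(r"[^f-j]")
negated_list = re.compile(r"[^fghij]")

# Lowercase letters that the negated range refuses to match.
rejected = [c for c in string.ascii_lowercase if not negated_range.fullmatch(c)]

# Both spellings should accept and reject exactly the same characters.
candidates = string.ascii_letters + string.digits
agree = all(bool(negated_range.fullmatch(c)) == bool(negated_list.fullmatch(c))
            for c in candidates)

# A `]' placed first in the class is a literal member of the class.
literal_bracket = re.fullmatch(r"[]a]", "]") is not None
```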
Anchors are assertions that you are at a very specific position within the search text. NEdit regular expressions support the following anchor tokens:
^ Beginning of line
$ End of line
< Left word boundary
> Right word boundary
\B Not a word boundary
Note that the \B token ensures that neither the left nor the right character are delimiters, or that both left and right characters are delimiters. The left word anchor checks whether the previous character is a delimiter and the next character is not. The right word anchor works in a similar way.
Quantifiers specify how many times the previous regular expression atom may be matched in the search text. Some quantifiers can produce a large performance penalty, and can in some instances completely lock up NEdit. To prevent this, avoid nested quantifiers, especially those of the maximal matching type (see below.)
The following quantifiers are maximal matching, or "greedy", in that they match as much text as possible.
* Match zero or more
+ Match one or more
? Match zero or one
The following quantifiers are minimal matching, or "lazy", in that they match as little text as possible.
*? Match zero or more
+? Match one or more
?? Match zero or one
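The difference between maximal and minimal matching is easiest to see on text with more than one possible end point. The sketch below uses Python's `re` module, which shares these quantifiers, as a stand-in for NEdit's engine. The sample text is made up.

```python
import re

text = 'say "one" and "two"'

# Greedy: .* runs as far as it can, so the match ends at the last quote.
greedy = re.search(r'".*"', text).group()

# Lazy: .*? stops as early as it can, so the match ends at the next quote.
lazy = re.search(r'".*?"', text).group()
```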
One final quantifier is the counting quantifier, or brace quantifier. It takes the following basic form:
{min,max} Match from `min' to `max' times the previous regular expression atom.
If `min' is omitted, it is assumed to be zero. If `max' is omitted, it is assumed to be infinity. Whether specified or assumed, `min' must be less than or equal to `max'. Note that both `min' and `max' are limited to 65535. If both are omitted, then the construct is the same as `*'. Note that `{,}' and `{}' are both valid brace constructs. A single number appearing without a comma, e.g. `{3}' is short for the `{min,min}' construct, or to match exactly `min' number of times.
The quantifiers `{1}' and `{1,1}' are accepted by the syntax, but are optimized away since they mean to match exactly once, which is redundant information. Also, for efficiency, certain combinations of `min' and `max' are converted to either `*', `+', or `?' as follows:
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
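The behavior of greedy and lazy brace quantifiers can be tried out as follows. Python's `re` module is used here as a stand-in, on the assumption that it treats these particular forms the same way. It does not reproduce NEdit's limits or its error for `{0}'.

```python
import re

# Greedy: takes as many repetitions as allowed (five of the seven a's).
greedy = re.match(r"a{2,5}", "aaaaaaa").group()

# Lazy: settles for the minimum of two when nothing forces more.
lazy = re.match(r"a{2,5}?", "aaaaaaa").group()

# Lazy, but the following b forces four repetitions for an overall match.
forced = re.match(r"a{2,5}?b", "aaaab").group()

# A single number means exactly that many repetitions.
exact = re.fullmatch(r"a{3}", "aaa") is not None
```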
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
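Because alternatives are tried in the order given, the order can change what is matched. A short sketch, again using Python's `re` module as a stand-in for NEdit's engine:

```python
import re

# The first alternative that succeeds wins, even if a later one is longer.
first_wins = re.match(r"a|ab", "ab").group()
longer_first = re.match(r"ab|a", "ab").group()

# An empty last alternative matches the empty string, so the match never fails.
empty_alt = re.match(r"a|b|", "xyz")
```

In the last case the match succeeds but covers no text, which is rarely what a search is meant to find.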
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set, set the EBCDIC_CHARSET compiler symbol when compiling NEdit to get the EBCDIC equivalent of the escape character.
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
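The arithmetic behind these examples can be verified with ordinary integer literals. The sketch below only checks the character codes (ASCII assumed). It does not parse NEdit escapes.

```python
# \052 (octal) and \X2A (hexadecimal) name the same character.
star_octal = chr(0o52)
star_hex = chr(0x2A)

# The largest allowed escape value, written both ways.
upper_limit_octal = 0o377
upper_limit_hex = 0xFF

# \0777 exceeds the limit, so it is read as \077 followed by a literal 7.
split_escape = chr(0o77) + "7"
```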
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d  digits           0-9
\l  letters          a-z, A-Z, and locale dependent letters
\s  whitespace       \t, \r, \v, \f, and space
\w  word characters  letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
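The effect of non-capturing parentheses on group numbering can be seen in a short sketch. Python's `re` module is used as a stand-in for NEdit's engine. The `\1' and `\2' references in the replacement string are Python's substitution syntax in this example.

```python
import re

# Two capturing groups: both numbers are captured.
capturing = re.match(r"(\d+)-(\d+)", "10-20").groups()

# The first group is non-capturing, so the second number becomes group 1.
grouped = re.match(r"(?:\d+)-(\d+)", "10-20").groups()

# Captured text can be reused in a substitution.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
```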
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
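Both look-ahead forms can be tried out with a small sketch. Python's `re` module, which supports the same two constructs, stands in for NEdit's engine, and the sample words are made up.

```python
import re

words = "foo foobar food"

# General expression: every word starting with foo.
general = re.findall(r"foo\w*", words)

# Negative look-ahead filters out the special case before the general match.
filtered = re.findall(r"(?!foobar)foo\w*", words)

# Positive look-ahead checks for bar but does not consume it.
positive = re.search(r"foo(?=bar)", "foobar")
```

In the positive case the match covers only foo, and the next atom would continue matching at the b.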
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
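The rules above can be sketched as a small decoder. This is not NEdit's code: the function name decode_numeric_escape is hypothetical, and where the text allows either an error or literal interpretation of extra characters, the sketch assumes the literal interpretation shown in the \0777 example.

```python
OCTAL_DIGITS = "01234567"
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_numeric_escape(escape):
    """Decode a leading \\0ooo or \\xhh escape (hypothetical helper).

    Returns (character, leftover_text). Raises ValueError for a null
    escape or for text that is not a numeric escape.
    """
    if escape.startswith("\\0"):
        digits, base = OCTAL_DIGITS, 8
    elif escape[:2] in ("\\x", "\\X"):
        digits, base = HEX_DIGITS, 16
    else:
        raise ValueError("not a numeric escape: " + escape)

    value = 0
    pos = 2  # skip the two-character prefix
    while pos < len(escape) and escape[pos] in digits:
        candidate = value * base + int(escape[pos], base)
        if candidate > 0xFF:
            break  # this digit would exceed \0377 / \xFF: leave it as literal text
        value = candidate
        pos += 1

    if value == 0:
        # Covers \00, \x0, and escapes cut short by an invalid digit (\091).
        raise ValueError("null or empty escape: " + escape)
    return chr(value), escape[pos:]
```

With this sketch, \052 and \X2A both decode to `*', \0777 decodes to `?' with a leftover `7', and \091, \00, and \x0 are rejected.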
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d  digits            0-9
\l  letters           a-z, A-Z, and locale dependent letters
\s  whitespace        \t, \r, \v, \f, and space
\w  word characters   letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
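The equivalences can be checked with a short sketch in Python's re module, again only as a stand-in for NEdit. Python has no \l or \L, and its \s also matches newline, so the example sticks to \d, \D, and \w; the sample text is invented.

```python
import re

# Python's re module stands in for NEdit's engine. re.ASCII keeps \d and
# \w limited to ASCII so they line up with the explicit classes below.
sample = "x_1 = 42;\tnext"

digits_short = re.findall(r"\d", sample, re.ASCII)
digits_class = re.findall(r"[0-9]", sample)

non_digits_short = re.findall(r"\D", sample, re.ASCII)
non_digits_class = re.findall(r"[^0-9]", sample)

# A shortcut used inside a character class: word characters plus ';'
word_or_semicolon = re.findall(r"[\w;]+", sample, re.ASCII)
```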
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that NEdit recognizes as a word delimiter character, while the `\Y' token matches any single character that is NOT a word delimiter character. Word delimiter characters are dynamic: the user can change them through preference settings. For this reason, the regular expression engine must handle them differently from ordinary character classes, and consequently `\y' and `\Y' cannot be used within a character class specification.
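Python's re module has no `\y' or `\Y', but the idea of a delimiter set that is only known at run time can be approximated by building a character class from a string. The delimiter string below is invented for the example and is not NEdit's actual default.

```python
import re

# Hypothetical delimiter set for illustration only; in NEdit the real set
# comes from the user's preference settings.
word_delimiters = ".,/\\`'!|@#%^&*()-=+{}[]\":;<>?"

# Build the classes at run time from the current delimiter set.
# re.escape makes every delimiter literal inside the brackets.
delimiter_class = "[" + re.escape(word_delimiters) + "]"        # plays the role of \y
non_delimiter_class = "[^" + re.escape(word_delimiters) + "]"   # plays the role of \Y

sample = "foo(bar)"
delimiters_found = re.findall(delimiter_class, sample)
others_found = re.findall(non_delimiter_class, sample)
```

Changing word_delimiters changes what both classes match, which mirrors how a preference change alters the meaning of `\y' and `\Y'.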
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
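The difference in group numbering can be sketched with Python's re module, which uses the same `(<regex>)' and `(?:<regex>)' forms; it is an assumed stand-in for NEdit's engine, and the sample text is made up.

```python
import re

# Python's re module stands in for NEdit's engine.
sample = "abab42"

capturing = re.match(r"(ab)+(\d+)", sample)
non_capturing = re.match(r"(?:ab)+(\d+)", sample)

# With a capturing group in front, the digits are group 2...
digits_when_capturing = capturing.group(2)
# ...with a non-capturing group in front, they become group 1.
digits_when_non_capturing = non_capturing.group(1)

# Number of capturing groups in each compiled pattern.
group_counts = (capturing.re.groups, non_capturing.re.groups)
```

The non-capturing form still groups `ab' for the `+' quantifier, but it is not counted when numbering groups for backreferences and substitutions.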
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first-character discrimination optimization. You can include a positive look-ahead that contains a character class listing every character that the following (potentially complex) regular expression could possibly start with. This quickly filters out match attempts that cannot possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
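Both kinds of look-ahead can be sketched with Python's re module, which accepts the same `(?=<regex>)' and `(?!<regex>)' forms. It is a stand-in for NEdit's engine, and the sample text and patterns are invented for the example.

```python
import re

# Python's re module stands in for NEdit's engine.
sample = "price: 100 USD, weight: 20 kg, count: 7"

# Positive look-ahead: numbers that are followed by " USD".
# The look-ahead text is tested but not included in the match.
usd_amounts = re.findall(r"\d+(?= USD)", sample)

# Negative look-ahead: the general expression \w+: with the special case
# "weight:" filtered out by a look-ahead placed in front of it.
labels_except_weight = re.findall(r"\b(?!weight:)\w+:", sample)
```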
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{}     {,}     {0,}     *
{1,}                    +
{,1}   {0,1}            ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
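A short sketch of greedy versus lazy brace quantifiers, using Python's re module as a stand-in for NEdit's engine. Only the brace forms that Python is known to accept are used ({0,}, {1,}, {0,1}); the helper name same_matches is hypothetical.

```python
import re


def same_matches(pattern_a, pattern_b, samples):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(pattern_a, s)) == bool(re.fullmatch(pattern_b, s))
        for s in samples
    )


samples = ["", "a", "aaa", "b"]
star_equivalent = same_matches(r"a{0,}", r"a*", samples)
plus_equivalent = same_matches(r"a{1,}", r"a+", samples)
question_equivalent = same_matches(r"a{0,1}", r"a?", samples)

# Lazy versus greedy brace quantifiers.
greedy_alone = re.match(r"a{2,5}", "aaaaa").group()    # takes as many as allowed
lazy_alone = re.match(r"a{2,5}?", "aaaaa").group()     # stops at the minimum
lazy_forced = re.match(r"a{2,5}?b", "aaaab").group()   # grows until 'b' can match
```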
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
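Ordered alternation and the empty alternative can be sketched with Python's re module, which also tries alternatives from left to right. It is a stand-in for NEdit's engine, and the sample strings are made up.

```python
import re

# Python's re module stands in for NEdit's engine.
alternation = re.compile(r"a|be|sea")
matched_words = [m.group() for m in alternation.finditer("sea be a")]

# Alternatives are tried in order, so the first one that works wins
# even when a later alternative would match more text.
first_wins = re.match(r"a|ab", "ab").group()

# An empty alternative matches the empty string, so the expression
# succeeds even on text that contains neither 'a' nor 'b'.
empty_match = re.search(r"a|b|", "xyz")
```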
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
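The ordering rule and the empty alternative can be sketched as below, using Python's re module as a stand-in for NEdit's engine (the sample strings are assumptions):

```python
import re

# Python's re module stands in for NEdit's engine; sample strings are
# illustrative assumptions.

# Alternatives are tried left to right, so the first one that succeeds
# wins even when a later one would match more text.
first = re.match(r"be|bee", "bee").group()
second = re.match(r"bee|be", "bee").group()

# A trailing empty alternative matches the empty string, so the
# expression as a whole always matches.
empty = re.match(r"a|b|", "xyz")

print(first, second, repr(empty.group()))
```

When one alternative is a prefix of another, listing the longer one first is usually what is wanted.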
Comments are of the form `(?#<comment text>)'. They can be inserted anywhere and have no effect on the execution of the regular expression, which makes them handy for documenting very complex regular expressions. A comment begins with `(?#' and ends at the first closing parenthesis or at the end of the regular expression, whichever comes first. Comments do not recognize any escape sequences.
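A small sketch, using Python's re module as a stand-in for NEdit's engine (it accepts the same `(?#...)' notation), shows that a commented expression matches exactly what the plain one does. The pattern and sample text are assumptions.

```python
import re

# Python's re module stands in for NEdit's engine; the pattern and
# sample text are illustrative assumptions.
sample = "released 2004-08, patched 2010-02"

plain = r"\d{4}-\d{2}"
commented = r"\d{4}(?#year)-\d{2}(?#month)"

plain_hits = re.findall(plain, sample)
commented_hits = re.findall(commented, sample)
print(plain_hits)
print(commented_hits)
```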
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
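The following sketch shows the difference an escape makes. It uses Python's re module as a stand-in for NEdit's engine and only the escapes `\.', `\*', and `\\', which both share; the sample strings are assumptions.

```python
import re

# Python's re module stands in for NEdit's engine; sample strings are
# illustrative assumptions.
text = "axb* a.b*"

# Unescaped '.' is a metacharacter and matches any character.
loose = re.search(r"a.b", text)

# Escaped '.' and '*' match only the literal characters.
literal = re.search(r"a\.b\*", text)

# Two backslashes in the regex match one literal backslash in the text.
backslash = re.search(r"\\", "C:\\temp")

print(loose.group(), literal.group(), backslash.start())
```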
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set, set the EBCDIC_CHARSET compiler symbol when compiling NEdit to get the EBCDIC equivalent escape character.
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
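The arithmetic behind these examples can be checked with a short sketch. The function below is a hypothetical illustration of the truncation rule as described above, not NEdit's actual code; it takes the text that follows the leading `\0' and assumes the "interpreted literally" behavior for digits that would push the value past \0377.

```python
def decode_octal_escape(body):
    """Decode the digits that follow a leading backslash-zero.

    Hypothetical helper: returns (character, leftover text). Digits are
    consumed while they are valid octal and the value stays <= 0o377.
    """
    value = 0
    used = 0
    for ch in body:
        if ch not in "01234567":
            break                      # invalid digit ends the escape
        candidate = value * 8 + int(ch)
        if candidate > 0o377:
            break                      # would exceed \0377: stop here
        value = candidate
        used += 1
    if value == 0:
        raise ValueError("escape for null is not valid")
    return chr(value), body[used:]


print(decode_octal_escape("52"))    # \052
print(decode_octal_escape("777"))   # \0777
print(int("52", 8), 0x2A, ord("*"))
```

Calling decode_octal_escape("91"), which corresponds to \091, raises the error because no valid digit precedes the `9'.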
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d  digits           0-9
\l  letters          a-z, A-Z, and locale dependent letters
\s  whitespace       \t, \r, \v, \f, and space
\w  word characters  letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
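The equivalences can be sketched with Python's re module as a stand-in for NEdit's engine. Python has no `\l', so only `\d' and `\D' are shown, and the re.ASCII flag is an assumption made to restrict `\d' to 0-9 as described here.

```python
import re

# Python's re module stands in for NEdit's engine; re.ASCII limits \d to
# 0-9. The sample text is an illustrative assumption.
sample = "ab_12 cd"

digits_short = re.findall(r"\d", sample, re.ASCII)
digits_class = re.findall(r"[0-9]", sample)

others_short = re.findall(r"\D", sample, re.ASCII)
others_class = re.findall(r"[^0-9]", sample)

# A shortcut used inside a character class, alongside a literal '_'.
mixed = re.findall(r"[\d_]+", sample, re.ASCII)

print(digits_short, mixed)
```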
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
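To see why a changeable delimiter set needs special handling, the sketch below emulates `\y' and `\Y' in Python by rebuilding a character class from the current delimiter string each time. The delimiter string and the helper name are hypothetical; this is not how NEdit implements the tokens.

```python
import re

# Hypothetical delimiter setting; in NEdit the real set comes from the
# user's preferences and may change at any time.
delimiters = ".,;:()-"


def delimiter_class(delims, negate=False):
    """Build a character class that plays the role of \\y (or \\Y)."""
    return "[" + ("^" if negate else "") + re.escape(delims) + "]"


is_delim = delimiter_class(delimiters)
not_delim = delimiter_class(delimiters, negate=True)

print(re.findall(is_delim, "a-b(c)"))
print(re.findall(not_delim, "a-b"))
```

Because the class has to be rebuilt whenever the setting changes, it cannot be treated as a fixed set the way `\d' or `\w' can.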
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
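A brief sketch of captured text being reused, with Python's re module as a stand-in for NEdit's engine. The replacement-string syntax shown is Python's, and the sample strings are assumptions.

```python
import re

# Python's re module stands in for NEdit's engine; the replacement
# syntax is Python's and the sample strings are illustrative assumptions.

# Backreference: \1 must match the same text that group 1 captured,
# so this finds the first doubled character.
doubled = re.search(r"(\w)\1", "bookkeeper").group()

# Substitution: the two captured words are swapped in the replacement.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

print(doubled, swapped)
```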
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
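The difference in group numbering can be seen in a short sketch. Python's `re` module is used here as a stand-in because it accepts the same `(...)' and `(?:...)' forms; the patterns and sample text are made up for illustration.

```python
import re

# Both patterns match the same text; only the group bookkeeping differs.
capturing = re.compile(r"(\d+)-(\d+)")
mixed = re.compile(r"(?:\d+)-(\d+)")

sample = "pages 10-25"
captured_both = capturing.search(sample).groups()
captured_one = mixed.search(sample).groups()
```

In the second pattern the non-capturing group is skipped when counting, so the page number after the dash becomes group 1 rather than group 2.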
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class listing every character that the following (potentially complex) regular expression could possibly start with. This quickly filters out match attempts that cannot possibly succeed.
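A minimal sketch of the first character guard follows, using Python's `re` module as a stand-in engine with the same `(?=...)' syntax. The number pattern is a made-up example, and no claim is made about how much time the guard saves in any particular engine; the sketch only shows that the guard does not change what is matched.

```python
import re

# A made-up "general" expression: an optionally signed decimal number.
number = r"[-+]?(?:\d+\.?\d*|\.\d+)"

# The look-ahead lists every character the number could start with.
guarded = re.compile(r"(?=[-+.\d])" + number)
plain = re.compile(number)

text = "x = -3.5, y = .25, z = 7"
guarded_hits = guarded.findall(text)
plain_hits = plain.findall(text)

# A look-ahead on its own consumes nothing: the match ends where it began.
zero_width_end = re.match(r"(?=foo)", "foobar").end()
```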
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
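For instance, a general identifier pattern can be preceded by a negative look-ahead that filters out a few reserved words. The sketch uses Python's `re` module as a stand-in engine; the keyword list and the `\b' word boundary are choices made for this example, not taken from NEdit.

```python
import re

# General case: any identifier. Special cases to exclude: three keywords.
identifier = re.compile(r"\b(?!(?:if|else|while)\b)[A-Za-z_]\w*")

names = identifier.findall("if count else total while iffy")
```

The word `iffy' is kept because the look-ahead only rejects `if' when it is a complete word.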
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{}    {,}    {0,}     *
{1,}                  +
{,1}  {0,1}           ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
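The equivalences can be spot-checked with a short sketch. Python's `re` module is used as a stand-in, so only the brace forms that Python also accepts are tested; the `{}' and `{,}' forms and the error for `{0}' are NEdit behavior that this sketch does not reproduce. The sample strings and the helper name `same_matches` are made up.

```python
import re

# Brace form paired with the shorthand quantifier it should behave like.
EQUIVALENTS = [("a{0,}", "a*"), ("a{1,}", "a+"), ("a{0,1}", "a?")]
SAMPLES = ["", "a", "aa", "aaab", "baa"]


def same_matches(brace, short):
    # Compare the spans of every match on every sample string.
    return all(
        [m.span() for m in re.finditer(brace, s)]
        == [m.span() for m in re.finditer(short, s)]
        for s in SAMPLES
    )


results = [same_matches(brace, short) for brace, short in EQUIVALENTS]
```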
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
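The behavior described above can be observed directly. The sketch uses Python's `re` module as a stand-in engine, which accepts the same `{min,max}?' form; the sample strings are made up.

```python
import re

# Lazy: stops at the minimum of 2 when nothing forces it further.
lazy = re.match(r"a{2,5}?", "aaaaaa").group()

# Greedy, for comparison: takes the maximum of 5.
greedy = re.match(r"a{2,5}", "aaaaaa").group()

# Lazy, but the trailing `b' forces it to grow to 4 for an overall match.
forced = re.match(r"a{2,5}?b", "aaaab").group()
```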
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
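A short sketch of these three points, using Python's `re` module as a stand-in engine with the same `|' syntax; the sample strings are made up.

```python
import re

# Each alternative is tried at every position in the text.
found = re.findall(r"a|be|sea", "a sea to be")

# Alternatives are attempted in order, so the first one that fits wins.
first_wins = re.match(r"a|ab", "ab").group()

# The empty alternative matches the empty string, so a match always exists.
always = re.match(r"a|b|", "xyz")
```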
Comments are of the form `(?#<comment text>)'. They can be inserted anywhere and have no effect on the execution of the regular expression, which makes them handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and always ends at the first closing parenthesis or at the end of the regular expression, whichever comes first. Comments do not recognize any escape sequences.
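As a small illustration, the two patterns below differ only in their comments and accept the same text. Python's `re` module is used as a stand-in engine that supports the same `(?#...)' form; the pattern itself is a made-up example, and Python's handling of a backslash inside a comment may differ from NEdit's, so no escapes are used here.

```python
import re

# Commented and uncommented versions of the same expression.
commented = re.compile(r"\d{3}(?#exchange)-\d{4}(?#line number)")
bare = re.compile(r"\d{3}-\d{4}")

with_comments = commented.fullmatch("555-1234")
without_comments = bare.fullmatch("555-1234")
too_long = commented.fullmatch("555-12345")
```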
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
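The effect of escaping can be shown with a short sketch. Python's `re` module is used as a stand-in engine; it treats `\.', `\*', and `\\' the same way, although its full set of metacharacters is not identical to NEdit's. The sample strings are made up.

```python
import re

# Escaped: the dot and star are ordinary characters.
literal = re.fullmatch(r"a\.b\*", "a.b*")

# Unescaped: the dot matches any character and the star quantifies `b'.
unescaped_full = re.fullmatch(r"a.b*", "a.b*")
unescaped_other = re.fullmatch(r"a.b*", "axbbb")

# Two backslashes in the expression match one literal backslash.
backslash = re.fullmatch(r"\\", "\\")
```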
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
(***) For environments that use the EBCDIC character set,
set the EBCDIC_CHARSET compiler symbol when compiling
NEdit to get the EBCDIC equivalent escape character.
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
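The arithmetic behind these examples can be checked with a few lines of Python. The sketch only converts the numeric values and assumes the ASCII character set; it does not exercise NEdit's escape parser.

```python
# Octal 052 and hexadecimal 2A are both decimal 42, the ASCII code for '*'.
star_from_octal = chr(0o52)
star_from_hex = chr(0x2A)

# Octal 377 and hexadecimal FF are both 255, the largest value an escape may have.
max_octal = 0o377
max_hex = 0xFF

# Octal 777 is 511, which exceeds 255, so only the first two digits are used:
# octal 077 is decimal 63, the ASCII code for '?'.
too_large = 0o777
question_mark = chr(0o77)

print(star_from_octal, star_from_hex, max_octal, too_large, question_mark)
```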
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d  digits            0-9
\l  letters           a-z, A-Z, and locale dependent letters
\s  whitespace        \t, \r, \v, \f, and space
\w  word characters   letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
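The stated equivalences can be sketched with Python's `re' module. Two assumptions apply: Python has no `\l' shortcut, so only `\d' and `\D' are shown, and the `re.ASCII' flag is used so that `\d' is limited to 0-9 as described above. The sample string is made up.

```python
import re

sample = "x9_ Z"

# \d and its negation \D, next to the explicit classes they stand for.
digits = re.findall(r"\d", sample, re.ASCII)
digits_class = re.findall(r"[0-9]", sample)
non_digits = re.findall(r"\D", sample, re.ASCII)
non_digits_class = re.findall(r"[^0-9]", sample)

# A shortcut used inside a character class, combined with a literal '_'.
digit_or_underscore = re.findall(r"[\d_]", sample, re.ASCII)

print(digits, non_digits, digit_or_underscore)
```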
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence, `\y' and `\Y' cannot be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
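How non-capturing parentheses affect the numbering of captured groups can be sketched with Python's `re' module, which uses the same `(?:<regex>)' syntax. The pattern and input are invented for the example.

```python
import re

text = "10-20"

# Two capturing pairs: both numbers are captured, as groups 1 and 2.
capturing = re.match(r"(\d+)-(\d+)", text)

# The first pair is non-capturing, so it is not counted:
# the second number becomes group 1.
mixed = re.match(r"(?:\d+)-(\d+)", text)

print(capturing.groups(), mixed.groups())
```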
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class listing every character that the following (potentially complex) regular expression could possibly start with. This quickly filters out match attempts that cannot possibly succeed.
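Both points can be sketched with Python's `re' module, which accepts the same `(?=<regex>)' syntax. The patterns and sample text are made up, and the sketch only shows that the look-ahead consumes no text and leaves the set of matches unchanged; it makes no claim about speed in either engine.

```python
import re

# Zero width: 'bar' must follow, but it is not part of the match.
m = re.search(r"foo(?=bar)", "foobar")
matched_text = m.group()
match_end = m.end()

# First character discrimination: the look-ahead lists every character
# the signed-integer expression could start with.
text = "a -12 b +7 c 3"
with_filter = re.findall(r"(?=[-+0-9])[-+]?\d+", text)
without_filter = re.findall(r"[-+]?\d+", text)

print(matched_text, match_end, with_filter)
```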
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that is one of the characters that NEdit recognizes as a word delimiter character, while the `\Y' token matches any character that is NOT a word delimiter character. Word delimiter characters are dynamic in nature, meaning that the user can change them through preference settings. For this reason, they must be handled differently by the regular expression engine. As a consequence of this, `\y' and `\Y' can not be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below.) Capturing parentheses carry a fairly high overhead both in terms of memory used and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)' and facilitate grouping only and do not incur the overhead of normal capturing parentheses. They should not be counted when determining numbers for capturing parentheses which are used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first character discrimination optimization. You can include a positive look-ahead that contains a character class which lists every character that the following (potentially complex) regular expression could possibly start with. This will quickly filter out match attempts that can not possibly succeed.
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of these characters can be constructed as a sort of metacharacter or sequence by preceding a literal character with a backslash. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set,
when compiling NEdit set the EBCDIC_CHARSET compiler
symbol to get the EBCDIC equivalent escape
character.)
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d digits 0-9 \l letters a-z, A-Z, and locale dependent letters \s whitespace \t, \r, \v, \f, and space \w word characters letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
Although not strictly a character class, the following escape sequences behave similarly to character classes:
{} {,} {0,} *
{1,} +
{,1} {0,1} ?
Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.
Brace quantifiers can also be "lazy". For example {2,5}? would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.
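The greedy and lazy forms can be compared outside NEdit. The following is a minimal sketch using Python's re module, chosen only for illustration on the assumption that it treats `{2,5}' and `{2,5}?' as described above; it is not NEdit's own regex engine, and the sample strings are made up.

```python
import re

text = "aaaaaa"

# Greedy: takes as many repetitions as the upper bound allows (5).
greedy = re.match(r"a{2,5}", text).group(0)

# Lazy: settles for the minimum (2) when nothing after it demands more.
lazy = re.match(r"a{2,5}?", text).group(0)

# Lazy, but the `b' that follows forces the quantifier to expand to 4
# repetitions before the overall match can succeed.
forced = re.match(r"a{2,5}?b", "aaaab").group(0)

print(greedy, lazy, forced)
```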
A series of alternative patterns to match can be specified by separating them with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.
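Because the alternatives are attempted in order, their order can change what is matched. This is a small sketch using Python's re module as a stand-in, on the assumption that it also tries alternatives from left to right; the patterns and inputs are invented for the example.

```python
import re

# The first alternative that allows a match wins, even if a later
# alternative would have matched more text.
short_first = re.match(r"a|ab", "ab").group(0)
long_first = re.match(r"ab|a", "ab").group(0)

# A trailing empty alternative matches the empty string, so the match
# succeeds even where neither `a' nor `b' is present.
empty = re.match(r"a|b|", "xyz")

print(repr(short_first), repr(long_first), repr(empty.group(0)))
```

When one alternative is a prefix of another, listing the longer one first is usually what is wanted.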
Comments are of the form `(?#<comment text>)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first closing parenthesis, or at the end of the regular expression if no closing parenthesis follows. Comments do not recognize any escape sequences, so a closing parenthesis cannot be escaped inside a comment.
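As an illustration, a commented expression should match exactly what its uncommented form matches. The sketch below uses Python's re module, which happens to accept the same `(?#...)' syntax; the phone-number pattern and the sample text are hypothetical.

```python
import re

text = "call 555-1234 today"

plain = re.compile(r"\d{3}-\d{4}")

# Same expression with inline comments; they do not affect matching.
commented = re.compile(r"\d{3}(?#exchange)-\d{4}(?#line number)")

plain_hit = plain.search(text).group(0)
commented_hit = commented.search(text).group(0)

print(plain_hit, commented_hit)
```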
In a regular expression (regex), most ordinary characters match themselves. For example, `ab%' would match anywhere `a' followed by `b' followed by `%' appeared in the text. Other characters don't match themselves, but are metacharacters. For example, backslash is a special metacharacter which 'escapes' or changes the meaning of the character following it. Thus, to match a literal backslash would require a regular expression to have two backslashes in sequence. NEdit provides the following escape sequences so that metacharacters that are used by the regex syntax can be specified as ordinary characters.
\( \) \- \[ \] \< \> \{ \}
\. \| \^ \$ \* \+ \? \& \\
There are some special characters that are difficult or impossible to type. Many of them can be written as an escape sequence, that is, a backslash followed by a literal character. NEdit recognizes the following special character sequences:
\a alert (bell)
\b backspace
\e ASCII escape character (***)
\f form feed (new page)
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
*** For environments that use the EBCDIC character set, set the EBCDIC_CHARSET compiler symbol when compiling NEdit to get the EBCDIC equivalent escape character.
Any ASCII (or EBCDIC) character, except null, can be specified by using either an octal escape or a hexadecimal escape, each beginning with \0 or \x (or \X), respectively. For example, \052 and \X2A both specify the `*' character. Escapes for null (\00 or \x0) are not valid and will generate an error message. Also, any escape that exceeds \0377 or \xFF will either cause an error or have any additional character(s) interpreted literally. For example, \0777 will be interpreted as \077 (a `?' character) followed by `7' since \0777 is greater than \0377.
An invalid digit will also end an octal or hexadecimal escape. For example, \091 will cause an error since `9' is not within an octal escape's range of allowable digits (0-7) and truncation before the `9' yields \0 which is invalid.
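The arithmetic behind these examples can be checked with a few lines of Python. This sketch only converts the digit strings to character codes; it does not exercise NEdit's parser, and the variable names are illustrative.

```python
# Octal 052 and hexadecimal 2A name the same character code (42).
star_from_octal = chr(int("052", 8))
star_from_hex = chr(int("2A", 16))

# The largest permitted escapes, \0377 and \xFF, are the same code.
max_octal = int("377", 8)
max_hex = int("FF", 16)

# \0777 exceeds that limit, so only \077 is taken as the escape.
truncated = chr(int("077", 8))

print(star_from_octal, star_from_hex, max_octal, max_hex, truncated)
```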
NEdit defines some escape sequences that are handy shortcuts for commonly used character classes.
\d  digits           0-9
\l  letters          a-z, A-Z, and locale dependent letters
\s  whitespace       \t, \r, \v, \f, and space
\w  word characters  letters, digits, and underscore, `_'
\D, \L, \S, and \W are the same as the lowercase versions except that the resulting character class is negated. For example, \d is equivalent to `[0-9]', while \D is equivalent to `[^0-9]'.
These escape sequences can also be used within a character class. For example, `[\l_]' is the same as `[a-zA-Z_]', extended with possible locale dependent letters. The escape sequences for special characters, and octal and hexadecimal escapes are also valid within a class.
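The equivalences above can be tried with Python's re module, used here only as an approximation: it supports `\d', `\D' and `\w', but has no counterpart for NEdit's `\l', `\L', `\y' or `\Y'. The re.ASCII flag is an assumption made so that `\d' is restricted to 0-9, and the sample text is invented.

```python
import re

sample = "x1 = y_2 + 30"

# \d behaves like [0-9], and \D like [^0-9].
digits_short = re.findall(r"\d", sample, re.ASCII)
digits_class = re.findall(r"[0-9]", sample)
non_digits_short = re.findall(r"\D", sample, re.ASCII)
non_digits_class = re.findall(r"[^0-9]", sample)

# A shortcut can appear inside a class: word characters plus `+'.
mixed = re.findall(r"[\w+]+", sample, re.ASCII)

print(digits_short, mixed)
```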
Although not strictly a character class, the following escape sequences behave similarly to character classes:
\y Word delimiter character
\Y Not a word delimiter character
The `\y' token matches any single character that NEdit recognizes as a word delimiter character, while the `\Y' token matches any single character that is NOT a word delimiter character. Word delimiter characters are dynamic: the user can change them through preference settings, so the regular expression engine must handle them differently from fixed character classes. As a consequence, `\y' and `\Y' cannot be used within a character class specification.
Capturing Parentheses are of the form `(<regex>)' and can be used to group arbitrarily complex regular expressions. Parentheses can be nested, but the total number of parentheses, nested or otherwise, is limited to 50 pairs. The text that is matched by the regular expression between a matched set of parentheses is captured and available for text substitutions and backreferences (see below). Capturing parentheses carry a fairly high overhead in both memory use and execution speed, especially if quantified by `*' or `+'.
Non-Capturing Parentheses are of the form `(?:<regex>)'. They provide grouping only and do not incur the overhead of normal capturing parentheses. They are not counted when numbering the capturing parentheses used with backreferences and substitutions. Because of the limit on the number of capturing parentheses allowed in a regex, it is advisable to use non-capturing parentheses when possible.
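The effect on group numbering can be seen in a short sketch. Python's re module is used purely for illustration, on the assumption that it numbers capturing groups in the same way; the date string is a made-up example.

```python
import re

date = "2024-05-17"

# Three capturing groups, numbered 1 to 3 by opening parenthesis.
capturing = re.match(r"(\d+)-(\d+)-(\d+)", date)

# The middle group is non-capturing, so it is skipped when numbering:
# the day is now group 2 rather than group 3.
mixed = re.match(r"(\d+)-(?:\d+)-(\d+)", date)

print(capturing.groups(), mixed.groups())
```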
Positive look-ahead constructs are of the form `(?=<regex>)' and implement a zero width assertion of the enclosed regular expression. In other words, a match of the regular expression contained in the positive look-ahead construct is attempted. If it succeeds, control is passed to the next regular expression atom, but the text that was consumed by the positive look-ahead is first unmatched (backtracked) to the place in the text where the positive look-ahead was first encountered.
One application of positive look-ahead is the manual implementation of a first-character discrimination optimization. You can include a positive look-ahead that contains a character class listing every character that the following (potentially complex) regular expression could possibly start with. This quickly filters out match attempts that cannot possibly succeed.
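A sketch of the idea follows, written with Python's re module for illustration only. The "complex" expression is hypothetical, and the example shows only that the guard does not change the results; any speed benefit depends on the engine and is not measured here.

```python
import re

# Hypothetical expression: a hex literal, a decimal number, or a quoted
# string. Every alternative starts with a digit or a double quote.
body = r'(?:0x[0-9a-fA-F]+|\d+(?:\.\d+)?|"[^"]*")'

plain = re.compile(body)

# The look-ahead consumes no text; it only rejects positions whose next
# character cannot begin any of the alternatives.
guarded = re.compile(r'(?=[0-9"])' + body)

text = 'size = 0x1F, ratio = 2.5, name = "abc"'
plain_hits = plain.findall(text)
guarded_hits = guarded.findall(text)

print(guarded_hits)
```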
Negative look-ahead takes the form `(?!<regex>)' and is exactly the same as positive look-ahead except that the enclosed regular expression must NOT match. This can be particularly useful when you have an expression that is general, and you want to exclude some special cases. Simply precede the general expression with a negative look-ahead that covers the special cases that need to be filtered out.
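For instance, a general expression for C source file names can be made to skip names with a particular prefix. This is a minimal sketch in Python's re module, assumed here to handle `(?!...)' as described above; the file names and the `test_' prefix are invented for the example.

```python
import re

names = ["main.c", "test_main.c", "util.c", "test_util.c"]

# General expression: a word followed by `.c'.
# Special case to exclude: names that begin with `test_'.
pattern = re.compile(r"(?!test_)\w+\.c")

kept = [name for name in names if pattern.fullmatch(name)]

print(kept)
```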
Positive look-behind constructs are of the form `(?<=<regex>)' and implement a