<html><head><title>VCP::Filter::stringedit - alter any field character by character</title></head><body><h1><a name="NAME">NAME
</a></h1><p>VCP::Filter::stringedit - alter any field character by character
<p><hr><h1><a name="SYNOPSIS">SYNOPSIS
</a></h1><pre> StringEdit:
## Convert illegal p4 characters to ^NN hex escapes and the
## p4 wildcard "..." to a safe string. The "^" is not an illegal
## char, it's replaced with an escape to allow us to use it as
## an escape character without the (extremely small) risk of
## running across a file name that actually uses it.
## Order is significant in this ruleset.
# field(s) match replacement
name,labels /([\s@#*%^])/ ^%02x
name,labels "..." ^___
</pre><pre> StringEdit:
## underscorify each unwanted character to a single "_"
name,labels /[\s@#*%^]/ _
</pre><pre> StringEdit:
## underscorify each run of unwanted characters to a single "_"
name,labels /[\s@#*%^]+/ _
</pre><pre> StringEdit:
## prefix labels that don't start with a letter or underscore:
labels /^([^a-zA-Z_])/ _%c
</pre><p><hr><h1><a name="DESCRIPTION">DESCRIPTION
</a></h1><p>Allows field by field string editing, using Perl regular expressions
to match characters and substrings and sprintf-like replacement
strings.
<h2><a name="Rules">Rules
</a></h2><p>A rule is a triplet of expressions specifying (1) a set of fields to match,
(2) a pattern to match against those fields' contents (matching contents
are removed), and (3) a string that replaces each removed portion.
<p>NOTE 1: the "match" expression uses perl5 regular expressions, not
the filename wildcards used in most other places in VCP configurations.
<p>The list of rules is evaluated top down and all rules are applied to
each string.
<p>NOTE 2: The all-rules-apply nature of this filter is different from the
behaviors of the ...Map: filters, which stop after the first matching
rule. This is because ...Map: filters are rewriting entire strings and
there can be only one result string, while the StringEdit filter may be
rewriting pieces of string and multiple rewrites may be combined to good
effect.
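<p>The all-rules-apply behavior can be sketched in a few lines of Python.
This is only an illustration of the semantics described above, not VCP's
implementation: the names RULES and apply_rules are hypothetical, each field
value is assumed to be a single string, and Python's re module stands in for
perl5 regular expressions.

```python
import re

# Hypothetical rule list mirroring the first SYNOPSIS ruleset:
# (fields, compiled pattern, replacement callback).
RULES = [
    (("name", "labels"), re.compile(r"[\s@#*%^]"),
     lambda m: "^%02x" % ord(m.group(0))),
    (("name", "labels"), re.compile(re.escape("...")),
     lambda m: "^___"),
]


def apply_rules(field, value, rules=RULES):
    """Apply every rule, top down, to one field value."""
    for fields, pattern, replacement in rules:
        if field in fields:
            # Each rule sees the output of the rules before it.
            value = pattern.sub(replacement, value)
    return value


print(apply_rules("name", "my file@1/..."))  # prints my^20file^401/^___
```

<p>Reversing the two rules in this sketch gives "my^20file^401/^5e___"
instead, because the "^" introduced by the "..." rule is then escaped by the
character rule. That is why order is significant in such a ruleset.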
<h2><a name="The_Fields_List">The Fields List
</a></h2><p>A comma separated list of field names. Any field may be edited except
those that begin with "source_".
<h2><a name="The_Match_Expression">The Match Expression
</a></h2><p>For each listed field, the match expression is run against the field's
value, and every portion of the string that matches is replaced.
<p>The match expression is a full perl5 regular expression enclosed in
/.../ delimiters or a plain string, either of which may be enclosed in
'' or "" delimiters if inline spaces are needed (rare, we hope).
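<p>One way to picture how the two forms of match expression differ is the
Python sketch below. The function name compile_match is hypothetical, and
treating a plain string as a literal (non-wildcard) match is an assumption
based on the "..." rule in the SYNOPSIS. It is not how VCP parses rules
internally.

```python
import re


def compile_match(expr):
    """Turn a match expression into a compiled pattern (illustrative only)."""
    # Optional '' or "" delimiters are stripped first.
    if len(expr) >= 2 and expr[0] == expr[-1] and expr[0] in "'\"":
        expr = expr[1:-1]
    # /.../ marks a regular expression; anything else is a plain string.
    if len(expr) >= 2 and expr.startswith("/") and expr.endswith("/"):
        return re.compile(expr[1:-1])
    # Assumption: a plain string matches itself literally.
    return re.compile(re.escape(expr))


print(compile_match(r"/[\s@#*%^]/").sub("_", "a b@c"))  # prints a_b_c
print(compile_match('"..."').sub("^___", "a/..."))      # prints a/^___
```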
<h2><a name="The_Replacement_Expression">The Replacement Expression
</a></h2><p>Each match is replaced by one instance of the replacement expression,
optionally enclosed in single or double quotation marks.
<p>The replacement expression provides a limited list of C sprintf style
macros:
<pre> %d The decimal codes for each character in the match
%o The octal codes for each character in the match
%x The hex codes for each character in the match
</pre><p>Any non-letter preceded by a backslash "\" character is replaced by
itself. Some more or less useful examples:
<pre> \% \\ \" \' \` \{ \} \$ \* \+ \? \1
</pre><p>If a punctuation character other than a period (.) or slash "/" follows
a letter macro, it must be escaped using the backslash character (this
is to reserve room in the spec for postfix modifiers like "*", "+", and
"?"). So, to put a literal star (*) after a hex code, you would do
something like "%02x\*".
<p>The "normal" perl5 letter abbreviations are also allowed:
<pre> \t tab (HT, TAB)
\n newline (NL)
\r return (CR)
\f form feed (FF)
\b backspace (BS)
\a alarm (bell) (BEL)
\e escape (ESC)
\033 octal char (ESC)
\x1b hex char (ESC)
\x{263a} wide hex char (SMILEY)
\c[ control char (ESC)
\N{name} named Unicode character
</pre><p>The following escape sequences, which modify the text that follows
them, are also available:
<pre> \l lowercase next char
\u uppercase next char
\L lowercase till \E
\U uppercase till \E
\E end case modification
\Q quote non-word characters till \E
</pre><p>As shown above, normal sprintf-style options may be included (and are
recommended), so %02x produces results like "09" (if the match was a
single TAB character) or "20" (if the match was a SPACE character).
The dot precision modifiers (".3") are not supported, just the leading 0
and the field width specifier.
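<p>The macro expansion can be approximated in Python as follows. This is a
rough sketch under stated assumptions: the names MACRO and expand are
hypothetical, backslash escapes are ignored, and a multi-character match is
assumed to yield one formatted code per character, concatenated.

```python
import re

# The documented macros: optional leading 0 and field width, then d, o or x.
MACRO = re.compile(r"%(\d*)([dox])")


def expand(replacement, matched):
    """Expand %d/%o/%x macros in a replacement for one matched substring."""
    def one_macro(m):
        spec = "%" + m.group(1) + m.group(2)
        # Assumption: one formatted code per matched character, concatenated.
        return "".join(spec % ord(ch) for ch in matched)
    return MACRO.sub(one_macro, replacement)


print(expand("^%02x", "\t"))  # prints ^09
print(expand("^%02x", " "))   # prints ^20
print(expand("%d", "A"))      # prints 65
```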
<h2><a name="Case_sensitivity">Case sensitivity
</a></h2><p>By default, all patterns are case sensitive. There is no way to
override this at present; one will be added.
<h2><a name="Command_Line_Parsing">Command Line Parsing
</a></h2><p>For large stringedits or repeated use, the stringedit is best specified
in a .vcp file. For quick one-offs or scripted situations, however, the
stringedit: scheme may be used on the command line. In this case, each
parameter is a "word" and every three consecutive words form one
( fields, match, replacement ) rule.
<p>Because <a href="vcp.html">vcp</a> command line parsing is performed incrementally and
the next filter or destination specifications can look exactly like a
pattern or result, the special token "--" is used to terminate the list
of patterns if StringEdit: is used on the command line. This may also
be the last word in the <code>StringEdit:</code> section of a .vcp file, but that
is superfluous. It is an error to use "--" before the last word in a
.vcp file.
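<p>The grouping of command-line words can be sketched in Python as shown
below. The function split_rules and the placeholder word NEXT_SPEC are
hypothetical; vcp's real parser is incremental and is not reproduced here.

```python
def split_rules(words):
    """Group words into (fields, match, replacement) triples.

    Returns the triples and the words left over after the "--" terminator.
    """
    if "--" in words:
        end = words.index("--")
        rule_words, rest = words[:end], words[end + 1:]
    else:
        rule_words, rest = list(words), []
    if len(rule_words) % 3 != 0:
        raise ValueError("rules must be given as whole triples")
    triples = [tuple(rule_words[i:i + 3])
               for i in range(0, len(rule_words), 3)]
    return triples, rest


# NEXT_SPEC stands in for whatever filter or destination spec comes next.
words = ["name,labels", r"/[\s@#*%^]/", "_", "--", "NEXT_SPEC"]
triples, rest = split_rules(words)
print(triples)
print(rest)
```

<p>Without the "--" terminator, the sketch has no way to tell that NEXT_SPEC
is not the start of another rule, which is the ambiguity described above.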
<p><hr><h1><a name="LIMITATIONS">LIMITATIONS
</a></h1><p>There is no way (yet) of telling the stringeditor to continue processing the
rules list. We could implement labels like <code>&lt;&lt;<i>label</i>&gt;&gt;</code> to be
allowed before pattern expressions (but not between pattern and result),
and we could then implement <code>&lt;&lt;goto <i>label</i>&gt;&gt;</code>. And a
<code>&lt;&lt;next&gt;&gt;</code> could be used to fall through to the next label. All of which is
wonderful, but I want to gain some real world experience with the
current system and find a use case for gotos and fallthroughs before I
implement them. This comment is here to solicit feedback :).
<p><hr><h1><a name="AUTHOR">AUTHOR
</a></h1><p>Barrie Slaymaker &lt;barries@slaysys.com&gt;
<p><hr><h1><a name="COPYRIGHT">COPYRIGHT
</a></h1><p>Copyright (c) 2000, 2001, 2002 Perforce Software, Inc.
All rights reserved.
<p>See <a>VCP::License</a> (<code>vcp help license</code>) for the terms of use.
<p><hr><i><font size="-1">Last updated: Fri Jun 4 14:21:32 2004</font></i></body></html>