author Andreas Schwab <[email protected]> 2010-06-10 00:08:50 +0200
committer Andreas Schwab <[email protected]> 2010-06-10 00:08:50 +0200
commit 639b2760f19231881f753c8f1f7822eab457c751
tree e18ffb6ca9d3ed2ad8cf2e38de2caa44e52efaaa /doc
parent c1b1acc2f7a3b658407afe4562a88ea8c62671d9
parent e454a4a330cc6524cf0d2604b4fafc32d5bda795
Merge from emacs-23
Diffstat (limited to 'doc')
-rw-r--r-- doc/lispref/ChangeLog 6
-rw-r--r-- doc/lispref/searching.texi 25
2 files changed, 16 insertions, 15 deletions
diff --git a/doc/lispref/ChangeLog b/doc/lispref/ChangeLog
index ca40b34b73..cecb6f0c66 100644
--- a/doc/lispref/ChangeLog
+++ b/doc/lispref/ChangeLog
@@ -1,3 +1,9 @@
+2010-06-02 Chong Yidong <[email protected]>
+
+ * searching.texi (Regexp Special): Remove obsolete information
+ about matching non-ASCII characters, and suggest using char
+ classes (Bug#6283).
+
2010-05-30 Juanma Barranquero <[email protected]>
* minibuf.texi (Basic Completion): Add missing "@end defun".
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 48780d0a34..722f76cdd7 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -362,7 +362,7 @@ the two brackets are what this character alternative can match.
Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
-(including the empty string), from which it follows that @samp{c[ad]*r}
+(including the empty string). It follows that @samp{c[ad]*r}
matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
You can also include character ranges in a character alternative, by
@@ -400,20 +400,11 @@ is @samp{@var{c}..?\377}, the other is @samp{@var{c1}..@var{c2}}, where
@var{c1} is the first character of the charset to which @var{c2}
belongs.
-You cannot always match all non-@acronym{ASCII} characters with the regular
-expression @code{"[\200-\377]"}. This works when searching a unibyte
-buffer or string (@pxref{Text Representations}), but not in a multibyte
-buffer or string, because many non-@acronym{ASCII} characters have codes
-above octal 0377. However, the regular expression @code{"[^\000-\177]"}
-does match all non-@acronym{ASCII} characters (see below regarding @samp{^}),
-in both multibyte and unibyte representations, because only the
-@acronym{ASCII} characters are excluded.
-
-A character alternative can also specify named
-character classes (@pxref{Char Classes}). This is a POSIX feature whose
-syntax is @samp{[:@var{class}:]}. Using a character class is equivalent
-to mentioning each of the characters in that class; but the latter is
-not feasible in practice, since some classes include thousands of
+A character alternative can also specify named character classes
+(@pxref{Char Classes}). This is a POSIX feature whose syntax is
+@samp{[:@var{class}:]}. Using a character class is equivalent to
+mentioning each of the characters in that class; but the latter is not
+feasible in practice, since some classes include thousands of
different characters.
@item @samp{[^ @dots{} ]}
@@ -431,6 +422,10 @@ A complemented character alternative can match a newline, unless newline is
mentioned as one of the characters not to match. This is in contrast to
the handling of regexps in programs such as @code{grep}.
+You can specify named character classes, just like in character
+alternatives. For instance, @samp{[^[:ascii:]]} matches any
+non-@acronym{ASCII} character. @xref{Char Classes}.
+
@item @samp{^}
@cindex beginning of line in regexp
When matching a buffer, @samp{^} matches the empty string, but only at the