'\"
'\" Copyright (c) 1993 The Regents of the University of California.
'\" Copyright (c) 1994-1996 Sun Microsystems, Inc.
'\" Copyright (c) 2000 Scriptics Corporation.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\"
.TH scan n 8.4 Tcl "Tcl Built-In Commands"
.so man.macros
.BS
'\" Note: do not modify the .SH NAME line immediately below!
.SH NAME
scan \- Parse string using conversion specifiers in the style of sscanf
.SH SYNOPSIS
\fBscan \fIstring format \fR?\fIvarName varName ...\fR?
.BE
.SH INTRODUCTION
.PP
This command parses substrings from an input string in a fashion similar
to the ANSI C \fBsscanf\fR procedure and returns a count of the number of
conversions performed, or -1 if the end of the input string is reached
before any conversions have been performed. \fIString\fR gives the input
to be parsed and \fIformat\fR indicates how to parse it, using \fB%\fR
conversion specifiers as in \fBsscanf\fR. Each \fIvarName\fR gives the
name of a variable; when a substring is scanned from \fIstring\fR that
matches a conversion specifier, the substring is assigned to the
corresponding variable.
If no \fIvarName\fR variables are specified, then \fBscan\fR works in an
inline manner, returning the data that would otherwise be stored in the
variables as a list. In the inline case, an empty string is returned when
the end of the input string is reached before any conversions have been
performed.
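.PP
As a minimal sketch of the two calling styles described above (the input
string and the variable names \fIcount\fR and \fIfruit\fR are
illustrative only), the following interactive session shows the count
returned when variables are given, the list returned in the inline case,
and the two end-of-input results:
.PP
.CS
\fI%\fR \fBscan\fR "42 apples" "%d %s" count fruit
2
\fI%\fR set count
42
\fI%\fR set fruit
apples
\fI%\fR \fBscan\fR "42 apples" "%d %s"
42 apples
\fI%\fR \fBscan\fR "" "%d" count
-1
\fI%\fR string length [\fBscan\fR "" "%d"]
0
.CE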
.SH "DETAILS ON SCANNING"
.PP
\fBScan\fR operates by scanning \fIstring\fR and \fIformat\fR together.
If the next character in \fIformat\fR is a blank or tab then it
matches any number of white space characters in \fIstring\fR (including
zero).
Otherwise, if it is not a \fB%\fR character then it
must match the next character of \fIstring\fR.
When a \fB%\fR is encountered in \fIformat\fR, it indicates
the start of a conversion specifier.
A conversion specifier contains up to four fields after the \fB%\fR:
an XPG3 position specifier (or a \fB*\fR to indicate the converted
value is to be discarded instead of assigned to any variable); a number
indicating a maximum substring width; a size modifier; and a
conversion character.
All of these fields are optional except for the conversion character.
The fields that are present must appear in the order given above.
.PP
When \fBscan\fR finds a conversion specifier in \fIformat\fR, it
first skips any white-space characters in \fIstring\fR (unless the
conversion character is \fB[\fR or \fBc\fR).
Then it converts the next input characters according to the
conversion specifier and stores the result in the variable given
by the next argument to \fBscan\fR.
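.PP
The matching rules above can be illustrated with a short sketch (the
strings and variable names are made up for the example). A blank in
\fIformat\fR absorbs any run of white space, a literal character must
match exactly, \fB*\fR discards a converted value, and a width limits
how many characters a conversion consumes:
.PP
.CS
\fI%\fR \fBscan\fR "key   =  17" "%s = %d" name value
2
\fI%\fR \fBscan\fR "abc123" "%*3s%d"
123
\fI%\fR \fBscan\fR "20240131" "%4d%2d%2d"
2024 1 31
\fI%\fR \fBscan\fR "12 apples" "%d pears %d" a b
1
.CE
.PP
In the last command the literal text in \fIformat\fR fails to match at
.QW apples ,
so scanning stops after the first conversion and \fIb\fR is not assigned.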
.SS "OPTIONAL POSITIONAL SPECIFIER"
.PP
If the \fB%\fR is followed by a decimal number and a \fB$\fR, as in
.QW \fB%2$d\fR ,
then the variable to use is not taken from the next
sequential argument. Instead, it is taken from the argument indicated
by the number, where 1 corresponds to the first \fIvarName\fR. If
there are any positional specifiers in \fIformat\fR then all of the
specifiers must be positional. Every \fIvarName\fR on the argument
list must correspond to exactly one conversion specifier or an error
is generated. In the inline case, any position can be specified
at most once, and empty positions are filled in with empty strings.
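.PP
A brief sketch of positional specifiers follows (the variable names are
illustrative, and the format is enclosed in braces so that Tcl does not
treat \fB$d\fR as a variable reference). The first conversion is stored
in the second variable and vice versa; mixing positional and sequential
specifiers raises an error, which \fBcatch\fR reports as 1:
.PP
.CS
\fI%\fR \fBscan\fR "10 20" {%2$d %1$d} first second
2
\fI%\fR list $first $second
20 10
\fI%\fR \fBscan\fR "10 20" {%2$d %1$d}
20 10
\fI%\fR catch {\fBscan\fR "10 20" {%1$d %d} first second}
1
.CE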
.SS "OPTIONAL SIZE MODIFIER"
.PP
The size modifier field is used only when scanning a substring into
one of Tcl's integer values. The size modifier field dictates the
integer range acceptable to be stored in a variable, or, for the inline
case, in a position in the result list.
The syntactically valid values for the size modifier are \fBh\fR, \fBL\fR,
\fBl\fR, and \fBll\fR. The \fBh\fR size modifier value is equivalent
to the absence of a size modifier in the conversion specifier.
Either one indicates the integer range to be stored is limited to
the same range produced by the \fBint()\fR function of the \fBexpr\fR
command. The \fBL\fR size modifier is equivalent to the \fBl\fR size
modifier. Either one indicates the integer range to be stored is
limited to the same range produced by the \fBwide()\fR function of
the \fBexpr\fR command. The \fBll\fR size modifier indicates that
the integer range to be stored is unlimited.
.SS "MANDATORY CONVERSION CHARACTER"
.PP
The following conversion characters are supported:
.TP
\fBd\fR
.
The input substring must be a decimal integer.
It is read in and the integer value is stored in the variable,
truncated as required by the size modifier value.
.TP
\fBo\fR
.
The input substring must be an octal integer. It is read in and the
integer value is stored in the variable,
truncated as required by the size modifier value.
.TP
\fBx\fR or \fBX\fR
.
The input substring must be a hexadecimal integer.
It is read in and the integer value is stored in the variable,
truncated as required by the size modifier value.
.TP
\fBb\fR
.
The input substring must be a binary integer.
It is read in and the integer value is stored in the variable,
truncated as required by the size modifier value.
.TP
\fBu\fR
.
The input substring must be a decimal integer.
The integer value is truncated as required by the size modifier
value, and the corresponding unsigned value for that truncated
range is computed and stored in the variable as a decimal string.
The conversion makes no sense without reference to a truncation range,
so the size modifier \fBll\fR is not permitted in combination
with conversion character \fBu\fR.
.TP
\fBi\fR
.
The input substring must be an integer. The base (i.e. decimal, octal,
or hexadecimal) is determined by the C convention (leading 0 for octal;
prefix 0x for hexadecimal). The integer value is stored in the variable,
truncated as required by the size modifier value.
.TP
\fBc\fR
.
A single character is read in and its Unicode value is stored in
the variable as an integer value.
Initial white space is not skipped in this case, so the input
substring may be a white-space character.
.TP
\fBs\fR
.
The input substring consists of all the characters up to the next
white-space character; the characters are copied to the variable.
.TP
\fBe\fR or \fBf\fR or \fBg\fR or \fBE\fR or \fBG\fR
.
The input substring must be a floating-point number consisting
of an optional sign, a string of decimal digits possibly
containing a decimal point, and an optional exponent consisting
of an \fBe\fR or \fBE\fR followed by an optional sign and a string of
decimal digits.
It is read in and stored in the variable as a floating-point value.
.TP
\fB[\fIchars\fB]\fR
.
The input substring consists of one or more characters in \fIchars\fR.
The matching string is stored in the variable.
If the first character between the brackets is a \fB]\fR then
it is treated as part of \fIchars\fR rather than the closing
bracket for the set.
If \fIchars\fR
contains a sequence of the form \fIa\fB\-\fIb\fR then any
character between \fIa\fR and \fIb\fR (inclusive) will match.
If the first or last character between the brackets is a \fB\-\fR, then
it is treated as part of \fIchars\fR rather than indicating a range.
.TP
\fB[^\fIchars\fB]\fR
.
The input substring consists of one or more characters not in \fIchars\fR.
The matching string is stored in the variable.
If the character immediately following the \fB^\fR is a \fB]\fR then it is
treated as part of the set rather than the closing bracket for
the set.
If \fIchars\fR
contains a sequence of the form \fIa\fB\-\fIb\fR then any
character between \fIa\fR and \fIb\fR (inclusive) will be excluded
from the set.
If the first or last character between the brackets is a \fB\-\fR, then
it is treated as part of \fIchars\fR rather than indicating a range value.
.TP
\fBn\fR
.
No input is consumed from the input string. Instead, the total number
of characters scanned from the input string so far is stored in the variable.
.PP
The number of characters read from the input for a conversion is the
largest number that makes sense for that particular conversion (e.g.
as many decimal digits as possible for \fB%d\fR, as
many octal digits as possible for \fB%o\fR, and so on).
The input substring for a given conversion terminates either when a
white-space character is encountered or when the maximum substring
width has been reached, whichever comes first.
If a \fB*\fR is present in the conversion specifier
then no variable is assigned and the next scan argument is not consumed.
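.PP
The conversion characters listed above can be tried out with a short
interactive session such as the following sketch, which uses the inline
form so that each result is shown directly. The input strings are
arbitrary, and formats containing brackets are enclosed in braces to
prevent Tcl command substitution:
.PP
.CS
\fI%\fR \fBscan\fR "ff" %x
255
\fI%\fR \fBscan\fR "17" %o
15
\fI%\fR \fBscan\fR "101" %b
5
\fI%\fR \fBscan\fR "0x1A" %i
26
\fI%\fR \fBscan\fR "A" %c
65
\fI%\fR \fBscan\fR "3.5e2" %f
350.0
\fI%\fR \fBscan\fR "key=value" {%[^=]=%s}
key value
\fI%\fR \fBscan\fR "  abc def" "%s%n"
abc 5
.CE
.PP
In the last command \fB%n\fR reports 5 because the two leading spaces
skipped by \fB%s\fR count as scanned characters.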
.SH "DIFFERENCES FROM ANSI SSCANF"
.PP
The behavior of the \fBscan\fR command is the same as the behavior of
the ANSI C \fBsscanf\fR procedure except for the following differences:
.IP [1]
The \fB%p\fR conversion specifier is not supported.
.IP [2]
For \fB%c\fR conversions a single character value is
converted to a decimal string, which is then assigned to the
corresponding \fIvarName\fR;
no substring width may be specified for this conversion.
.IP [3]
The \fBh\fR modifier is always ignored and the \fBl\fR and \fBL\fR
modifiers are ignored when converting real values (i.e. type
\fBdouble\fR is used for the internal representation). The \fBll\fR
modifier has no \fBsscanf\fR counterpart.
.IP [4]
If the end of the input string is reached before any conversions have been
performed and no variables are given, an empty string is returned.
.SH EXAMPLES
.PP
Convert a Unicode character to its numeric value:
.PP
.CS
set char "x"
set value [\fBscan\fR $char %c]
.CE
.PP
Parse a simple color specification of the form \fI#RRGGBB\fR using
hexadecimal conversions with substring sizes:
.PP
.CS
set string "#08D03F"
\fBscan\fR $string "#%2x%2x%2x" r g b
.CE
.PP
Parse a \fIHH:MM\fR time string, noting that this avoids problems with
octal numbers by forcing interpretation as decimals (if we did not
care, we would use the \fB%i\fR conversion instead):
.PP
.CS
set string "08:08"   ;# *Not* octal!
if {[\fBscan\fR $string "%d:%d" hours minutes] != 2} {
    error "not a valid time string"
}
# We have to understand numeric ranges ourselves...
if {$minutes < 0 || $minutes > 59} {
    error "invalid number of minutes"
}
.CE
.PP
Break a string up into sequences of non-whitespace characters (note
the use of the \fB%n\fR conversion so that we get skipping over
leading whitespace correct):
.PP
.CS
set string " a string {with braced words} + leading space "
set words {}
while {[\fBscan\fR $string %s%n word length] == 2} {
    lappend words $word
    set string [string range $string $length end]
}
.CE
.PP
Parse a simple coordinate string, checking that it is complete by
looking for the terminating character explicitly:
.PP
.CS
set string "(5.2,-4e-2)"
# Note that the spaces before the literal parts of
# the scan pattern are significant, and that ")" is
# the Unicode character \eu0029
if {
    [\fBscan\fR $string " (%f ,%f %c" x y last] != 3
    || $last != 0x0029
} then {
    error "invalid coordinate string"
}
puts "X=$x, Y=$y"
.CE
.PP
An interactive session demonstrating the truncation of integer
values determined by size modifiers:
.PP
.CS
\fI%\fR set tcl_platform(wordSize)
4
\fI%\fR scan 20000000000000000000 %d
2147483647
\fI%\fR scan 20000000000000000000 %ld
9223372036854775807
\fI%\fR scan 20000000000000000000 %lld
20000000000000000000
.CE
.SH "SEE ALSO"
format(n), sscanf(3)
.SH KEYWORDS
conversion specifier, parse, scan
'\" Local Variables:
'\" mode: nroff
'\" End: