Sharkysoft's Printf for Java offers Java programmers all the robust text- and
data-formatting features of C's printf function, and more. Printf for Java
allows the programmer to format data into strings using customizable formatting
templates. This specification assumes that the reader is already familiar with
C's printf function.
In C, printf calls are made using variable argument lists, or parameter lists
of arbitrary length and type. In Java, however, variable parameter lists are
not allowed (yet). To overcome this limitation, Printf for Java accepts an
Object array in place of the variable argument list. Throughout this
specification, this structure is referred to as the data vector.
The printf data vector may be any length and contain any object type, as long
as its contents are compatible with the data types indicated in the format
string. Because the data vector is implemented as an Object array, elements
representing primitive types, such as ints, must be wrapped in compatible
objects to be included in the vector. The primitive types and their
corresponding wrappers are listed below.
primitive type | wrapping class |
---|---|
boolean | java.lang.Boolean |
byte | java.lang.Byte |
short | java.lang.Short |
int | java.lang.Integer |
long | java.lang.Long |
float | java.lang.Float |
double | java.lang.Double |
char | java.lang.Character |
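As a rough illustration of the wrapping described above, the following sketch builds a data vector from primitive values using only standard Java classes. The names DataVectorDemo and buildDataVector are hypothetical and not part of Printf for Java. Integer.valueOf and its siblings are used here as the modern equivalents of wrapper constructors such as new Integer(52).

```java
final class DataVectorDemo {
    /**
     * Wraps each primitive argument in its corresponding wrapper class
     * and returns the results as an Object array (a "data vector").
     */
    static Object[] buildDataVector(int count, double ratio, char grade, boolean ok) {
        return new Object[] {
            Integer.valueOf(count),    // int     -> java.lang.Integer
            Double.valueOf(ratio),     // double  -> java.lang.Double
            Character.valueOf(grade),  // char    -> java.lang.Character
            Boolean.valueOf(ok)        // boolean -> java.lang.Boolean
        };
    }

    public static void main(String[] args) {
        // Print the runtime class of every element in the vector.
        for (Object element : buildDataVector(52, 0.25, 'A', true)) {
            System.out.println(element.getClass().getName() + " -> " + element);
        }
    }
}
```

An array built this way could then be passed wherever this specification calls for a data vector.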
A format string is a concatenated mixture of literal strings and
format specifiers. Literal strings are copied verbatim to the formatted
output. Format specifiers describe conversion operations
(stringification) which are usually applied to elements in the data vector. All
printf format strings that are valid (and sensible) in C are also
valid in Printf for Java, and can be expected to perform the same function.
However, Printf for Java adds a few additional capabilities which are not
available in C-based implementations. These new capabilities are indicated in
highlighted text.
A format specifier has the following syntax:
%[flags][width][.precision][input_size]conversion_type
The percent character ('%') always signals the beginning of a format specifier. The conversion type character is always present and marks the end of the format specifier. The flags, width, precision, and input size are optional, but if they are present, they must appear between the percent character and conversion type character, in the order given above.
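The ordering rule above can be sketched with a regular expression that splits a single specifier into its five parts. This is only an illustration under stated assumptions: the class SpecifierParser and its parse method are hypothetical, the flag and input size characters are taken from the tables in this specification, and the "%\n" line separator form is not handled. It is not Printf for Java's actual parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpecifierParser {
    // Groups: 1 = flags, 2 = width, 3 = precision, 4 = input size, 5 = conversion type.
    // Width and precision may be digits or '*'. %z[n] and %Z[n] carry a bracketed base.
    private static final Pattern SPEC = Pattern.compile(
        "%([-^0+ #]*)(\\*|\\d+)?(?:\\.(\\*|\\d+))?([bBhl])?"
        + "([zZ]\\[\\d+\\]|[cdeEfgGopsuxXn%])");

    /** Returns {flags, width, precision, inputSize, conversionType}; absent parts are "". */
    static String[] parse(String specifier) {
        Matcher m = SPEC.matcher(specifier);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a format specifier: " + specifier);
        }
        String[] parts = new String[5];
        for (int i = 0; i < parts.length; i++) {
            String group = m.group(i + 1);
            parts[i] = (group == null) ? "" : group;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Show the five parts of one specifier, separated by '|'.
        System.out.println(String.join("|", parse("%-8.3lf")));
    }
}
```

With this sketch, "%-8.3lf" splits into the flag '-', width 8, precision 3, input size l, and conversion type f. Strings such as "%q" or "%10.d" fail to match and raise an exception. The sketch does not check whether a given flag or input size is actually permitted for a given conversion type.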
The conversion type character indicates the type of input data to be stringified and formatted. There are several input conversion types, one output conversion type, and two escape conversion types. An input conversion type consumes at least one element from the data vector and generates formatted data based on the input value. The complete list of input conversion types is shown below. For convenience, the flags permitted are also shown, though these are not discussed until later.
conversion type character | default input type | stringified as | flags permitted |
---|---|---|---|
%c | char | Character, possibly encoded. | '-', '^', '#' |
%d | int | Signed decimal integer. | '-', '^', '0', '+', ' ' |
%e, %E | float | Real number, scientific notation (lowercase or uppercase exponent marker). | '-', '^', '#' |
%f | float | Real number, standard notation. | '-', '^', '0', '+', ' ', '#' |
%g, %G | float | Same as %f or %e, depending on value. Scientific notation is used only if the exponent is greater than the precision or less than -4. | '-', '^', '0', '#' |
%o | int | Unsigned octal integer. | '-', '^', '0', '#' |
%p | Object | Object identity hash code (i.e., the object's address), in unsigned hexadecimal. | '-', '^', '0' |
%s | String | String. | '-', '^', '#' |
%u | int | Unsigned decimal integer. | '-', '^', '0' |
%x or %X | int | Unsigned hexadecimal integer, lowercase or uppercase. | '-', '^', '0', '#' |
%z[n] or %Z[n] | int | Unsigned integer in base n (decimal), lowercase or uppercase. (The square brackets are part of the specifier.) | '-', '^', '0' |
The other conversion types are described later.
Input conversion type specifiers can be preceded by an input size modifier to override the default input type. All supported input size modifications are listed below:
default input type | input size modifier | modified data type |
---|---|---|
int | (none) | int |
int | b | byte |
int | B | BigInteger |
int | h | short |
int | l | long |
float | (none) | float |
float | B | BigDecimal |
float | l | double |
An optional output field width specifier, if present, specifies the minimum
output field width, or the minimum number of characters that the formatted data
will span in the output. If the stringified value does not fill the whole
field, then the field will be padded. The default behavior, in this case, is to
right-justify the value in the field, padding with spaces on the left.
On the other hand, if the formatted value exceeds the minimum length of the field, the output will not be truncated, and the field will be widened as necessary to display the entire result. When no output width is specified, there is no minimum field width. In this case, the field will only be as wide as necessary to display the result.
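The minimum-width rule can be sketched in a few lines of plain Java. The class FieldWidthDemo and its padLeft method are hypothetical helpers written for this illustration, not part of Printf for Java. They assume space padding on the left, which matches the behavior of "%6d" in C's printf.

```java
final class FieldWidthDemo {
    /**
     * Right-justifies text in a field of at least minWidth characters.
     * Text longer than minWidth is returned unchanged (never truncated).
     */
    static String padLeft(String text, int minWidth) {
        StringBuilder field = new StringBuilder();
        for (int i = text.length(); i < minWidth; i++) {
            field.append(' ');
        }
        return field.append(text).toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + padLeft("52", 6) + "]");      // value narrower than field
        System.out.println("[" + padLeft("1234567", 6) + "]"); // value wider than field
    }
}
```

For comparison, the standard library call String.format("%6d", 52) produces the same six-character result as padLeft("52", 6).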
Example:
The code
Printf.out("%6d", new Object[] {new Integer(52)});
will output
"    52" (with four spaces on the left).
A precision specifier, if present, controls the precision with which the input data will be converted. The effect of setting a precision on a value depends on the conversion type. The effects of precision for each input type, along with the default precision values, are given below:
conversion type | effect of precision | default precision |
---|---|---|
Integer conversion: | Precision controls the minimum number of digits. The converted value will be prepended with zeros if necessary. If the precision is 0 and the input value is zero, then: | 1 |
Real conversion: | Precision controls the number of fractional digits after the decimal point. The converted value will be rounded if necessary. | 6 |
String conversion: | Precision controls the maximum number of characters from the input that will be displayed. If the string is longer than the precision, it will be truncated. | infinity |
Precision cannot be specified for types not listed in the table above. If no precision is specified, then the default precision will be used.
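The three effects of precision can be approximated with standard Java as follows. This is a sketch under assumptions, not Printf for Java's implementation: PrecisionDemo and its method names are hypothetical, and the rounding of real values is delegated to java.util.Formatter, whose rounding mode may differ from the one Printf for Java uses.

```java
import java.util.Locale;

final class PrecisionDemo {
    /** Integer conversion: prepend zeros until at least minDigits digits are present. */
    static String integerDigits(long value, int minDigits) {
        String text = String.valueOf(value);
        boolean negative = text.startsWith("-");
        StringBuilder digits = new StringBuilder(negative ? text.substring(1) : text);
        while (digits.length() < minDigits) {
            digits.insert(0, '0');
        }
        return (negative ? "-" : "") + digits;
    }

    /** Real conversion: keep exactly fractionDigits digits after the decimal point. */
    static String realDigits(double value, int fractionDigits) {
        // Assumption: java.util.Formatter's rounding stands in for the library's.
        return String.format(Locale.ROOT, "%." + fractionDigits + "f", value);
    }

    /** String conversion: keep at most maxChars characters of the input. */
    static String truncate(String text, int maxChars) {
        return text.length() <= maxChars ? text : text.substring(0, maxChars);
    }

    public static void main(String[] args) {
        System.out.println(integerDigits(7, 3));
        System.out.println(realDigits(3.14159, 2));
        System.out.println(truncate("Sharkysoft", 6));
    }
}
```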
Field widths and field precisions can be specified with an asterisk
('*') to indicate that these values should be obtained from the data
vector. Each asterisk in a format specifier consumes an int from
the data vector, in the same order as it appears in the format specifier. The
actual data to be converted is consumed last.
Example:
If the format specifier is "%*hd", then there must be two elements in the data array corresponding to this format specifier. The first element is an int specifying the field width, and the second element is a short that will be rendered as a signed decimal string.
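The consumption order for "%*hd" can be mimicked with standard Java. In this sketch the hypothetical method formatStarWidth reads the width from the first element and the value from the second, then lets java.util.Formatter do the padding. It assumes a positive width and is not how Printf for Java is implemented internally.

```java
final class StarWidthDemo {
    /**
     * Emulates "%*hd": data[0] supplies the field width (an int) and
     * data[1] supplies the value (a short), consumed in that order.
     */
    static String formatStarWidth(Object[] data) {
        int width = ((Number) data[0]).intValue();       // consumed first: the '*' width
        short value = ((Number) data[1]).shortValue();   // consumed last: the data itself
        return String.format("%" + width + "d", value);  // assumes width > 0
    }

    public static void main(String[] args) {
        Object[] data = { Integer.valueOf(6), Short.valueOf((short) 52) };
        System.out.println("[" + formatStarWidth(data) + "]");
    }
}
```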
Flags are single characters that indicate exceptions to the conversion type's default formatting behavior. A format specifier may have multiple flags, but some flags are mutually exclusive. Multiple flags can appear in any order. The following table lists all of the formatting flags supported by Printf for Java:
flag | effect | applicable conversion types |
---|---|---|
'-' | Output will be left-justified in field. | %c, %d, %e, %E, %f, %g, %G, %o, %p, %s, %u, %x, %X, %z[n], %Z[n] |
'^' | Output will be centered in field. This flag is meaningless if no field width is specified. | %c, %d, %e, %E, %f, %g, %G, %o, %p, %s, %u, %x, %X, %z[n], %Z[n] |
'0' | Field will be padded with leading zeros, inserted between sign character, if any, and value. | %d, %e, %E, %f, %g, %G, %o, %p, %u, %x, %X, %z[n], %Z[n] |
'+' | Non-negative values will begin with a plus character ('+'). | %d, %f |
' ' | Non-negative values will begin with a space character (' '). | %d, %f |
'#' | Data will be represented in an "alternate form." This depends on the conversion type: | |
'#' | Non-negative octal values will begin with a zero ('0'). | %o |
'#' | Hexadecimal values will begin with "0x" or "0X" (depending on case of conversion type character). | %x, %X |
'#' | The integer portion of the result will end with a decimal point ('.'), even if the fractional portion is zero. | %e, %E, %f |
'#' | The fractional portion always appears, even if it is zero. | %g, %G |
'#' | If the character is special or unprintable, it will be output in escaped form. The output can be surrounded by single quotes to form a syntactically valid Java character literal. | %c |
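The justification and zero-padding flags can be pictured with the following sketch. FlagDemo and its methods are hypothetical helpers, not Printf for Java code. In particular, this specification does not say which side receives the extra space when a centered value cannot be centered exactly; the sketch assumes it goes on the right.

```java
final class FlagDemo {
    private static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    /** '-' flag: value first, then space padding up to width. */
    static String leftJustify(String text, int width) {
        return text + repeat(' ', width - text.length());
    }

    /** '^' flag: padding split between both sides (assumption: odd space goes right). */
    static String center(String text, int width) {
        int extra = Math.max(0, width - text.length());
        int left = extra / 2;
        return repeat(' ', left) + text + repeat(' ', extra - left);
    }

    /** '0' flag: zeros are inserted between the sign character, if any, and the digits. */
    static String zeroPad(long value, int width) {
        String text = String.valueOf(value);
        String sign = text.startsWith("-") ? "-" : "";
        String digits = text.substring(sign.length());
        return sign + repeat('0', width - sign.length() - digits.length()) + digits;
    }

    public static void main(String[] args) {
        System.out.println("[" + leftJustify("52", 6) + "]");
        System.out.println("[" + center("52", 6) + "]");
        System.out.println("[" + zeroPad(-42, 6) + "]");
    }
}
```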
In format strings, the percent character ('%') normally signals the beginning of a conversion type specifier. This can cause difficulties, however, if you actually want to output a percent character. To solve this problem, an escaping mechanism is used. To output a single percent character, just embed two percent characters in the format string. This conversion type does not consume input from the data vector.
It is sometimes convenient to embed line separators in format strings.
However, line separators vary from platform to platform. To avoid
platform-dependent format strings, use "%\n" instead of "\n", "\r\n", or "\r".
Printf for Java will automatically output the host platform's native line
separator. This conversion type does not consume input from the data vector.
Note: If platform-independence is not a concern, '\n' will probably produce
acceptable results, with slightly better performance.
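In plain Java, the host platform's line separator is available from System.lineSeparator() (Java 7 and later). The sketch below uses it to show the substitution that the "%\n" conversion performs. The class LineSeparatorDemo and its method are hypothetical, and the sketch ignores corner cases such as an escaped percent sign directly before a newline.

```java
final class LineSeparatorDemo {
    /** Replaces every "%\n" (percent followed by a newline) with the native separator. */
    static String expandLineSeparators(String format) {
        return format.replace("%\n", System.lineSeparator());
    }

    public static void main(String[] args) {
        System.out.print(expandLineSeparators("first line%\nsecond line%\n"));
    }
}
```

Note that java.util.Formatter spells its own line separator conversion "%n", which in this specification has a different meaning (character count).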
The character count conversion type, "%n", does not format data
or generate formatted output. Instead, when this format specifier appears in
the format string, the number of characters formatted up to that point is
counted, and the resulting Integer value is written into the data
vector. Thus, when building the data vector for a format string containing this
conversion type, a slot must be reserved for the result. The original contents
of the slot are ignored and overwritten.
Example:
This program prints the string "My friend Sharky is 27 years old." It uses %n to obtain character positions so that it can underline the number 27.
    // These are the parameters:
    String name = "Sharky";
    int age = 27;
    // This is the parameter list. We're leaving empty (null) slots where the %n
    // conversion type will write its results:
    Object[] params = new Object[] {
        name,
        null, // filled in by %n
        new Integer(age),
        null // filled in by %n
    };
    // Now we'll generate the first line of output:
    Printf.out("My friend %s is %n%d%n years old.%\n", params);
    // Next, we'll use the feedback from %n to output the underline:
    int start = ((Integer) params[1]).intValue();
    int stop = ((Integer) params[3]).intValue();
    for (int pos = 0; pos < start; ++ pos)
        Printf.out(" ");
    for (int pos = start; pos < stop; ++ pos)
        Printf.out("-");
Results:
    My friend Sharky is 27 years old.
                        --
Most format specifiers are not picky about the actual data types furnished in the data vector, as long as they are instances of the required base class. Printf for Java uses the object's own conversion methods to obtain input data of the correct type. The required class for each input type, and the conversion method used, are given in the table below:
input type | recommended data type | required data type | conversion method |
---|---|---|---|
boolean | java.lang.Boolean | java.lang.Boolean | booleanValue() |
byte | java.lang.Byte | java.lang.Number | byteValue() |
short | java.lang.Short | java.lang.Number | shortValue() |
int | java.lang.Integer | java.lang.Number | intValue() |
long | java.lang.Long | java.lang.Number | longValue() |
float | java.lang.Float | java.lang.Number | floatValue() |
double | java.lang.Double | java.lang.Number | doubleValue() |
char | java.lang.Character | java.lang.Character | charValue() |
java.math.BigInteger | java.math.BigInteger | java.math.BigInteger | n/a |
java.math.BigDecimal | java.math.BigDecimal | java.math.BigDecimal | n/a |
java.lang.String | java.lang.String | java.lang.Object | toString() |
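The conversion methods in the table are ordinary java.lang.Number and java.lang.Object methods, so the lenient behavior can be illustrated without the library. NumberConversionDemo and its methods are hypothetical names used only for this sketch.

```java
final class NumberConversionDemo {
    /** Obtains an int from any Number, as an int-typed specifier would. */
    static int asInt(Object element) {
        if (!(element instanceof Number)) {
            throw new IllegalArgumentException("expected a java.lang.Number: " + element);
        }
        return ((Number) element).intValue();
    }

    /** Obtains a String from any Object, as a string-typed specifier would. */
    static String asString(Object element) {
        return element.toString();
    }

    public static void main(String[] args) {
        System.out.println(asInt(Long.valueOf(42L)));          // a Long where an int is expected
        System.out.println(asString(new StringBuffer("abc"))); // a StringBuffer where a String is expected
    }
}
```

Because intValue() narrows its receiver, a value such as Double.valueOf(3.9) would be converted to 3 by this sketch, so supplying the recommended data type avoids surprises.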
With these capabilities, the following observations can be made:
- Any Number can be supplied as data to format specifiers requiring primitive numeric input types.
- Any object with a toString() method can easily be formatted as a string.
- A StringBuffer may be used in place of a String for the %s conversion type.
The %u, %x, and %o format specifiers typically represent unsigned integer
conversions. This is conceivable with primitive data types, such as int,
because the bit widths of primitive types are fixed and known. By definition,
however, a BigInteger represents an "infinitely long" bit vector, so it is
not possible to treat a BigInteger as an unsigned value. Because of this, the
format specifiers %Bx, %Bo, and %Bu produce undefined results when the input
values are negative.
(Good luck.)
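The difference between fixed-width and arbitrary-width integers is easy to see with the standard library alone. The class UnsignedDemo below is a hypothetical illustration and makes no claim about what Printf for Java prints for negative BigInteger input.

```java
import java.math.BigInteger;

final class UnsignedDemo {
    /** A 32-bit int has a well-defined unsigned reading of its bit pattern. */
    static String intAsUnsignedHex(int value) {
        return Integer.toHexString(value);
    }

    /** A BigInteger has no fixed width, so only a signed rendering is available. */
    static String bigIntegerAsHex(BigInteger value) {
        return value.toString(16);
    }

    public static void main(String[] args) {
        System.out.println(intAsUnsignedHex(-1));                      // ffffffff
        System.out.println(bigIntegerAsHex(BigInteger.valueOf(-1)));   // -1
    }
}
```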
With all of the formatting options that are available in this specification, it should not be surprising that some options are mutually exclusive. Printf for Java validates each format string and rejects it if conflicting or incompatible options are discovered. A few of the conditions which might cause a printf format string to be rejected are listed below, with examples:
"%-^10d"
"%-d"
"%++d"
"%- d"
"%10.d"
"%q"
"%hs"
"%#n"
Most C implementations of printf silently ignore errors and
generate unpredictable results without complaining. Printf for Java, however,
is not so forgiving. Printf for Java rejects ambiguous format strings by
throwing an exception, and hence a little more discipline is required from
the programmer. We think this is a good thing. :-)
©1998-2004 Sharkysoft. All rights reserved.