Printf formatting is often used in loops to produce neatly formatted tables. Sharkysoft's Printf for Java was designed with the ability to accelerate printf formatting within loops. To demonstrate the many ways in which Printf for Java can be optimized, we'll examine a simple program that outputs an ASCII table. Each of the versions shown below produces exactly the same output.
This first example shows the easiest way to use printf in Java.
import lava.clib.Stdio;
import lava.clib.stdarg.Va_list;

class asciitable_1
{
    public static void main (String[] args)
    {
        for (int i = 0; i < 256; ++ i)
            Stdio.printf
            (
                "%#10c%+10d%10.4o%#10.2x\n",
                new Va_list ()
                    . add ((char) i)
                    . add (i)
                    . add (i)
                    . add (i)
            );
    }
}
Note that printf is a method of the Stdio class, a reminder that in C, printf is part of the stdio library. Thus, before you can use printf in Java, you must first import the Stdio class. This feels remarkably similar to including the "stdio.h" header file in a C program.
Since Java does not support variable argument lists, a special class, Va_list, is used to emulate them. This class is analogous to the va_list type (declared in the stdarg.h header) used to implement variable argument lists in C. (In fact, most C implementations of printf rely on it.) Va_list's add method is overloaded for all of the primitive Java types. The completion of the argument list is (optionally) indicated by calling the done() method.
We're just kidding ourselves. We all know that no matter how you dress it up, in Java an untyped argument list is nothing more than a vector of Objects. In fact, I'll confess now that Va_list uses java.util.Vector in its implementation. In the above example, each add call appends its argument to the list, and the printf method automatically converts the vector into an Object array.
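To make that confession concrete, here is a minimal sketch of how a Va_list-style builder could sit on top of a Vector. The names ArgList and toArray are hypothetical, invented for this illustration; this is not Lava's actual source, and only a few of the overloads are shown.

```java
import java.util.Vector;

// Hypothetical stand-in for a Va_list-style builder; NOT Lava's actual source.
class ArgList {
    private final Vector<Object> items = new Vector<Object>();

    // Each overload boxes its argument, appends it, and returns this
    // so that calls can be chained.
    ArgList add(char c)   { items.add(Character.valueOf(c)); return this; }
    ArgList add(int i)    { items.add(Integer.valueOf(i));   return this; }
    ArgList add(double d) { items.add(Double.valueOf(d));    return this; }
    ArgList add(Object o) { items.add(o);                    return this; }

    // Copies the accumulated elements into a fresh Object array.
    // This copy is part of the overhead discussed in the text.
    Object[] toArray() { return items.toArray(); }
}

class ArgListDemo {
    public static void main(String[] args) {
        Object[] list = new ArgList().add('A').add(65).add(65).add(65).toArray();
        System.out.println(list.length);
    }
}
```

Every chained add call is a method call plus a possible resize of the vector's backing storage, and toArray makes one more copy at the end.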
So let's skip the overhead of building an Object array one element at a time and do it the fast way instead:
import lava.clib.Stdio;

class asciitable_2
{
    public static void main (String[] args)
    {
        for (int i = 0; i < 256; ++ i)
            Stdio.printf
            (
                "%#10c%+10d%10.4o%#10.2x\n",
                new Object[]
                {
                    new Character ((char) i),
                    new Integer (i),
                    new Integer (i),
                    new Integer (i)
                }
            );
    }
}
Pay careful attention to the syntax used here to build the argument list. Not only does this syntax cause the Java compiler to choose the correct length for the Object array automatically (in this case, 4), but it also populates the array with the four objects. This approach requires significantly less processing overhead than Va_list or Vector, because the final array length is known ahead of time (eliminating the need for dynamic array growth) and because there is no need to copy the vector's contents into an array when you are done.
Here's another handy trick. Since the last three arguments in the list have the same value, why waste time creating three identical objects? Instead, let's optimize the loop further by creating just one Integer object, and then reusing it to populate the array:
import lava.clib.Stdio;

class asciitable_3
{
    public static void main (String[] args)
    {
        for (int n = 0; n < 256; ++ n)
        {
            Integer i = new Integer (n);
            Stdio.printf
            (
                "%#10c%+10d%10.4o%#10.2x\n",
                new Object[] { new Character ((char) n), i, i, i }
            );
        }
    }
}
This optimization creates two fewer objects per iteration.
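If you want to convince yourself that the array really has the right length and that the three integer slots really share one object, a small check along these lines will do. RowArgs and build are hypothetical names for this sketch. It uses Integer.valueOf rather than the new Integer constructor, because later Java releases deprecate the constructor.

```java
// Sketch only: RowArgs and build are hypothetical names used for illustration.
class RowArgs {
    // Builds the four-element argument array for one table row.
    static Object[] build(int n) {
        Integer i = Integer.valueOf(n); // boxed once per row
        return new Object[] {
            Character.valueOf((char) n),
            i, i, i                     // the same reference fills three slots
        };
    }

    public static void main(String[] args) {
        Object[] row = build(65);
        System.out.println(row.length);       // length fixed by the initializer
        System.out.println(row[1] == row[3]); // reference comparison of two slots
    }
}
```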
Before printf can produce output, the format string must be parsed. In the above examples, the format string is parsed once for each pass through the loop. What a waste of time! Why not parse the format string once, outside the loop, and then reuse the parse results inside the loop? In Lava, printf format strings can be "pre-parsed" by constructing instances of PrintfFormatString. When PrintfFormatString objects are used in place of regular String objects for format strings, the resulting printf operation is always faster. PrintfFormatStrings are like ready-to-go "formatting engines." Here's what the revised code looks like:
import lava.clib.Stdio;
import lava.clib.stdio.PrintfFormatString;

class asciitable_4
{
    public static void main (String[] args)
    {
        PrintfFormatString fmt = new PrintfFormatString ("%#10c%+10d%10.4o%#10.2x\n");
        for (int n = 0; n < 256; ++ n)
        {
            Integer i = new Integer (n);
            Stdio.printf
            (
                fmt,
                new Object[] { new Character ((char) n), i, i, i }
            );
        }
    }
}
This little trick saves us the overhead of parsing the format string with each iteration through the loop. Wherever printf is used in a loop or in a method that is called repeatedly, the programmer should consider storing the pre-initialized format string as a private static variable in the class. This way it will be initialized just once, when the class is loaded, and be ready for use each time it is needed. It's too bad the C version of printf doesn't have this feature.
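The "parse once, format many times" idea, together with the private static advice, can be sketched without Lava at all. The TinyFormat class below is a hypothetical, heavily simplified stand-in: it understands only literal text plus %d and %s with an optional minimum width, and it makes no claim about how PrintfFormatString works internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified pre-parsed format. NOT Lava's
// PrintfFormatString; it only supports literals, %d and %s (optional width).
final class TinyFormat {
    // One parsed piece: literal text (conv == 0) or a conversion.
    private static final class Piece {
        final String literal;
        final int width;
        final char conv;
        Piece(String literal, int width, char conv) {
            this.literal = literal;
            this.width = width;
            this.conv = conv;
        }
    }

    private final List<Piece> pieces = new ArrayList<Piece>();

    // All parsing happens here, exactly once per TinyFormat instance.
    TinyFormat(String format) {
        StringBuilder lit = new StringBuilder();
        int i = 0;
        while (i < format.length()) {
            char c = format.charAt(i++);
            if (c != '%') { lit.append(c); continue; }
            if (lit.length() > 0) {
                pieces.add(new Piece(lit.toString(), 0, (char) 0));
                lit.setLength(0);
            }
            int width = 0;
            while (i < format.length()
                    && format.charAt(i) >= '0' && format.charAt(i) <= '9')
                width = width * 10 + (format.charAt(i++) - '0');
            if (i >= format.length())
                throw new IllegalArgumentException("dangling % in format");
            char conv = format.charAt(i++);
            if (conv != 'd' && conv != 's')
                throw new IllegalArgumentException("unsupported conversion: " + conv);
            pieces.add(new Piece(null, width, conv));
        }
        if (lit.length() > 0) pieces.add(new Piece(lit.toString(), 0, (char) 0));
    }

    // Formatting only walks the pre-built pieces; nothing is re-parsed.
    // For simplicity, %d and %s both just stringify and right-justify.
    String format(Object... args) {
        StringBuilder out = new StringBuilder();
        int next = 0;
        for (Piece p : pieces) {
            if (p.conv == 0) { out.append(p.literal); continue; }
            String s = String.valueOf(args[next++]);
            for (int pad = p.width - s.length(); pad > 0; --pad) out.append(' ');
            out.append(s);
        }
        return out.toString();
    }
}

class Table {
    // Parsed once, when the class is initialized, then reused by every call.
    private static final TinyFormat ROW = new TinyFormat("%5d|%3s\n");

    static String row(int n, String label) { return ROW.format(n, label); }

    public static void main(String[] args) {
        for (int n = 0; n < 3; ++n) System.out.print(row(n, "x"));
    }
}
```

The constructor does all the scanning, so the loop in main pays only for padding and appending.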
You may think that we have completed all the optimization one could hope for, but would you believe there's more? Many C programmers are unaware that printf has a return value. The ANSI C specification states that, under normal circumstances, printf returns the total number of characters written.
So what does this have to do with speeding up asciitable? Well, the relationship is indirect. You see, Stdio.printf doesn't really do any work. Instead, it delegates the job to another method, which performs the formatting and returns the results in a String. The Stdio.printf method simply writes the String to System.out and returns the String's length.
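The delegation just described is easy to picture in code. Here is a minimal sketch that uses the JDK's own String.format as a stand-in for the formatting routine. MiniStdio is a hypothetical name, and the JDK's format syntax is not identical to Lava's.

```java
// Sketch of the delegation pattern, with String.format standing in for the
// real formatting routine. MiniStdio is a hypothetical name, not a Lava class.
class MiniStdio {
    // Formats the arguments, writes the result to System.out, and returns
    // the number of characters written, as C's printf does.
    static int printf(String format, Object... args) {
        String s = String.format(format, args); // the real work happens here
        System.out.print(s);                    // the wrapper only writes...
        return s.length();                      // ...and counts
    }

    public static void main(String[] args) {
        int count = printf("%5d|%-3s|%n", 42, "ab");
        System.out.println("characters written: " + count);
    }
}
```

If the character count is never used, the wrapper adds a method call and nothing else, which is the motivation for calling the formatter directly.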
In our case, however, we don't really care about counting characters, so we can bypass the syntactic sugar of Stdio.printf and call the formatting routine ourselves.
The main printf formatting workhorse is, you guessed it, the Printf class (that's right, I said "class," not "method"). The Printf class has just one (overloaded) method, format. Like Stdio.printf, it requires a format and arguments when you call it. Unlike Stdio.printf, however, Printf.format does not produce stream output; it only returns the formatted string. What you do with the string once you have it is your own business, not Printf's. Let's have a look:
import lava.clib.stdio.Printf;
import lava.clib.stdio.PrintfFormatString;

class asciitable_5
{
    public static void main (String[] args)
    {
        PrintfFormatString fmt = new PrintfFormatString ("%#10c%+10d%10.4o%#10.2x\n");
        for (int n = 0; n < 256; ++ n)
        {
            Integer i = new Integer (n);
            System.out . print
            (
                Printf.format
                (
                    fmt,
                    new Object[] { new Character ((char) n), i, i, i }
                )
            );
        }
    }
}
You see what's happening here? We're taking the formatted String returned by format and passing it to System.out.print for printing, bypassing the call to Stdio.printf entirely.
Compare asciitable_1 to asciitable_5 and see how much our code has evolved. The first version was syntactically sweet, but the price for this sugar was unnecessary processing overhead at runtime (burning more calories, to keep the analogy). The final version is not as pleasant to look at (health food), and requires significantly more labor from the programmer (gardening), but the optimizing techniques employed result in much faster output (OK, analogies fail me here).
A simple benchmark I ran, based on these examples, indicates that printf calls complete approximately 32% faster when the optimization techniques of asciitable_5 are used. For small formatting jobs, this speedup may not be worth the extra coding effort. However, if your program uses printf to produce large, formatted tables that are thousands of lines long, this 32% speedup may be worth the effort.
©1998-2001 Sharkysoft. Comments on this article are welcome.