Lazy Optimization
Ken Auer, President, RoleModel Software, Inc.
Kent Beck, First Class Software, Inc.
We both preach to our clients about performance tuning and productive programming. "Make it run, make it right, make it fast," is the mantra. As much as they seem to like the idea, in reality it makes them very nervous.
One client in particular had gotten a full dose of this medicine, and was moving ahead nicely on a complex application that stored Smalltalk objects in a relational database. They had gotten it to work, and then made sure the design for the overall framework was very elegant and flexible. Unfortunately, the system was not nearly as responsive as the user wanted. It was time to "make it fast" and time was running out.
Later, a developer pulled Kent aside and said, "You got me in trouble."
"About what?"
"You said I could tune performance after I got the design right."
"Yes, I did. What happened?"
"We spent all that time getting the design right. Then the users wanted sub-second response time. The system was taking as long as fifteen seconds to respond to a query. It took me all weekend to change the design around to fix the performance problem."
"How fast is it now?"
"Well, now it comes back in a third of a second."
"So, what’s the problem?"
"I was really nervous that I wouldn’t be able to fix the performance problem."
And that’s why there is a pattern called "Performance Assessment". It probably won’t change anything about how you program, but it will keep you from getting nervous.
Software development is a delicate piece of chamber music. If any one instrument protrudes, the quality of the piece is ruined.
Although many performance problems stem from lack of a coherent design, many others stem from solutions which optimize readability or flexibility over resource usage. In many cases this is the right tradeoff, as the costs of a system over its lifetime are dominated by how well it communicates to other humans. When the program has to go fast, though, it just has to go fast.
Languages like Smalltalk give us the power to manage complexity, hiding how much binary logic is actually being exercised behind the functionality we take for granted. This isolation from detail makes it easier to write communicative but inefficient code.
Tuning in Smalltalk is the same as in any language: you either perform an expensive operation less often, or replace it with a cheaper one.
The primary forces driving the production of efficient programs are:
Below is a partial pattern language for creating efficient Smalltalk programs. Many of the higher-level patterns are not specific to Smalltalk at all, but could be applied to all programs. Many of the more detailed patterns could also be applied to most languages, most definitely object-oriented ones. However, in the interest of brevity and of narrowing our audience, we chose to illustrate the patterns by assuming their application to Smalltalk. Additionally, as patterns get to lower levels of detail and move into the realm of "idioms", they tend to be less applicable to a wide range of languages. The authors have a high level of confidence in these patterns when applied to Smalltalk. We invite experts in other programming environments to use this as a template for a similar pattern language for their own environments, and would be surprised if they did not find their own unique language-specific patterns (or idioms).
Here is a graphical map of the patterns in this language. Patterns referenced but not included are in gray.
Performance Assessment
Prolog
You are ready to begin developing the Use Cases in your Commitment Schedule in earnest.
Problem
How can you develop with the confidence that performance won’t be a problem?
Forces
Early in program development is where you are most likely to hear the siren call of premature optimization. Once you are into the thick of development, you will be unlikely to have energy or thoughts for anything but getting the system done, and done right. Unfortunately, early in development is also where premature optimization has the opportunity to do the most damage. Early in development is when you should be listening carefully to the users' needs and taking stock of your available resources. If every thought is poisoned by "Yes, but how fast will that run?", you will likely miss the chance to make large leaps in development.

At the same time, performance is a legitimate concern. If the program doesn’t go fast enough, you will not in fact get paid (or keep getting paid). You need to assure yourself and your client that performance problems will not sink an otherwise brilliantly crafted and executed system. Therefore:
Solution
Indulge your performance worries by writing a short (e.g., two page) performance assessment of the system. Identify critical resources outside of Smalltalk. Do the math to figure out how big and how fast the final system will be. Prototype areas of concern if absolutely necessary.
Discussion
Some people call these "envelope calculations," after their tendency to be done on the back of an envelope. A typical performance assessment might conclude that only one mainframe access is allowable per user transaction.

Hard real-time systems have a long history of careful performance assessments. When you know you have to service an interrupt in 10 microseconds, well, that pretty much shapes at least one major subsystem. Most systems, although often called real time, are not nearly so restrictive.

Areas that may require a prototype are any unique areas of your system that you deem somewhat complex and whose responsiveness is critical to the success of your application. For example, you may have a map that needs to display hundreds of small labels and icons representing particular locations. Throw together a prototype that pumps hundreds of random labels and icons to the screen in random spots. If it’s too slow, figure out what’s taking all the time and tune the prototype. When you are convinced it can be done (or can’t), throw out the prototype and start the real work (or revise your expectations).
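Such a throwaway prototype can be only a few lines. A sketch, in which the GraphicsContext protocol, the label text, and the counts are all illustrative assumptions:

```smalltalk
"Throwaway prototype: pump hundreds of random labels onto the screen
 and time the whole batch"
| random milliseconds |
random := Random new.
milliseconds := Time millisecondsToRun:
    [1 to: 500 do:
        [:i |
        aGraphicsContext
            displayString: 'site ' , i printString
            at: (random next * 400) truncated @ (random next * 400) truncated]].
Transcript showCr: milliseconds printString , ' ms'
```

If the number is acceptable, throw the prototype away; if not, you have found your area of concern before committing to a design.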
Epilog
The Performance Assessment will help you decide how to shape the Concise Design. You may be able to define your Performance Criteria directly from the assessment.
Lazy Optimization
Prolog
Early Program quickly creates systems with the needed functionality. Concise Design refactors objects to improve clarity and flexibility.
Problem
How do you achieve acceptable performance with the least short- and long-term cost?
Forces
Users should have the experience of being in control of the computer, not the other way around. Even the best designed user interface will frustrate and hinder a user if it does not respond with dispatch.

Some programs spend considerable time in Smalltalk. These programs will be efficient to the degree they leverage, but do not stress, Smalltalk resources such as message sending and the garbage collector. Other programs are limited by resources outside the direct control of the programming language: virtual memory, database access, graphics calls, etc. Another standard of efficiency is how effectively a program uses these resources: how short the path is from a stimulus arriving from the outside world to the response returned to it.

Almost every professional programmer has had the experience of writing a program, taking heat because it is too slow, and (importantly) not being able to fix the performance problems in a timely fashion. Some programmers react to this rite of passage by insisting that performance is a legitimate concern throughout development. No request for functionality or proposal for design refinement can be considered until it has passed the "can it run fast" test. This practice is the death of effective development. Progress slows to a crawl as every thought is forced to pass through the sieve of fear. Decisions are optimized for presumed performance impact rather than for short- or long-term programming productivity and low risk. The result is large programs with limited functionality. They grow progressively more complex until they cease being valuable to the developing organization.

The irony is that such programs are virtually impossible to keep running fast. Their lack of clarity robs you of the ability to identify performance bottlenecks. Even if you could find a problem, chances are you could not address it without making widespread changes, exactly the sort of activity you must avoid late in development. So, you have to worry about performance or your users won’t be satisfied, but it doesn’t help to worry about it all the time. Therefore:
Solution
Ignore efficiency through most of the development cycle. Tune performance once the program is running correctly and the design reflects your best understanding of how the code should be structured. The needed changes will be limited in scope or will illuminate opportunities for better design. |
Discussion
It is critical that this strategy be incorporated into your project plan. Plan on some optimization activities toward the end, or at natural breaks when the functionality of a section is expected to be complete. Not only will this strategy give you actual time to do the necessary optimization, but it will also appease early viewers of your program who deem it too slow. You can always say, "Yes, we know it is too slow. You can see where, in our project plan, we plan on addressing this issue." Of course, if you don’t give yourself enough time to actually do the tuning, you’re still going to look bad. One rule of thumb is to assume 35% of the development time will be required for tuning for those inexperienced at the process of tuning in Smalltalk, decreasing to 10% for experienced developers.
Epilog
Agree with the client on the Performance Criteria by which the program will be measured. Create a Performance Measurement that automatically runs the part of the program judged by users to be too slow.
Performance Criteria
Prolog
The Performance Assessment may have discovered easily quantifiable criteria. Later, reviewers working after your Concise Design may have identified further concerns.
Problem
How do you know what is worth tuning and what is not?
Forces
The naive view is that the whole program should be as fast as possible. This leads you either to commit to less functionality than you should implement, or to muddle the design so badly you can’t get anything to work right, much less work fast. Many parts of your system will work without any complaints about performance even if they are implemented in the simplest possible way. In addition, you cannot be sure in advance where performance bottlenecks will lie [Barry91]. You can’t even be sure which bottlenecks will bother users and which won’t. Therefore:
Solution
Ask your client to give you ballpark response time or throughput criteria for your system. Be prepared to revise this list as the system matures.
Discussion
At first you just want to know what the client is sure to notice. Typical performance criteria are things like sub-second response to a common query, no more than one mainframe access per user transaction, or redrawing a busy display without perceptible delay. As time goes on, your client (or you) may notice new things that just are not snappy enough. Make sure the particular complaints are specific and the goals are explicitly stated.
Epilog
Performance Criteria need a Threshold to give you a concrete goal. Each Criterion should be accompanied by a set of Performance Measurements.
Threshold Switch
Prolog
You have identified the performance criteria for your system. You have begun to improve the problem areas.
Problem
How fast is fast enough?
Forces
You might think that once you have settled on a Performance Criterion, your job is to make it as fast as possible. This is almost never true. Programs that are to be used by people are both constrained by, and can take advantage of, the quirks of human perception. In particular, actual performance and perceived performance are not linearly related. Once you get past a perceptual threshold, substantial further measured performance improvement will result in little or no perceived improvement. Therefore:
Solution
Agree that as soon as you meet the Performance Criteria, you will stop tuning. If you’ve beaten the pants off of it (e.g., you needed sub-second response and it now performs in 0.2 seconds) with the first Experiment, great. However, don’t make new goals of seeing how far you can beat the Criteria. When you hit the threshold of what is needed, stop tuning!
Discussion
Always remember that the major cost of software is in maintenance. Most techniques used to improve performance sacrifice a bit of maintainability, so avoid applying them beyond what the Criteria demand.
Epilog
Once you’ve agreed to the concept of a Threshold Switch, you are ready to begin the actual process of tuning. However, before you do, you must gather Performance Measurements to make the Switches real.
Performance Measurement
Prolog
You’ve established Performance Criteria and agreed to Threshold Switches.
Problem
How do you consistently measure performance, ensure that the system keeps running while you are tuning, and know when you have triggered your Threshold Switches?
Forces
As already noted, programmers typically aren't very good at identifying bottlenecks. Experienced programmers (those who have made more mistakes than you have ever dreamed of making) may be able to make a good guess at what's causing these bottlenecks. If you've got one of these people around, getting them to guess would be a great improvement over guessing yourself.

If you have a good list of efficiency hacks, e.g. "when you see A, do B instead", you could look for all occurrences of the "slow" ways of doing things (the As) and replace them with their faster counterparts (the Bs). Odds are good you'll end up with a much faster program, but it will take an awfully long time to find all of the potentially slow things you've done. Also, there may be so many places you have to change that you might wonder why you didn't just write your code like that in the first place. Then you'll look at the probably unreadable result and remember why. Chances are you will have replaced a lot of code that ran fast enough if you had just left it alone. Even after this exhaustive search-and-destroy, you may have missed some bottlenecks that were not in your catalog of efficiency tricks. You might even have missed the one or two key bottlenecks that, if corrected first, would have made all the other efficiency hacks unnecessary. Therefore:
Solution
Always write a routine that tests and times the apparent performance problems in such a way that it does not rely on human intervention. Use a profiler to identify the biggest bottlenecks in the area in question.
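In Smalltalk such a routine can be as small as a method that times the suspect operation with Time millisecondsToRun: (measureQueryResponse and runStandardQuery are illustrative names):

```smalltalk
measureQueryResponse
    "Answer the milliseconds taken by the operation users complained about.
     Runs without human intervention, so it can be re-run after every change."
    ^Time millisecondsToRun: [self runStandardQuery]
```

Re-run it after every change and compare the answer against your Threshold Switch.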
Discussion
A VisualWorks customer was very concerned about the performance of their system. They were worried that all the horror stories they had heard about Smalltalk being too slow for real systems were true. The fundamental section of their system was taking minutes to do anything, and they needed it to take seconds. They couldn’t figure out what was taking so long, no matter how hard they tried. In a last-ditch effort to figure out whether their work could be salvaged, Dave Leibs, Smalltalk performance guru from ParcPlace Systems, Inc., was called in for a day of consulting.

They quickly gave him an overview of what they were trying to do. Nothing stood out to Dave as anything that would bring Smalltalk to its knees. So he asked, "Where did the Profiler indicate most of the time was being spent?" "What’s the Profiler?" He spent the next 30 minutes or so explaining what the Profiler was and helping them install it. He then ran the Profiler on the section of code in question and found it was spending more than 98% of its time in Date>>printOn:. Dave thought for a second about what might be so slow about Date>>printOn: and felt it should not cause so great a problem. So he looked a little higher up the profile tree and examined the containing method. He noticed the expression ‘Date today printString’ inside a loop. He chose an Experiment and pulled the expression out of the loop, storing the value in a temporary variable which he then used inside the loop.

Ten minutes after running the Profiler for the first time, he ran the code again. The critical section of code that was central to the client’s system now took less than a second. Dave then turned to his hosts and said, "So, what are we going to do for the rest of the day?"
Epilog
After addressing the bottleneck (see [Hot Spot]), run the Performance Measurement again to verify whether you actually improved performance, and whether you've brought it down to meet the Performance Criteria. If you have not, continue to use [Hot Spot] until you've met the criteria or you have reached a point where you must either give up or [ReThink Approach]. Before moving on to the next bottleneck, verify that it still exists: tuning a well-factored method often propagates performance gains to other areas that rely on it. You may have killed two or more birds with one stone.
Hot Spot
Prolog
You’ve identified several areas of your system that don’t meet your Performance Criteria.
Problem
Where do you actually begin tuning?
Forces
You could just list out the problem areas, assign them as tasks on a queue, and knock them off one at a time. Individuals working on your project could just take the next problem on the list. The order in which the problems are addressed wouldn't really matter if you insist all of them must be addressed, but you should probably prioritize them, as some might be nastier than others and you don't want to spend time fixing the simple ones if the critical ones are the real show-stoppers. If individuals finished their task before others, they could just take the next available task. Pretty simple, isn't it?

However, dividing the work this way takes away the collaborative effort of individuals and the benefit of finding commonality among the problems. On the other hand, if you have everybody standing around analyzing the best way to go about solving the problems, they won't ever get addressed.
Solution
Profile most or all performance problems before putting a lot of energy into prioritizing, assigning, and fixing them. Identify the apparent places (at a more micro level) where the majority of the time is going in each. These are the Hot Spots on which you should focus your tuning efforts. You may be surprised to find that multiple performance problem areas are due to a particular approach that spans them. The best way to address these hot spots may be very different when taking into account the several contexts in which the problems occur, and priority items will become more apparent. Once these are addressed, if performance problems persist, repeat this pattern to identify the next set of hot spots.
Discussion
If you are working on a large system, you may want to first categorize problems and run the profiler on a particular category. Avoid waiting until the whole team is idle, waiting to tune, before you start your profiling efforts. Again, find natural breaking points in the project (see [Lazy Optimization]) where a developer is available to work on areas whose Concise Design is complete.
Epilog
In addressing the Hot Spots, look through the remaining patterns to see whether the corresponding code matches the symptoms, and apply the suggested solutions.
Experiment
Prolog
You have identified an apparent Hot Spot but are not certain you have the solution that will get you to your Threshold Switch.
Problem
How do you find the best performance solution?
Forces
The mindset at this point is "I need to make this thing faster or I’m dead meat." However, we must remember the points made in Threshold Switch about efficiency hacks sacrificing later maintainability. Therefore:
Solution
Treat each application of an efficiency hack as an experiment. If it didn’t get you to the desired Criteria (e.g., you needed a 10x improvement and it gave you 0.1x), make a note of what it did give you and undo it. Keep only the experiments that actually get you to the criteria. You may have to combine several to get there, but concentrate on the big-bang experiments rather than the little improvements.
Discussion |
In the
discussion of Performance Measurement we presented a
problem which may have indicated that we needed to tune
Date>>printOn:. Suppose Dave Liebs took some time
ripping apart the Date>>printOn: method and putting
it back together so it was 40% faster, albeit less
concise and in danger of blowing up in the fourth leap
year in the 21st century. He then conducted the
Experiment (Caching Temporary Variable) in the higher
level method that gave him the big bang. Now leaving the
faster printOn: method would have the critical section
run in 0.92 seconds instead of 0.93. Not only is the new printOn: method not worth leaving it in now for performance reasons, but also think about future ramifications of leaving it in. Some poor maintenance programmer comes along with the thankless job of upgrading the system to the next release of VisualWorks. He first tries to find out all of the base image changes his company has made that need to be carried forward. He discovers this change (which may or may not have a comment explaining what was done to it). He also notices that the new version of VisualWorks has a different version of the method than the previous version (before the hacked change). People will start noticing the amount of time trying to figure out what to do here more than they ever noticed the 0.01 seconds it bought them in the first release of the product. |
Epilog
You should be getting closer to your desired Performance Criteria with little impact to your Concise Design.
Cacheable Expression
Prolog
The Profiler has identified a particular expression as a significant contributor to execution time. The expression is executed many times, often with the same result.
Problem
How do we reduce the overhead of this repeated expression with the least impact on the integrity of the system?
Forces
As in previous patterns, we would rather not compromise the design of the system in removing bottlenecks. We would also prefer to minimize the overhead of objects. Additional variables (e.g. instance variables) tend to proliferate supporting methods, not to mention the memory allocated to hold the corresponding value. A variable cache eliminates design possibilities and offers opportunities to break encapsulation. On the other hand, the result of an expression takes up memory somewhere (e.g. the execution stack) no matter what. We shouldn’t get too carried away with the avoidance of memory usage.
Solution
If you find an expression that often returns the same result, find the lowest-overhead type of variable that will eliminate the redundancy. This will yield a performance improvement while minimizing the design compromises [Auer95].
Discussion |
Several years
ago, a major portion of a Process Modelling System with
which we were involved was devoted to re-routing and
re-drawing diagrams. After a redesign of a prototype (Concise Design), performance had improved significantly (3-5x) without performance being a primary goal. However, this performance was still deemed unacceptable. Performance Criteria were established. In a period of one week, system response time was improved roughly one order of magnitude.[White91] Due to the Concise Design there was little to overhaul. Most of these performance improvements were due to finding common expressions and speeding them up by Caching Expressions. On the other hand, keeping may force other methods to be aware of caches. For example if a Rectangle caches its area in an instance variable, all methods that allow changes to the Rectangle’s coordinates must do something to make sure the cached area is reset. Caching variables can also threaten the integrity of the object containing the cache itself. For example, before the area cached instance variable is added, the following expression (although bad practice) would not threaten the integrity of the object:
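The original expression is not preserved here; it was presumably a direct manipulation of state the Rectangle exposes, along these lines (the x: setter on Point is an assumption):

```smalltalk
"Bad practice, but harmless while area is always recomputed:
 mutating a coordinate through the Point the Rectangle exposes"
aRectangle origin x: aRectangle origin x + 10
```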
The implementor of Rectangle chose to take the risk of exposing its instance variables for a variety of reasons, knowing that, as the implementation stood, such an expression would not break anything. As soon as the Cached Instance Variable is added, such an expression would cause a problem, as there would be no way to trigger a cache reset without additional mechanisms.
Epilog
Overhead and risk are typically lowest for [Caching Temporary Variable], followed by [Caching Argument], [Caching Class Variable], [Caching Class Instance Variable], and [Caching Instance Variable].
Caching Temporary Variable
Prolog
A Cacheable Expression is causing a Hot Spot, most often because it occurs in a loop.
Problem
How do you increase the performance of the loop?
Forces
Sacrificing good design will cost you during maintenance. Introducing a cache introduces the possibility that its value will become incorrect during the life of the method. Often, seemingly harmless expressions inside a loop return the same result each time.
Solution
For each expensive expression in the loop that returns the same value each time through, add a temporary variable which is set to the result of the expression before the loop begins, and replace all occurrences of the expression with the corresponding temporary variable.
Discussion
For example, if you wanted to collect a subset of a table based on selected column names, your code may look something like:
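A sketch of such a method (the rows and columnIndexFor: protocol is assumed for illustration):

```smalltalk
collectColumns: selectedColumnNames
    "Answer, for each row, the values of the selected columns.
     columnIndexFor: is re-evaluated for every single row"
    ^self rows collect:
        [:row |
        selectedColumnNames collect:
            [:columnName | row at: (self columnIndexFor: columnName)]]
```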
Realizing the column indices of interest are going to be the same each time through the loop, you could greatly reduce the number of times #columnIndexFor: is sent, as follows:
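A sketch of the tuned version, with the indices cached in a temporary variable before the loop:

```smalltalk
collectColumns: selectedColumnNames
    "Look each column index up once, then reuse the cached indices for every row"
    | indices |
    indices := selectedColumnNames collect:
        [:columnName | self columnIndexFor: columnName].
    ^self rows collect:
        [:row | indices collect: [:index | row at: index]]
```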
Epilog
Although this may fix many problems, it is possible that the expression is still evaluated many times because of the pervasiveness of its containing method, and performance may still not meet the Performance Criteria. If so, look for caching with a larger scope, such as [Caching Instance Variable].
Caching Argument
Prolog
Multiple objects are creating similar transient objects to create pieces of the desired end result.
Problem
How can we reduce the number of transient objects?
Forces
Keeping a "collector" object in an instance, class, class instance, or global variable (depending on scope) has undesirable costs. Although it does improve performance, it also adds a lot of overhead and tends to hold onto objects that are not needed past the completion of an operation. Therefore:
Solution
Keep a single transient object that can be used by the multiple objects which contribute to the final result, and pass it around as an argument to all the objects that can put it to use.
Discussion
Here is an example of an algorithm that uses collection concatenation, which causes lots of transient objects to be created.
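A sketch (report and entries are illustrative names); every send of , allocates a fresh, slightly longer intermediate String:

```smalltalk
report
    "Each iteration allocates the entry's printString plus a new,
     longer intermediate String holding the partial result"
    | result |
    result := String new.
    self entries do: [:each | result := result , each printString].
    ^result
```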
Objects convert themselves into strings by creating a Stream on an empty string and sending themselves the printOn: message with the Stream as an argument. We could instead create a single Stream which all objects would write to.
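A sketch of such a shared stream (report and entries are illustrative names):

```smalltalk
report
    "One shared WriteStream; each entry prints directly onto it,
     and only the final contents String survives the operation"
    | stream |
    stream := WriteStream on: (String new: 100).
    self entries do: [:each | each printOn: stream].
    ^stream contents
```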
Another example is to pass around a collection to which every object that has something to contribute adds its part.
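For example (the family protocol here is illustrative):

```smalltalk
allFemaleChildren
    "A single collector is passed, as a Caching Argument, to every family"
    | collector |
    collector := OrderedCollection new.
    self families do: [:family | family addFemaleChildrenInto: collector].
    ^collector
```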
The addFemaleChildrenInto: method would do nothing but add to the passed collection (or nothing at all if there are no female children). It could be used instead of:
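Presumably something along these lines (femaleChildren is an illustrative selector):

```smalltalk
"Each family answers a freshly created collection of its female children,
 even when that collection turns out to be empty"
self families do: [:family | collector addAll: family femaleChildren]
```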
which would create an intermediate collection, maybe even an empty one.
Epilog
Applying this pattern may result in modifications to multiple parts of the system if protocol must be changed to use the Caching Argument. Typically, this can be avoided by keeping the old protocol but reimplementing it on top of the new protocol, with the Caching Argument as its foundation. However, this may potentially hurt the performance of systems using the original protocol. Use Performance Measurement and Performance Criteria to determine whether this is a problem.
Caching State Variable
Prolog
A Cacheable Expression is causing a Hot Spot, but it cannot simply be cached in an argument or a temporary variable, since the contexts in which it is needed are varied.
Problem
How do you increase the performance of expressions that are needed in a variety of contexts?
Forces
You never want to limit design possibilities. No one can tell what new requirements will come later, and you would like to easily map those requirements into simple methods in existing classes. Every time you add a state variable, a burden is placed on the developer to make sure every method that may have an effect on that state is aware of it. A method can no longer be concerned simply with giving an answer; it must also be concerned with side effects.

However, sometimes ignoring the implied side effects of an operation is exactly what causes performance problems. Everything must be calculated dynamically because "you never can be sure" whether the environment has changed since the last time the result of the expression was needed. In the context of a single method, or a chain of methods, knowledge of a local environment (temps, arguments, etc.) can be exploited via Caching Temporary Variables or Caching Arguments. The decision to use these types of Cacheable Expressions affects no one but the execution stack, which is pretty well secure from the rest of the system. Unfortunately, the system encompasses much more than this local environment, and such techniques cannot solve all performance problems.

Yet we still want to avoid caching expressions at a level that exposes the cache to the entire system, and therefore limits design possibilities and exposes integrity problems on a wide scale. For instance, if we kept the result of an expression in a global variable for anyone to use, all methods that wanted to benefit from the cache would have to know about its existence by name. Additionally, any action anywhere in the system that would invalidate its value would have to explicitly reset it. You would like to encapsulate the Cacheable Expression at as low a level as possible that still covers all of the contexts which would like to exploit the cache. Classes, MetaClasses, and class hierarchies each define an environment narrower than the system level but broader than the execution stack; each defines its own behavioral environment. Therefore:
Solution
Identify the behavioral environment (Class, Class hierarchy, or MetaClass) that encapsulates all of the contexts in which the Cached Expression can be shared before the probability of resetting the value would be expected. Add a corresponding state variable (instance, class, or class instance). Give it the name of the message to be cached. Rename the message to be cached by prepending "compute" to the corresponding selector. Use lazy initialization to compute the value of the variable. Modify other parts of the behavioral environment to make sure this value is reset whenever some event occurs that could possibly make the result obsolete.
Discussion
If you needed a solid dot icon, you could recompute it every time.
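Presumably something like (computeDot is an illustrative selector):

```smalltalk
dot
    "Recomputed from scratch on every send"
    ^self computeDot
```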
Realizing the icon will be the same for all instances of all classes in the hierarchy:
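For example, cached lazily in a class variable Dot (a sketch; the names are illustrative):

```smalltalk
dot
    "Dot is a class variable: a single cache shared by every class,
     and every instance, in the hierarchy"
    Dot isNil ifTrue: [Dot := self computeDot].
    ^Dot
```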
If the dot size needed to change when user preferences changed, a hook would have to be inserted to flush the Dot whenever the preferences changed. Similarly, an instance method
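such as the following (a sketch; fullName and its parts are illustrative):

```smalltalk
fullName
    "Recomputed on every send"
    ^firstName , ' ' , lastName
```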
could become
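perhaps the following, with fullName now a lazily initialized instance variable and the original computation renamed computeFullName, per the Solution's convention (a sketch):

```smalltalk
fullName
    "fullName is an instance variable caching the computed result"
    fullName isNil ifTrue: [fullName := self computeFullName].
    ^fullName
```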
and methods that set the parts of the fullName would be modified to reset the fullName as the example below shows:
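A sketch (lastName: is an illustrative setter):

```smalltalk
lastName: aString
    "Any change to a part of the full name must invalidate the cache"
    lastName := aString.
    fullName := nil
```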
If you needed an icon to represent the on state for a particular class of button, but the icons were different for different classes of buttons in the hierarchy, you could implement:
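Perhaps (computeOnIcon is an illustrative selector):

```smalltalk
onIcon
    "Each button class computes its own icon, on every send"
    ^self computeOnIcon
```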
Realizing the icon will be the same for all instances of a given class, you could cache it in a class instance variable:
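A sketch, with onIcon as a class instance variable:

```smalltalk
onIcon
    "onIcon is a class instance variable: each class in the hierarchy
     holds its own cached copy of the icon"
    onIcon isNil ifTrue: [onIcon := self computeOnIcon].
    ^onIcon
```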
Clearing the cache of class instance variables may require some awkward code in the root class to ensure that all caches are cleared simultaneously:
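A sketch (flushOnIcons and flushOnIcon are illustrative selectors; both methods live on the class side of the root class):

```smalltalk
flushOnIcons
    "In the root class: clear the cache held by every class in the hierarchy"
    self withAllSubclasses do: [:each | each flushOnIcon]

flushOnIcon
    onIcon := nil
```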
Epilog
With any Cached State Variable, the potential exists to eliminate design possibilities or invalidate previous class design assumptions. Be careful to adjust your Concise Design if the implications warrant.
Simplification
Prolog
You notice several methods are Hot Spots in a certain context that cannot be simplified while still maintaining the integrity of the system for all instances of the same class.
Problem
How do we improve the Hot Spots for the context in question without causing undesirable side-effects in other contexts where instances of the same class are used?
Forces
Changing a method in such a way as to enable unadvertised side-effects under certain conditions, no matter how remote, is inviting trouble. Case statements could be inserted to detect a particular context, but such statements are ugly and often slow down execution for the more general cases due to the overhead of case checking. Therefore:
Solution
Examine the features of a class that are actually being used by a particular instance or group of instances. Determine whether another existing class, or a simplified version of the class being used, could provide the desired features more efficiently.
Discussion
There are many examples of existing classes that act functionally similarly for many operations, yet offer distinct performance advantages. Arrays are generally more efficient than OrderedCollections: they take up less space, don't have extra instance variables to manage to denote the beginning and end of the collection, and avoid a level of indirection in accessing their indexed instance variables. Symbols, once created, are much faster to compare than Strings, since comparing Symbols merely compares object pointers as opposed to comparing each individual element contained by the String. Additionally, one can often create classes that are simplifications of more flexible classes if that flexibility is not needed. For example, ComposedText in VisualWorks will handle various composition features of text (word-wrapping, tabs, changes of fonts, etc.). If you simply want to display a simple ASCII String in a single font with no special characters or composition considerations, you could create a simple class, such as a VisualString, which simply calls the primitive for displaying a string in the particular font without any intermediate effort.
Epilog
Often after Simplification you may find that other areas of your system are unnecessarily using more complex classes than they need. You may wish to revisit your Concise Design to delete classes that are redundant.
Transient Reduction
Prolog
You have some set of objects, A, and want some set of objects, B.
Problem
How do I get from A to B in the most efficient manner?
Forces
Objects are wonderful, and they can do wonderful things for us very elegantly. Encapsulation helps us hide the complexity of what's happening behind the scenes when we want some result. There may be tens or hundreds of thousands of objects created in helping to determine the desired result that we never even see. These objects give their lives for our result, and we don't even know they ever existed. Some thanks they get. Ah, but their dirty little secret is that they don't like going unnoticed, so they leave their mark by gobbling up CPU cycles. An object creation here, a few messages sent there, a little garbage collection to clean up the mess, and eventually someone might notice that more time has gone by than they had planned. Therefore:
Solution
Reduce the number of transient objects in a given operation.
Discussion
A client of ours once complained about Smalltalk and how the garbage collector was constantly interrupting, taking up too much time during many of their graphical programming operations. After analyzing the situation, we found that moving a particular object from one place to another was causing tens of thousands of other objects to be created, all short-lived and mostly unnecessary. The problem wasn't the efficiency of garbage collection, but the constant garbage creation.
Epilog
This can be accomplished using several other patterns, e.g. [Object Transformation], [Caching Argument], [Hypoth-a-sized Collection], [Concatenating Stream].
Object Transformation
Prolog
I have some set of objects, A, and I want some set of objects, B, of the same classes. Once I have B, I no longer need A.
Problem
How do I get from A to B in the most efficient manner?
Forces
There is overhead in creating new objects and in garbage collecting the old ones, yet we want to preserve encapsulation. Therefore:
Solution
Instead of creating new objects to replace the old objects, transform the old objects into the objects you desire.
Discussion
For example: if you have a Rectangle of a particular shape and position, and you want to replace it with a new Rectangle of a similar shape but different position, you could simply send it the message translatedBy: with the appropriate argument to get the resulting Rectangle. However, this creates a new Rectangle and two new Points, causing the old Rectangle and its two Points to be garbage collected. Instead, you could send the message moveBy:, which transforms the old Rectangle into the new one, creating one less object and adding one less object to the garbage heap. If one really wanted to throw caution to the wind, and was sure the original Rectangle was the only object referring to the Points held by its origin and corner variables, one could write a method (and send the corresponding message) that reused the same Points, and therefore created no new objects nor added any objects to the garbage heap. E.g.
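(The method body was lost in this transcription; the following is a sketch of such a destructive move, assuming mutable Points with x: and y: setters and a hypothetical selector moveInPlaceBy:.)

```smalltalk
"Instance method on Rectangle. Destructively translate the receiver,
 reusing its existing Points. Safe only if nothing else refers to
 the receiver's origin and corner."
moveInPlaceBy: aPoint
	origin x: origin x + aPoint x.
	origin y: origin y + aPoint y.
	corner x: corner x + aPoint x.
	corner y: corner y + aPoint y
```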
Epilog
Hypoth-a-sized Collection
Prolog
A Hot Spot is time spent adding objects to a collection.
Problem
How can I make this more efficient?
Forces
Collections are designed so that the details of how they manage to hold all of the objects one places in them are hidden. Most collections start out with relatively few slots in which to place objects, so as to avoid wasting space. When a collection needs more space, it simply allocates a bigger chunk of space, copies the objects previously held into the new space, and no one is the wiser... except the profiler. When adding many objects to a relatively small collection, much of the time is spent growing incrementally larger buffers and copying objects from the old buffer to the new. Therefore:
Solution
When you realize an operation is bound to add a significant number of objects to a collection, make sure the collection is big enough, based on a hypothesis, to hold the results without growing, or at least with minimal growing.
Discussion
E.g. instead of something like:
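(The example code did not survive transcription; this sketch uses an illustrative workload of ten thousand additions.)

```smalltalk
"A default-sized OrderedCollection starts with only a few slots
 and must grow, and copy its contents, many times."
| result |
result := OrderedCollection new.
1 to: 10000 do: [:i | result add: i * i].
result
```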
allocate enough space in the collection to avoid growth:
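(Sketch continued: OrderedCollection new: pre-allocates the hypothesized capacity while still answering an empty collection.)

```smalltalk
"Hypoth-a-sized: the collection starts with room for the expected
 ten thousand results, so it never grows during the loop."
| result |
result := OrderedCollection new: 10000.
1 to: 10000 do: [:i | result add: i * i].
result
```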
Epilog
Concatenating Stream
Prolog
You may be collecting results in a Collecting Temporary Variable. You may be collecting results over several methods in a Collecting Parameter.
Problem
How do you concatenate several Collections efficiently?
Forces
Concatenation (,) is a simple way to join several Collections. When you have lots of Collections, though, it can be slow, because the objects in the first Collection are copied once for each Collection that is concatenated. Streams provide a way out of this dilemma: they are careful not to copy their contents many times. Therefore:
Solution
Use a Stream to concatenate many Collections.
Discussion
We ran a
quick benchmark to see how significant the difference was
between Collection concatenation and Streams. We ran the
following code, which concatenates one thousand Strings:
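(The benchmark source was lost in transcription; this is a sketch of the concatenation version, with the literal String chosen for illustration. The timings quoted in the text are from the original run.)

```smalltalk
"Naive concatenation: each , copies everything accumulated so far."
| result |
result := String new.
1000 timesRepeat: [result := result , 'some string to concatenate'].
result
```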
It took 53 seconds to run. Then we ran the Stream version:
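(A sketch of the Stream version of the same benchmark: a WriteStream accumulates the characters in place, copying only when it must grow.)

```smalltalk
"Concatenating Stream: nextPutAll: appends without recopying
 the contents already written."
| writeStream |
writeStream := WriteStream on: String new.
1000 timesRepeat: [writeStream nextPutAll: 'some string to concatenate'].
writeStream contents
```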
It took 0.9 seconds to run. The difference is a factor of almost 60. Sometimes you will use a Concatenating Stream not because it is your choice, but because you have to fit into other code that already uses it. The most common example is specializing object printing by overriding printOn:. Note that this pattern can also be combined with Hypoth-a-sized Collection to create a Stream whose collection can avoid growing.
Epilog
Copyright © 2002-03 by RoleModel Software, Inc. All rights reserved.