.. index:: active_tol Example

Improving Performance with active_tol
--------------------------------------

.. note::

    This is an experimental performance tweak. Using it requires a high level
    of understanding of optimization and of the structure of your problem.

Some optimizers use an active set method, in which each constraint is marked as
active or inactive depending on its proximity to its boundary. If the current
point satisfies a constraint by a wide enough margin, that constraint is
essentially redundant to the set, so the optimizer can ignore it and mark it as
inactive. When this occurs, the optimizer no longer needs functional or gradient
evaluations for that constraint. Since gradient calculation can be a major
source of computation, some performance can be gained if we can skip calculating
the derivatives of inactive constraints.

The ideal way to do this would be to gain access to the optimizer internals and
pass that active/inactive information along to OpenMDAO. Since that information
is not conveniently available, we have instead provided an argument to
`add_constraint` that lets you specify how far a constraint's value must be from
its boundary before you consider it inactive. This is most easily used on
geometric problems where you can clearly visualize when a constraint is
completely occluded by other constraints.

The following restrictions apply when using the active tolerance:

- The optimizer must support active set methods (currently only SNOPT in ``pyoptsparse``).
- It only works in adjoint mode, so the `mode` option on the `root` linear solver must be set to "rev".
- Relevance reduction must be enabled ("single_voi_relevance_reduction" set to True on the `root` linear solver).

Let's consider a problem where we have 7 discs with a 1 cm diameter, and we
would like to arrange them on a line as closely together as possible without
overlapping. We can do this by minimizing the sum of the squared distances
between every pair of discs. We also don't want any of our discs to overlap, so
we constrain each pair so that the distance between their centers is at least 1
diameter (the code constrains the squared distance, which is equivalent here
because the diameter is 1.0). The code for this is below. We use an `ExecComp`
for each pair because the distance expression is simple to write.

To make a point about derivative calculation, we use OpenMDAO's built-in
profiling. When we set up the profiler, we tell it to count only the calls to
`apply_linear` (the workhorse derivatives function) on `Component`. The output
is placed in a file that we can process later.
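Before walking through the full example, it helps to picture what the tolerance
means. The short sketch below is plain Python rather than OpenMDAO internals,
and it assumes that the tolerance is measured from the constraint's lower bound;
the optimizer's actual active-set bookkeeping is its own. The helper name
`constraint_is_inactive` is purely illustrative.

::

    # Conceptual sketch only -- not OpenMDAO's internal logic. We assume a
    # lower-bounded constraint counts as inactive once its value exceeds the
    # bound by more than active_tol, so its gradient can be skipped.
    def constraint_is_inactive(value, lower, active_tol):
        return (value - lower) > active_tol

    # Disc example values: lower = diam = 1.0, active_tol = 2.0*diam = 2.0.
    # Two touching discs give a squared distance of 1.0 (right at the bound),
    # while the first and third of three touching discs give 2.0**2 = 4.0.
    print(constraint_is_inactive(1.0, 1.0, 2.0))   # False -> active, gradient needed
    print(constraint_is_inactive(4.0, 1.0, 2.0))   # True  -> inactive, gradient skipped

With that picture in mind, here is the full example.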
.. testcode:: active_tol_example

    from __future__ import print_function
    from six.moves import range

    import numpy as np

    from openmdao.api import Problem, Group, pyOptSparseDriver, ExecComp, IndepVarComp

    if __name__ == '__main__':

        # Seed so that we always compare the same starting locations.
        np.random.seed(123)

        diam = 1.0
        pin = 15.0
        n_disc = 7

        prob = Problem()
        prob.root = root = Group()

        driver = prob.driver = pyOptSparseDriver()
        driver.options['optimizer'] = 'SNOPT'
        driver.options['print_results'] = False

        # Note: the active tolerance requires relevance reduction to work.
        root.ln_solver.options['single_voi_relevance_reduction'] = True

        # It also requires adjoint (reverse) mode.
        root.ln_solver.options['mode'] = 'rev'

        obj_expr = 'obj = '
        sep = ''
        for i in range(n_disc):

            dist = "dist_%d" % i
            x1var = 'x_%d' % i

            # The first disc is pinned.
            if i == 0:
                root.add('p_%d' % i, IndepVarComp(x1var, pin), promotes=(x1var, ))

            # The rest are design variables for the optimizer.
            else:
                init_val = 5.0*np.random.random() - 5.0 + pin
                root.add('p_%d' % i, IndepVarComp(x1var, init_val), promotes=(x1var, ))
                driver.add_desvar(x1var)

            for j in range(i):

                x2var = 'x_%d' % j
                yvar = 'y_%d_%d' % (i, j)
                name = dist + "_%d" % j

                expr = '%s = (%s - %s)**2' % (yvar, x1var, x2var)
                root.add(name, ExecComp(expr), promotes=(x1var, x2var, yvar))

                # Constraint (you can experiment with turning the active_tol on and off).
                #driver.add_constraint(yvar, lower=diam)
                driver.add_constraint(yvar, lower=diam, active_tol=diam*2.0)

                # This pair's contribution to the objective.
                obj_expr += sep + yvar
                sep = ' + '

        root.add('sum_dist', ExecComp(obj_expr), promotes=('*', ))
        driver.add_objective('obj')

        prob.setup()

        print("Initial Locations")
        for i in range(n_disc):
            xvar = 'x_%d' % i
            print(prob[xvar])

        # Run with profiling turned on so that we can count the total derivative
        # component calls.
        from openmdao.api import profile, Component
        profile.setup(prob, methods={'apply_linear': (Component, )})
        profile.start()

        prob.run()

        profile.stop()

        print("\nFinal Locations")
        for i in range(n_disc):
            xvar = 'x_%d' % i
            print(prob[xvar])

        total_apply = 0
        for syst in root.subsystems(recurse=True):
            if 'dist_' in syst.name:
                total_apply += syst.total_calls

        print("\ntotal apply_linear calls:", total_apply)

Note that we defined the variable "n_disc" for the number of discs, so component
and variable names such as "dist_2_1" and "y_2_1" had to be created with some
string operations.

::

    Initial Locations
    15.0
    13.482345928
    11.4306966748
    11.1342572678
    12.7565738454
    13.5973448489
    12.1155323006

    Final Locations
    15.0
    12.9999999413
    9.99999993369
    8.99999991376
    11.9999999405
    13.9999999687
    10.9999999358

This lines our discs up neatly so that they are touching each other, with their
centers ranging from 9 to 15.

Note that we chose 2.0 times the disc diameter as our "active_tol". With that
setting, whenever 3 discs sit in a row, the distance constraint between disc 1
and disc 3 is inactive, so its gradient is not calculated.

We can look at the profiling output by issuing the following command at the
operating system command line:

::

    proftotals prof_raw.0

We want the grand totals, which are printed last:

::

    Grand Totals
    -------------
    Function Name, Total Time, Calls
    apply_linear, 0.00373601913452, 183

So, did our active tolerance really do anything? To find out, we turn it off
(use the commented-out `add_constraint` call that omits `active_tol`), run
again, and post-process a second time:

::

    Initial Locations
    15.0
    13.482345928
    11.4306966748
    11.1342572678
    12.7565738454
    13.5973448489
    12.1155323006

    Final Locations
    15.0
    12.9999998135
    9.99999980586
    8.99999978593
    11.9999998126
    13.9999999687
    10.999999808

The optimum is essentially the same, but the totals differ:

::

    Grand Totals
    -------------
    Function Name, Total Time, Calls
    apply_linear, 0.00686693191528, 344

So almost half of the `apply_linear` calls turn out to be unneeded. This would
normally be a pretty bad case to run in adjoint mode, because the number of
pairwise constraints grows quadratically as n_disc*(n_disc - 1)/2, while the
number of design variables grows only linearly with n_disc. However, a good
choice of "active_tol" cuts out a significant number of the extra gradient
calculations.
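To make that scaling concrete, here is a small sketch, again plain Python and
independent of OpenMDAO, that counts the design variables and pairwise
constraints of the disc model as the number of discs grows. The helper name
`model_sizes` is purely illustrative.

::

    # Illustrative helper (not part of OpenMDAO): count the design variables
    # and pairwise constraints in the disc model above.
    def model_sizes(n_disc):
        n_desvars = n_disc - 1                       # the first disc is pinned
        n_constraints = n_disc * (n_disc - 1) // 2   # one constraint per pair of discs
        return n_desvars, n_constraints

    for n in (7, 14, 28):
        print(n, model_sizes(n))

    # Output:
    #   7  (6, 21)
    #   14 (13, 91)
    #   28 (27, 378)

Since adjoint (reverse) mode requires one linear solve per objective and
constraint, it is this quadratic growth in the constraint count that makes a
well-chosen "active_tol" increasingly worthwhile as the problem gets larger.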