Regularization-free multicriteria optimization of polymer viscoelasticity model

This paper introduces a multiobjective optimization (MOP) method for nonlinear regression analysis which is capable of simultaneously minimizing the model order and estimating parameter values without the need for exogenous regularization constraints. The method is introduced through a case study in polymer rheology modeling. Prevailing approaches in this field tackle conflicting optimization goals as a monobjective problem by aggregating the individual regression errors on each dependent variable into a single weighted scalarization function. In addition, their supporting deterministic numerical methods often rely on assumptions which are extrinsic to the problem, such as regularization constants and restrictions on parameter distribution, thereby introducing methodology-inherent biases into the model. Our proposed non-deterministic MOP strategy, on the other hand, aims at finding the Pareto front of all optimal solutions with respect not only to the individual regression errors, but also to the number of parameters needed to fit the data, automatically reducing the model order. The evolutionary computation approach does not require arbitrary constraints on objective weights, regularization parameters or other exogenous assumptions to handle the ill-posed inverse problem. The article discusses the method's rationale, implementation, simulation experiments, and comparison with other methods, with experimental evidence that it can outperform state-of-the-art techniques. While the discussion focuses on the case study, the introduced method is general and immediately applicable to other problem domains.


Introduction
In physical sciences and engineering, an inverse problem denotes the goal of formulating a causal description of a given phenomenon from a set of observed experimental output data, which mathematically corresponds to formulating a model capable of predicting the value of one or more dependent variables based on the values of one or more independent variables. Inverse problems lie at the heart of the system identification field and play a role in countless application domains such as physics, chemistry, signal processing and machine learning, to name a few. One recognized challenging aspect of the inverse problem is that its real-world instances are often ill-posed, that is, they belong to the class of problems which do not satisfy Hadamard's well-posedness properties that (a) a solution exists, (b) the solution is unique, and (c) the solution depends continuously on the data. When the model involves several dependent variables, one main issue is that minimizing the regression errors of each of these variables often turns out to be a set of conflicting objectives. When this is the case, one practical way to address the challenge is to approach it as a monobjective problem, by aggregating all regression errors into a single scalarization function, e.g. a linear combination. Assigning weights to this objective function, however, introduces more bias into the model, and implies an a priori decision on which individual objectives are prioritized.
This paper introduces a multiobjective optimization (MOP) method for nonlinear regression analysis aimed at simultaneously minimizing the model order and estimating parameter values without the need for exogenous assumptions. Our proposed non-deterministic MOP strategy aims at finding the Pareto front of all optimal solutions with respect not only to the individual regression errors, but also to the number of parameters needed to fit the data, allowing for an adaptive reduction of the model order. The approach does not require arbitrary constraints on objective weights, regularization parameters or other exogenous assumptions to handle the ill-posed inverse problem. The method's key rationale is introduced in the context of an industrial case study, which illustrates the problem and the strategy toward its solution. The sections which follow discuss the method's rationale, implementation, simulation experiments, and comparison with other methods, with experimental evidence that it can outperform state-of-the-art techniques. While focused on the case study, the method is general and immediately applicable to other problem domains.

Polymer rheology modeling
With major importance in industry, the viscoelastic properties of polymers have been extensively studied over the last decades, and a great deal of scientific research work has been devoted to the development of analytical models capable of describing such phenomena. The time-dependent relation between stresses and strains in the material is usually modeled, using the Boltzmann superposition principle, by a constitutive equation

τ(t) = ∫_{−∞}^{t} G(t − t′) γ̇(t′) dt′

which correlates the stress tensor τ with the linear relaxation modulus G(t) and the rate of deformation tensor γ̇ [1,2]. The relaxation modulus, in turn, is often analytically represented by the generalized Maxwell model, as a sum of exponential terms

G(t) = Σ_{i=1}^{n} G_i e^{−t/λ_i} (2)

in which λ_i and G_i comprise the model parameters corresponding, respectively, to the relaxation times and their corresponding weights. The whole set of n pairs (G_i, λ_i) is known as the discrete relaxation spectrum of the material. In a common scenario, those model parameters are to be empirically determined from experimental data. Small-amplitude oscillatory shear (SAOS) [3], which is among the prevailing methods aimed at this purpose, consists in applying a sinusoidal shear strain to a sample of the investigated material and measuring the phase difference between this excitation and the stress response of the material at a particular frequency ω_j. This frequency-domain experimental method produces as outcome the material's linear viscoelastic properties measured as the storage modulus G′(ω_j) and loss modulus G′′(ω_j), which map into the time domain as

G′(ω) = ω ∫_0^∞ G(t) sin(ωt) dt,  G′′(ω) = ω ∫_0^∞ G(t) cos(ωt) dt

One may then express both the storage and the loss moduli as

G′(ω_j) = Σ_{i=1}^{n} G_i (ω_j λ_i)² / (1 + (ω_j λ_i)²) (4)

G′′(ω_j) = Σ_{i=1}^{n} G_i ω_j λ_i / (1 + (ω_j λ_i)²) (5)

In practice, the experimental procedure consists in varying the frequency ω_j along a discrete set of n values, and measuring the corresponding values of G̃′(ω_j) and G̃′′(ω_j). The modeling challenge may then be addressed as a nonlinear regression problem in which a set of n pairs (G_i, λ_i) is sought such that the theoretical G′(ω_j) and G′′(ω_j) of Eqs. (4) and (5) best fit the empirically measured 1 G̃′(ω_j) and G̃′′(ω_j). The solution so determined thus offers an estimated set of parameters for Eq. (2). The meaning of ''best fit'' in the former sentence may be quantified by any metric for the regression residuals, such as the customary mean square error, or another convenient choice.
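As a concrete illustration, the sums in Eqs. (4) and (5) are straightforward to evaluate numerically. The sketch below (a minimal Python rendering, with hypothetical spectrum values chosen only for illustration) computes the theoretical moduli for a given discrete relaxation spectrum:

```python
def storage_modulus(omega, spectrum):
    """Eq. (4): G'(omega) for a generalized Maxwell model.
    `spectrum` is the discrete relaxation spectrum, a list of (G_i, lambda_i) pairs."""
    return sum(g * (omega * lam) ** 2 / (1.0 + (omega * lam) ** 2) for g, lam in spectrum)

def loss_modulus(omega, spectrum):
    """Eq. (5): G''(omega) for a generalized Maxwell model."""
    return sum(g * (omega * lam) / (1.0 + (omega * lam) ** 2) for g, lam in spectrum)

# Single-mode sanity check: at omega = 1/lambda_i both moduli equal G_i / 2.
spectrum = [(1000.0, 0.1)]
print(storage_modulus(10.0, spectrum), loss_modulus(10.0, spectrum))  # 500.0 500.0
```

At the crossover frequency ω = 1/λ_i of a single Maxwell mode, both moduli reduce to G_i/2, which makes a convenient sanity check for any implementation.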

State-of-the-art review
Mathematically, the determination of the parameters which fit Eqs. (4) and (5) is an instance of the aforementioned inverse problem. Not only a notably nontrivial theoretical challenge in itself, the special case of Eq. (2) has long been recognized as ill-conditioned [4,5], meaning that the model is very sensitive both to any method bias and to noise in the experimental instrumentation. Even minor variations in each of those may yield major effects on the estimation of the regression parameters, potentially resulting in high regression errors and spectra with unrealistic features [3,5].
To handle ill-posedness, auxiliary strategies may be invoked. Building upon Laun's [6] formulations, Orbey and Dealy [3] discuss a linear-regression method which requires the number of parameters n to be preset, with relaxation times λ_i, i = 1, 2, . . ., n evenly spaced along a logarithmic scale. The authors compare the approach with a variant by Honerkamp and Weese [5], which introduces Tikhonov regularization as an additional constraint. Also discussed is a third alternative based on non-linear regression by Baumgaertel and Winter [7], who comment on the benefit of the Tikhonov method to improve solution convergence. Regularization is also used by Ramkumar et al. [8] in their approach based on a quadratic programming method capable of reducing the regression error and increasing curve smoothness.
A novel technique referred to as edge-preserving regularization was proposed by Roths et al. [9] in order to overcome the limitations of common regularization methods in coping with spectrum edges and large curvatures. Gerlach and Matzenmiller [10] compare two different numerical methods for the determination of relaxation spectra, one built on a gradient-based optimization algorithm and another combining Tschebyscheff approximation and Wolfe quadratic optimization. Both approaches require appropriate starting values for the parameters in order to converge to stable minima. Alternatively, the windowing method by Emri and Tschoegl [11] is a computational approach also built on the assumption of preset, evenly distributed spectrum lines on a logarithmic scale.
More recent works have investigated different approaches for the determination of polymer relaxation spectra. Arguing about the practicality and availability of experimental equipment and procedures, Zatloukal et al. [12] proposed using capillary steady shear data in place of SAOS. As with some aforementioned works, their analytical framework relies on non-linear regression and constraints on the parameter distribution. Guzmán et al. [13] introduced an approach which does not use regularization and is based on the double-reptation model as an alternative to the conventional inversion of the integral equations, relying only on a priori knowledge of an α parameter and dynamic moduli data.

1 The symbol ũ is used to denote empirically measured values of a variable u throughout this text.
Nonetheless, and in spite of some reported interest in the use of the continuous relaxation spectrum to describe polymer viscoelastic properties [14-16], the discrete formulation stands as the prevailing approach in the rheology framework, with small-amplitude oscillatory shear remaining an important experimental method both in scientific research and on the industry shop floor.

Evolutionary computation approach
To handle the nonlinearity and ill-posedness of the regression problem, the aforementioned and other analytical and computational deterministic methods described in the literature are, to a broad extent, reliant on some type of assumption that is extrinsic to the experimental data, such as heuristically tuned regularization parameters, or constraints on the number of parameters and their distribution. Meanwhile, in a distinctively different vein, a non-deterministic optimization method has been proposed [17] which relies neither on arbitrary or exogenous regularization parameters nor on technique-biased parameter distribution hypotheses. The introduced method, grounded on evolutionary computation theory, was successfully applied to the determination of the discrete linear viscoelastic relaxation spectrum from SAOS data, and experimental evidence indicates that it was able to fit the storage modulus G′ and the loss modulus G′′ of Eqs. (4) and (5) with significantly low regression error, requiring a low number of parameters (G_i, λ_i) relative to other reported methods.
Essentially, an evolutionary algorithm (EA) works not too differently from other iterative optimization methods in which a solution is successively improved at each step. Unlike conventional deterministic techniques, nevertheless, the EA approach evolves not a single solution, but a collection of many prospective solutions simultaneously. At each step, the algorithm selects a few from among the best solutions found so far, and, from these, it derives some other new solutions. Both these procedures, selection and derivation, embody a degree of randomness. The best solutions have a high probability, rather than certainty, of being picked; such a non-greedy strategy is key to avoiding an otherwise inevitable premature convergence to local optima. The procedure to derive new solutions from the selected ones is performed by combining the latter, such that offspring solutions are close in the solution space to their parent solutions, themselves already a product of best-fit selection, and hopefully may thus further approach an optimum in the objective space. Often, after recombination, mutations (minor random changes applied with low probability) are used to generate new solutions which would not be produced solely by recombination, thus allowing for further exploration of the search space. Summarizing, and resorting to the metaphor of natural evolution upon which the computational strategy is built: the successive choice of the ''best fit'' (evaluated by the objective function) and their recombination into similar solutions yield a selective pressure driving the convergence to optima in the solution space; probabilistic selection and random variations impose a contrary pressure which tends to avoid premature convergence and allows a better exploration of the solution search space.
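The selection-recombination-mutation cycle described above can be sketched, in highly simplified form, as follows. The toy problem, operators and parameter choices are illustrative only, not those of the cited works:

```python
import random

def evolve(fitness, init, recombine, mutate, pop_size=30, generations=200):
    """Bare-bones evolutionary loop for minimization: lower fitness is better."""
    pop = [init() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)
        # Parent selection: sample at random from the better half of the population.
        parents = ranked[: pop_size // 2]
        offspring = [mutate(recombine(random.choice(parents), random.choice(parents)))
                     for _ in range(pop_size)]
        # Survivor selection: keep the best individuals of the union.
        pop = sorted(pop + offspring, key=fitness)[:pop_size]
    return min(pop, key=fitness)

# Toy usage: minimize (x - 3)^2 over the reals.
best = evolve(
    fitness=lambda x: (x - 3.0) ** 2,
    init=lambda: random.uniform(-10.0, 10.0),
    recombine=lambda a, b: (a + b) / 2.0,    # offspring close to its parents
    mutate=lambda x: x + random.gauss(0.0, 0.1),
)
print(round(best, 1))  # close to 3.0
```

Note that this sketch uses deterministic truncation for survivor selection; the probabilistic variants discussed in the text differ only in how `parents` and the surviving population are sampled.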
In their paper, Monaco et al. [17] describe the realization of this approach in the form of a conventional genetic algorithm (GA) [18] modified to work with real-valued genes, each a 2-tuple (G_i, λ_i). The method was implemented as a computer program which takes experimentally measured G̃′ and G̃′′ data and produces as output a set of n parameters (G_i, λ_i), along with the computed regression errors, as in Eqs. (7) and (8). Using data sets from the literature, the evolutionary algorithm was capable of reproducing Laun's [6] results for LDPE material with n = 8 when abiding by the constraint of preset, logarithmically distributed λ. Under the same restriction, a result equivalent in curve smoothness to that obtained by Ramkumar et al. [8] with n = 41 parameters was achieved with regression residuals as low as two orders of magnitude below those reported by the latter work. When the constraint on λ spacing was removed, the genetic algorithm reproduced Laun's [6] result with a 20% lower regression residue, and that of Ramkumar et al. [8] with residuals below the measurable precision limit and equivalent smoothness. Moreover, the authors replicated this computational experiment with different choices of n and could achieve both regression residuals and curve smoothness equivalent to those of the n = 41 evenly log-scale distributed λ found by said work [8], but with as few as n = 8 free λ (unconstrainedly distributed relaxation times). Results for other materials, including HDPE, PB and the Honerkamp and Weese [5] synthetic data, were also produced; in those cases, the genetic algorithm needed, respectively, n = 4, 6 and 7 parameter pairs to outperform the reported results and, respectively, n = 8, 10 and 10 to reach the minimum regression residual, beyond which no improvement was seen by increasing n.
It is fair to remark, at this point, that whereas such a positive outcome might not be assured beforehand, bearing in mind the approach's inherent non-determinism, it is nevertheless not absolutely surprising once it is taken into account that the fewer the constraints imposed, the broader the range of possible solutions. In this case, the problem admits superior (better-fitted) choices of regression parameters outside the evenly log-distributed λ restriction. Indeed, the self-sufficiency to explore the solution search space without the need for arbitrary, extrinsic constraints is a noteworthy inherent virtue of the evolutionary computational approach. Evolutionary algorithms (of which genetic algorithms are an example) are known as a metaheuristic, meaning that they are not a specific-purpose prescription for a particular problem, but rather a generic problem-solving mechanism which can be tailored to different domains. Given a codification scheme to represent prospective solutions (the chromosome), a procedure to derive new solutions from existing ones (recombination and mutation), and an algorithm to compare the quality of two solutions (the fitness function), a GA is essentially a well-reasoned trial-and-error strategy, grounded on natural evolution theory and backed by positive empirical evidence from both biological and artificial systems, implemented as a non-deterministic iterative optimization algorithm.

The multiobjective problem
The issue motivating the present research work concerns the fact that the goal of determining the parameters that minimize the regression residues of both the G′ and G′′ theoretical curves translates into a multiobjective optimization problem (MOP). That is because the same set of parameters which minimizes the residue of Eq. (4) may not be the best choice to minimize the residue of Eq. (5). If we associate with each individual regression a metric of quality which quantifies the resulting residues, we may define two objective functions f_1 and f_2. As is usual in real multiobjective optimization problems, in the problem under investigation both objectives are conflicting.
One straightforward strategy to tackle the goal of minimizing m objective functions at the same time is to address it as a monobjective problem by means of scalarization. Under this approach, if a solution x is sought which minimizes f = {f_1(x), f_2(x), . . ., f_m(x)}, then a scalar function g(x, w) is chosen as the new monobjective function to be optimized. For instance, the weighted sum method corresponds to a linear combination of the individual objective functions f_i, g(x, w) = Σ_{i=1}^{m} w_i f_i(x), one very simple, yet meaningful scalarization function. The set of weights w = {w_1, w_2, . . ., w_m} may be tuned to reflect the relative importance of each objective. The exploration of the solutions' search space is then narrowed and driven around the possibilities which conform to the chosen weights. This is an example of an a priori multiobjective problem-solving technique, in which the decision maker assigns optimization priorities beforehand, and it accounts for the strategy exploited in the aforementioned work [17]. In the present paper we extend those results by tackling the same challenge as a strictly multiobjective optimization problem. That is, we aim at finding not one single best solution, but the set of solutions which are optimal with respect to all possible weight assignments, thus allowing for a posteriori decision making on the particular solution to be chosen. This collection of optimal solutions is commonly referred to as the Pareto set. The concept may be intuitively depicted as in Fig. 1, addressing a generic bi-objective problem. In the plot, the two objective functions f_1 and f_2, indicated by orthogonal axes, define the objective space. In this bidimensional example, each dot on the solution plane corresponds to a particular set of parameters in the feasible solution space, which may have a different dimension.
As in the running case study, where the objective functions are regression residuals, let us suppose that the goal is to minimize both f_1 and f_2. In Fig. 1(a), solution S1 is clearly better than solution S2, seeing that the former minimizes both objectives relative to the latter. In the optimization field's terminology, S1 is said to dominate S2. A similar conclusion is not, however, so immediate when comparing S1 to S3, as each one is better with respect to one of the objectives, bringing up the fundamental challenge of defining ''optimum'' in a multiobjective problem. The formal approach to this dilemma is to define that a given solution is dominated if there is another solution which is no worse than it with respect to every objective and strictly better with respect to at least one of them. In the example of Fig. 1(a), both solutions S1 and S3 are dominated. On the other hand, solution S4 is nondominated, inasmuch as there is no other solution which is better than it regarding any objective. In the plot of Fig. 1(a), white dots are dominated solutions, whereas black dots are nondominated ones. If over the whole solution search space there are no feasible solutions that dominate the ones shown in the plot, the nondominated solutions are considered optimal solutions, meaning that any attempt to improve them with respect to one objective would degrade another objective. The frontier on which the optimal solutions lay, depicted in Fig. 1(b), is known as the Pareto front.
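In code, the dominance test and the extraction of the nondominated set (for minimization) can be sketched as below. The sample points are hypothetical, merely mimicking the S1 to S4 situation described for Fig. 1:

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization): a is no worse
    in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(points):
    """Filter a collection of objective vectors down to its nondominated subset."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical points: S1 dominates S2; S1 and S3 do not dominate each other;
# S4 dominates all of them.
S1, S2, S3, S4 = (2, 3), (4, 5), (3, 2), (1, 1)
print(dominates(S1, S2), dominates(S1, S3))  # True False
print(nondominated([S1, S2, S3, S4]))        # [(1, 1)]
```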
In possession of the Pareto front, the problem analyst may choose among all optimal solutions, selecting whichever is more appropriate based on a particular assignment of objective priorities. For instance, they can prefer a solution which balances all objectives, or else one which prioritizes more critical objectives to the detriment of less important ones. In the problem at hand, the domain analyst may make decisions on the basis of the accuracy or precision associated with either the G̃′ or G̃′′ data sets, or any other sensible criteria.

Problem formulation
To comply with a usual preference in the literature, we herein quantify the regression quality as the mean absolute percentage error 2 ϵ, as hereafter denoted. Formally, for a set of d data values x̃_i, this metric is defined as

ϵ = (100/d) Σ_{i=1}^{d} |x̃_i − x_i| / x̃_i

with x̃ indicating the experimental data and x the value forecast by the model. 3 The addressed multiobjective problem may thus be stated as the goal of minimizing the objective functions ϵ_1 and ϵ_2 given by the expressions

ϵ_1 = (100/d) Σ_{j=1}^{d} |G̃′(ω_j) − G′(ω_j)| / G̃′(ω_j) (7)

ϵ_2 = (100/d) Σ_{j=1}^{d} |G̃′′(ω_j) − G′′(ω_j)| / G̃′′(ω_j) (8)

with P = {(G_i, λ_i), i = 1, . . ., n} being the set of n determined parameter pairs, subject to the constraint of Eq. (9), which delimits the feasible solution space the algorithm is allowed to explore.

2 In evaluating a prediction model, the mean absolute percentage error ϵ between a series of forecast values f_i and their corresponding actual values a_i is a measure of the absolute relative deviation between the actual and forecast values, i.e. the average of |f_i − a_i|/a_i over the n data records.

3 Some authors, e.g. [3,8], refer to this metric as the average absolute deviation (AAD). This, however, conflicts with conventional statistics terminology, which defines AAD as the average deviation of each individual variable with respect to the sample's mean, i.e. as a dispersion rather than a regression quality measurement. In this paper we abide by the more precise denomination of mean absolute percentage error, as commonly used in the statistics field.
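A minimal sketch of the ϵ metric underlying Eqs. (7) and (8), assuming the conventional percentage form of the mean absolute percentage error (the data values are hypothetical):

```python
def mape(actual, forecast):
    """Mean absolute percentage error between measured values and model
    predictions, expressed in percent (actual values must be nonzero)."""
    assert len(actual) == len(forecast)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical data: relative errors of 10% on each point average to 10%.
print(mape([100.0, 200.0], [110.0, 180.0]))  # 10.0
```

In the running problem, `actual` would hold the measured G̃′(ω_j) (or G̃′′(ω_j)) values and `forecast` the corresponding model predictions of Eqs. (4) and (5).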
In this paper, we exploit a multiobjective evolutionary algorithm to generate a Pareto set containing optimal solutions for the determination of the linear viscoelastic relaxation spectrum from experimental data. Each solution is a set of parameters (G_i, λ_i) which fits the theoretical curves of the storage and loss moduli given by Eqs. (4) and (5), and is optimal in the sense that it yields the best possible ϵ_1 for an admitted ϵ_2 and vice versa. Moreover, the method allows for including the number of parameters n as one of the multiple objectives to be optimized, thereby enabling the decision maker to appreciate the trade-offs between regression quality and model order. Low-order models are advantageous for reducing the cost and processing time of computationally intensive simulations.

The algorithm MOEA/D
When tackling the running case study by means of an evolutionary strategy, Monaco et al. [17] outlined the working principle of a monobjective GA in the form of a simple algorithm, adapted in Fig. 2.
Under the metaphor of genetic algorithms, the initial population of step (a) is a randomly generated set of ρ prospective solutions encoded in an appropriate form. In the running case study, an individual is a candidate solution of the regression problem encoded as an ordered sequence of n pairs (G_i, λ_i), which plays the role of a chromosome. In step (b), the fitness function is the regression quality, which the authors assumed as the simple average between ϵ_1 and ϵ_2, given by Eqs. (7) and (8). In step (c), a ''mating pool'' is formed by the probabilistic selection of β ''best-fitted'' individuals, that is, those chromosomes (parameter sets) yielding lower average regression residuals. In step (d), each of the β new individuals is formed such that some have their genes, the (G_i, λ_i) pairs, closer to a ''father'' solution, whereas others have their genes closer to a ''mother'' solution among the ones in the mating pool, in a vague analogy to biological offspring. Finally, in step (e) the new individuals are compared with all individuals of the original population regarding fitness; the highly scored ones survive to the detriment of the worst fitted, again probabilistically. The process repeats until a stop criterion is reached, which, in the present example, is when the average regression error eventually ceases to show significant improvement at each new iteration. The GA output is a sequence of n pairs (G_i, λ_i) which best fit Eqs. (4) and (5). It is worth remarking that, as with any non-deterministic numerical method, the solution is not guaranteed to be the absolute optimum, but is a best-effort estimation.
Successive runs of the algorithm with varying initial conditions and random selection may yield numerically distinct results. The reported computational method was, however, evaluated as robust in the sense that the generated solutions, albeit different, result in equivalent regression quality. In contrast to this algorithm, a multiobjective evolutionary algorithm (MOEA) builds upon a different paradigm, as explained in Section 1.4. Even so, some general principles of evolutionary computational methods are shared between the mono- and multiobjective algorithms. Both start with an initial population and iteratively evolve it through successive steps of random selection of the best fitted, random recombination and variation, with parameters set to boost both selective pressure (convergence toward the optimum) and diversity preservation (solution space exploration). The main difference between a monobjective GA and a MOEA is that, while the former aims at driving a single individual towards a single optimum, possibly a scalar combination of several objectives, the latter aims at driving the whole population to settle along the Pareto front, considering all objectives independently.
Perhaps the most straightforward strategy to achieve this is to start from the very concept of dominance, and design a selection policy which probabilistically preserves the nondominated solutions to the detriment of the dominated ones. It is not difficult to see that this will tend to cause the surviving solutions to approach the Pareto front. One problem with this approach, which has long been noticed [19], is that this sole criterion does not prevent the tendency of all found Pareto-optimal solutions to cluster together in one region of the theoretical Pareto front, due to the inherent convergence pressure of the evolutionary process. Aiming at a more even distribution of solutions along the Pareto front, several techniques have been proposed [20], many of which embed into the selection mechanism some rule to prefer scattered over crowded solutions. A particularly ingenious one, suggested by Zitzler and Thiele [21], is to use a metric called the hypervolume, graphically explained in Fig. 3. The hypervolume is the area of the union of all rectangles with one vertex at a reference point Z and the opposite vertex at a candidate optimal solution. It is easy to see that the more evenly spread the set of Pareto solutions, the larger the resulting area. In the figure, Z is an arbitrary point that is at least slightly worse than the nadir point of the obtained Pareto front approximation.
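For the bi-objective case, the hypervolume of Fig. 3 reduces to a sum of rectangle areas and can be computed with a simple sweep. The front and reference point below are hypothetical:

```python
def hypervolume_2d(front, ref):
    """Area dominated by a 2-D Pareto front relative to reference point `ref`
    (minimization). Assumes the points are mutually nondominated and no worse
    than `ref` in both objectives."""
    area, prev_x = 0.0, ref[0]
    for x, y in sorted(front, reverse=True):  # sweep right to left in f1
        area += (prev_x - x) * (ref[1] - y)
        prev_x = x
    return area

# Hypothetical front and reference point Z = (4, 4).
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, (4.0, 4.0)))  # 6.0
```

In higher dimensions the union of hyper-rectangles no longer decomposes this simply, which is one source of the computational cost mentioned below.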
While effective, all the proposed techniques, including the hypervolume, are computationally costly, especially in higher-dimension problems with many objectives. Because of this, a different approach suggested by Zhang and Li [22] proposes to tackle the problem by means of a decomposition strategy.
Decomposition is well known in multiobjective optimization. Essentially, it consists in converting the MOP into a set of monobjective problems through some scalarization function. For instance, using the weighted sum method, a two-dimensional MOP with objectives f_1(x) and f_2(x) may be converted into a scalar optimization problem whose goal is to optimize (either maximize or minimize) a function g(x, w), where w is referred to as the weight vector and f = (f_1, f_2) is the objective-function vector. The optimal solution of the scalar function g with a particular weight vector w is a point on the Pareto front of the MOP. Therefore, by taking k weight vectors w_j ∈ {w_1, w_2, . . ., w_k} and optimizing each g(x, w_j) individually, one can obtain k Pareto-optimal solutions of the MOP. Generalizing the preceding reasoning into a concise formulation, the decomposed MOP may therefore be stated as follows.
In Eq. (10), let f(x) be a set of m objective functions whose domain Ω is the feasible solution space, i.e. where the solutions of practical interest lay. The objective functions' argument x may be multidimensional and comprises the set of decision variables concerning the problem. The MOP decomposition implies addressing the aggregation of objectives according to relative weights, and searching for the Pareto-optimal set requires several weight vectors. Let W, in Eq. (11), be this set of k weight vectors w_j, each of these a vector itself with m non-negative coordinates w_j^i which sum up to 1, as in Eq. (12). As for the scalarization method, we may define a generic operator S, as in Eq. (13),

S(f(x), w_j, . . .) = g(x, w_j, . . .) (13)

which associates the objective vector and a particular weight vector to a scalar function g(x, w_j, . . .); the ellipsis is left as a provision for scalarization functions depending on other parameters. For instance, the weighted sum g(x, w_j) = Σ_{i=1}^{m} w_j^i f_i(x) is an example of such a scalar operator. Solving the MOP/D (multiobjective problem with decomposition) system involves optimizing g for all w_j, j = 1, 2, . . ., k so as to obtain a well-distributed set of solutions as close as possible to the theoretical Pareto front.
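For the bi-objective case, the construction of the weight-vector set W of Eqs. (11) and (12) and the weighted-sum instance of the scalarization operator S can be sketched as follows. This is an illustrative sketch, not our MATLAB implementation:

```python
def weight_vectors(k):
    """k evenly spaced bi-objective weight vectors w_j, coordinates summing to 1
    (Eq. 12)."""
    return [(j / (k - 1), 1.0 - j / (k - 1)) for j in range(k)]

def weighted_sum(fx, w):
    """Weighted-sum scalarization: g = sum_i w_i * f_i(x)."""
    return sum(wi * fi for wi, fi in zip(w, fx))

W = weight_vectors(3)
print(W)                                     # [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(weighted_sum((2.0, 4.0), (0.5, 0.5)))  # 3.0
```

Optimizing `weighted_sum` once per weight vector in `W` yields one Pareto-optimal point per subproblem, which is the decomposition idea stated above.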
While determining the k optimal solutions in this way, one at a time, may be a simple and effective approach, it is not necessarily a very efficient one. Other methods have been proposed [20] to optimize all scalar functions simultaneously at each iteration. One such method, which we address in the present paper, is a rather clever mechanism proposed by Zhang and Li [22], named multiobjective evolutionary algorithm based on decomposition, MOEA/D. In general terms, the key idea of MOEA/D is to choose evenly distributed weight vectors and exploit the observation that, for two close weight vectors, the optimal solutions of their corresponding scalar functions tend to lie close together along the Pareto front, seeing that g_j is linear in w_j. MOEA/D takes advantage of this property by allowing recombination between evolving solutions of neighboring scalar functions, as measured by the Euclidean distance between weight vectors. That is, at each iteration of the algorithm, the evolution of each monobjective problem benefits from the progress of the neighboring problems, and as a result, the whole set of solutions converges to the Pareto front faster and in a well-distributed manner. The technical details and in-depth analysis of the MOEA/D algorithm are offered in the proponents' article [22].
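The mechanism can be sketched, in drastically simplified form, for a one-dimensional bi-objective toy problem. All numerical choices below (operators, parameters, test functions) are illustrative only, not those of the original algorithm or of our implementation:

```python
import random

def moead(f, bounds, k=20, s=5, t_max=150):
    """Simplified MOEA/D sketch: two objectives f = (f1, f2), one real decision
    variable in `bounds`, weighted-sum scalarization, neighborhood updates."""
    lo, hi = bounds
    # Evenly distributed weight vectors, coordinates summing to 1.
    W = [(j / (k - 1), 1.0 - j / (k - 1)) for j in range(k)]
    # B[j]: indexes of the s weight vectors closest to w_j.
    B = [sorted(range(k), key=lambda i: abs(W[i][0] - W[j][0]))[:s] for j in range(k)]
    X = [random.uniform(lo, hi) for _ in range(k)]  # one solution per subproblem
    g = lambda x, w: w[0] * f(x)[0] + w[1] * f(x)[1]  # weighted-sum scalarization
    for _ in range(t_max):
        for j in range(k):
            a, b = random.sample(B[j], 2)
            # Variation: recombine two neighbor solutions, then mutate.
            y = min(max((X[a] + X[b]) / 2.0 + random.gauss(0.0, 0.1), lo), hi)
            # Neighborhood update: y replaces any neighbor solution it improves.
            for i in B[j]:
                if g(y, W[i]) < g(X[i], W[i]):
                    X[i] = y
    return X

# Toy problem: f1 = x^2, f2 = (x - 2)^2; the Pareto set is x in [0, 2].
front = moead(lambda x: (x * x, (x - 2.0) ** 2), bounds=(-5.0, 5.0))
print(len(front), round(min(front), 1), round(max(front), 1))  # 20 solutions over roughly [0, 2]
```

The neighborhood update in the inner loop is the essential MOEA/D ingredient: an offspring generated for subproblem j is offered to all of j's neighbors, so progress on one scalar subproblem propagates to the adjacent ones.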

MOEA/D implementation
In the present research work, we have designed Algorithm 1 for the MOEA/D method.
The algorithm declares (line 2) W, an ordered set of k weight vectors w_j. Then, for each w_j, it defines a neighborhood with s ≤ k other vectors which, further on, will be optimized in combination, as outlined in Section 3. In the algorithm (line 3), B_j is the set of indexes corresponding to w_j's neighbor vectors.
Finally, P is the population (line 4) of sought-after solutions x_j. The procedure starts by initializing W (line 6) with k evenly distributed weight vectors satisfying the conditions of Eq. (12).
The neighborhood of each w_j is initialized (line 7) with the s nearest weight vectors, as given by their Euclidean distances, while the population P is initialized (line 8) with randomly generated solutions (solution vectors) x_i within the allowable bounds of the feasible solution space Ω. The population is evolved across t_max generations (iterations), as controlled by the main program loop (line 9). Within it, each solution x_j is visited and improved as follows. First, its neighbor solutions (corresponding to each scalar problem, and each weight vector) are combined (line 12); the exact variation operator V(B_j, W) by which this combination is performed is left open for flexibility. Then, the newly generated solution y produced at this step is compared to its neighbors (line 15) by means of their corresponding scalar functions g; again, the scalar operator S(f(x), w_j, . . .) used to derive g(x, w_j) is left open.
This comparison is the basis for deciding which solutions are replaced and which are retained (line 16). The parameter τ_max controls how many current solutions may be so replaced. Algorithm 1 was implemented using the MATLAB framework. 6
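As a concrete illustration, the control flow of Algorithm 1 can be sketched as follows. This is a minimal bi-objective Python sketch, not the paper's actual MATLAB implementation; the function names are ours, and `variation` and `scalarize` are placeholders for the abstract operators V(B_j, W) and S(f_j(x), w_j, ...):

```python
import numpy as np

def moead(f, k, s, t_max, tau_max, bounds, variation, scalarize, rng):
    """Minimal MOEA/D skeleton mirroring Algorithm 1 (bi-objective case)."""
    # line 6: k evenly distributed weight vectors w_j with w_1 + w_2 = 1
    W = np.array([[j / (k - 1), 1 - j / (k - 1)] for j in range(k)])
    # line 7: B_j holds the indexes of the s nearest weight vectors
    dist = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2)
    B = np.argsort(dist, axis=1)[:, :s]
    # line 8: random initial population within the feasible space Omega
    lo, hi = bounds
    P = rng.uniform(lo, hi, size=(k, len(lo)))
    F = np.array([f(x) for x in P])
    for _ in range(t_max):                        # line 9: main loop
        for j in range(k):
            y = variation(P[B[j]], rng)           # line 12: recombine neighbors
            fy = f(y)
            replaced = 0
            for i in B[j]:                        # lines 15-16: compare/replace
                if scalarize(fy, W[i]) < scalarize(F[i], W[i]):
                    P[i], F[i] = y.copy(), fy
                    replaced += 1
                if replaced >= tau_max:           # at most tau_max replacements
                    break
    return P, F
```

In this sketch the population and its objective values are returned together; any concrete recombination and scalarization scheme from the following sections can be plugged in through the two callable arguments.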

Choices and parameters controlling the algorithm
As mentioned, the choices of both the scalar and variation operators allow for flexibility and exploration through empirical investigation, as addressed in the following sections.
Another central aspect of an evolutionary algorithm implementation is the solution encoding scheme. This choice not only impacts the performance but also guides the design of the variation operators. In our implementation, we have opted for a quite simple scheme in which each prospective solution (a chromosome in GA jargon) of n parameters is represented by an array of 2 × n real values, with even positions corresponding to values of log(G_i) and odd positions to values of log(λ_i). This allows variation operators for continuous real-valued search spaces, already existing for monobjective EAs, to be used out of the box.

Algorithm 1: Our MOEA/D design.
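For instance, the encoding and its inverse can be sketched as below (an illustrative Python sketch using base-10 logarithms, consistent with the bounds on log10(G_i) and log10(λ_i) used later; the function names are ours, not part of the described implementation):

```python
import numpy as np

def encode(G, lam):
    """Chromosome of 2*n reals: even positions log10(G_i), odd log10(lambda_i)."""
    x = np.empty(2 * len(G))
    x[0::2] = np.log10(G)
    x[1::2] = np.log10(lam)
    return x

def decode(x):
    """Recover the n pairs (G_i, lambda_i) from a chromosome."""
    return 10.0 ** x[0::2], 10.0 ** x[1::2]
```

Working in log space lets the same real-valued variation operators explore parameters spanning many orders of magnitude.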
The algorithm reads as input two text files containing the experimental data in CSV format, 7 one for the storage modulus G′ and another for the loss modulus G′′. The procedure is also controlled by the choices of k, the number of desired solutions in the Pareto front; t_max, the number of generations (iterations) across which the solution population is to be evolved (optimized); and s, the size of each solution's neighborhood considered in the offspring (derived solution) generation. All these parameters may be empirically tailored for the particular problem and data sets; they will mostly affect convergence speed.

Implementation specifics
Aiming to verify the sensitivity of the result quality to the choice of the scalarization and variation operators, we have implemented a few existing alternatives.

Variation operators
As for the variation operators V(B_j, W), three well-known recombination methods were implemented.

Simulated binary crossover
Simulated Binary Crossover (SBX) [23] attempts to simulate the working principle of single-point crossover for binary strings. Although a crossover is typically performed between a pair of parent solutions to obtain a pair of offspring, a single offspring is produced in MOEA/D during recombination. Thus, the SBX operator can be expressed as y_i = (1/2)[(1 + β_i) x_i^(1) + (1 − β_i) x_i^(2)], with the values of β_i being obtained from the distribution β_i = (2u_i)^(1/(η_c+1)) if u_i ≤ 0.5, and β_i = (1/(2(1 − u_i)))^(1/(η_c+1)) otherwise, where u_i ∼ U(0, 1).

7 Comma-separated values. In our implementation each row is a data measurement, where the first column represents the frequency ω and the second column is the associated measurement.
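In code, the single-offspring SBX recombination may be sketched as follows (standard textbook form of the operator; the distribution index η_c matches the parameter settings reported later, and the function name is ours):

```python
import numpy as np

def sbx(x1, x2, eta_c=20.0, rng=None):
    """Simulated binary crossover, single-offspring variant (textbook form)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(len(x1))
    # spread factor beta_i drawn from the polynomial distribution
    beta = np.where(u <= 0.5,
                    (2.0 * u) ** (1.0 / (eta_c + 1.0)),
                    (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0)))
    return 0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2)
```

A useful sanity check is that identical parents always reproduce themselves exactly, since the β_i terms cancel.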

Differential evolution
A differential evolution (DE) [24] operator generates a new solution y by exploiting the differences between solution vectors in the population, and can be described by y_j = x_j^(r1) + F (x_j^(r2) − x_j^(r3)) if r_j < CR, and y_j = x_j otherwise, for j = 1, ..., n, where F is the scale factor, CR is the crossover probability and r_j ∼ U(0, 1).
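A DE/rand/1/bin-style sketch of this operator is given below (one common form of DE; the exact variant used is defined in [24], so treat the index handling here as illustrative):

```python
import numpy as np

def de(xr1, xr2, xr3, x, F=0.5, CR=1.0, rng=None):
    """DE variation: mutant v = xr1 + F*(xr2 - xr3), binomial crossover with x."""
    rng = np.random.default_rng() if rng is None else rng
    v = xr1 + F * (xr2 - xr3)                  # difference-based mutant vector
    r = rng.random(len(x))
    jrand = rng.integers(len(x))               # guarantee one gene comes from v
    return np.where((r < CR) | (np.arange(len(x)) == jrand), v, x)
```

With CR = 1.0, as used in the experiments reported later, the offspring is simply the mutant vector.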

Gaussian model
A multivariate Gaussian model (GM) [25] operator samples solutions from x ∼ N(µ, Σ), (17) where µ ∈ R^n is the mean vector and Σ ∈ R^(n×n) is the covariance matrix of the probability distribution. The mean of the Gaussian model is assumed to be the solution associated with the ith subproblem: µ = x^(i). (18) The elements of Σ are computed using neighboring solutions and an unbiased estimate of the covariance, for i, j = 1, ..., n, where K is the size of the considered neighborhood.
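A minimal sketch of this sampling step using NumPy (the function name and neighborhood argument are illustrative):

```python
import numpy as np

def gm_sample(x_i, neighbors, rng=None):
    """Sample x ~ N(mu, Sigma), Eq. (17): mu = x_i (Eq. (18)) and Sigma the
    unbiased covariance estimate over the K neighboring solutions."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(neighbors)                  # shape (K, n)
    Sigma = np.cov(X, rowvar=False)            # unbiased: divides by K - 1
    return rng.multivariate_normal(np.asarray(x_i, dtype=float),
                                   np.atleast_2d(Sigma))
```

Note that `np.cov` with `rowvar=False` treats each row as one solution and already applies the unbiased K − 1 normalization mentioned in the text.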

Polynomial mutation
In addition to the aforementioned recombination operators, a mutation operator was used in association with all three of those, as a subsequent step in the variation procedure. A Polynomial Mutation (PM) [26] is typically applied to introduce additional variations to newly generated solutions so as to ensure the exploration of new regions of the search space. This operator gives a higher probability that a new solution y is closer to, rather than far away from, the previous solution x. It can be expressed through a perturbation with the values of δ_i given by a polynomial distribution, for j = 1, ..., n, where r_j, u_j ∼ U(0, 1), whereas p_m ∈ [0, 1] and η_m > 0 are control parameters.
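The mutation step may be sketched as follows (the simple boundary-independent textbook form of PM; the variable names are ours, and the perturbation is scaled by the variable's range and clipped to the bounds):

```python
import numpy as np

def polynomial_mutation(x, lo, hi, p_m, eta_m=20.0, rng=None):
    """Polynomial mutation: each gene perturbed with probability p_m by a
    delta drawn from a polynomial distribution, then clipped to [lo, hi]."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.array(x, dtype=float)
    for j in range(len(y)):
        if rng.random() < p_m:
            u = rng.random()
            if u < 0.5:
                delta = (2.0 * u) ** (1.0 / (eta_m + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta_m + 1.0))
            y[j] = np.clip(y[j] + delta * (hi[j] - lo[j]), lo[j], hi[j])
    return y
```

Small values of |δ| are the most probable, which is what keeps mutated solutions close to their parents.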

Chebyshev method
The Chebyshev (CHB) method belongs to the group of weighted metric methods [27], which aim at minimizing the distance between a reference point z = (z_1, ..., z_m) and the feasible objective region [27]. The CHB scalarization is given by minimizing g(x, w) = max_{i=1,...,m} w_i |f_i(x) − z_i|. This method can find solutions in convex and nonconvex regions of the Pareto front. Its drawbacks are that it cannot distinguish weakly Pareto optimal solutions and does not provide a uniform distribution of solutions along the Pareto front.
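The weighted Chebyshev scalarization can be written compactly as (standard form; the sketch assumes minimization and a fixed reference point z):

```python
import numpy as np

def chebyshev(f, w, z):
    """g(x | w, z) = max_i w_i * |f_i(x) - z_i| over the m objectives."""
    return float(np.max(np.asarray(w) * np.abs(np.asarray(f) - np.asarray(z))))
```

Because only the worst weighted deviation matters, distinct solutions can share the same scalar value, which is the source of the weak-Pareto-optimality drawback mentioned above.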

Penalty boundary intersection method
The penalty boundary intersection (PBI) method [28] was developed in an attempt to generate a more uniform set of Pareto optimal solutions. Its scalarization minimizes g(x, w) = d_1 + θ d_2, where d_1 is the distance of f(x) − z projected along the direction of w and d_2 is the perpendicular distance to that direction. The major advantage of this method is that it can provide a reasonably uniform distribution of solutions along the Pareto front, though it comes at the cost of specifying the value of θ.
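The PBI scalarization can be sketched as follows (standard decomposition into a distance d_1 along the normalized weight direction and a θ-penalized perpendicular distance d_2):

```python
import numpy as np

def pbi(f, w, z, theta=5.0):
    """g(x | w, z) = d1 + theta * d2: d1 is the projection of f - z onto
    the normalized weight direction, d2 the perpendicular residual."""
    f, w, z = (np.asarray(a, dtype=float) for a in (f, w, z))
    wn = w / np.linalg.norm(w)
    d1 = float(np.dot(f - z, wn))
    d2 = float(np.linalg.norm(f - z - d1 * wn))
    return d1 + theta * d2
```

The penalty θ trades convergence (d_1) against adherence to the weight direction (d_2), which is why its value must be specified by the user.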

Weighted stress function method
The weighted stress function method (WSFM) [29] draws inspiration from the stress-strain behavior of thermoplastic vulcanizates. Using the weight w_i, the WSFM transforms the ith objective f_i into a stress σ_i. This requires f_i ∈ [0, 1] and w_i ∈ (0, 1), which is achieved by min-max normalization of each objective over the current population and a corresponding rescaling of the weights, for i = 1, ..., m, where f_i^min and f_i^max are the minimum and maximum values of the ith objective in the current population, whereas ϵ is a small positive value.
The WSFM scalarization aims at minimizing the largest stress associated with the given solution, with the stresses σ_i calculated from the normalized objectives and weights as defined in [29]. The particular feature of this method is that it results in nonlinear lines along which the search is performed.

Experimental setup
This section details a set of simulations aimed at exploring the computational method and summarizes their results. Relevant findings and pertinent discussions are highlighted in context along the text. The entire experimental framework, including the algorithm implementation, experimental data, and simulation results, is available from the project repository. 8

Experimental data
The study considers two types of polymer materials: Low Density Polyethylene (LDPE) and Polystyrene (PS). LDPE is a lightweight plastic material that has excellent resistance to water, moisture, and most organic solvents and chemicals. It is widely used for packaging and as a protective coating. PS is a polymer produced through the polymerization of the styrene monomer. It possesses good flow properties at temperatures safely below degradation ranges, and can easily be extruded, injection molded, or compression molded. Considerable quantities of polystyrene are produced in the form of heat-expandable beads containing a suitable blowing agent, which ultimately results in the familiar foamed polystyrene articles. The numbers of experimental measurements available for LDPE and PS are 57 and 109, respectively. 9 For the available data, Fig. 4 displays the plots of the storage modulus G′(ω) and the loss modulus G′′(ω) against the frequency ω.

Parameters and procedures
MOEA/D is a stochastic search algorithm. Due to this fact, the computational experiment involves multiple executions of the algorithm with different random number sequences, 10 and the obtained results are analyzed within a statistical framework.
For each experiment instance, 25 independent runs were performed with different random sequences, and individual results were averaged, when applicable. Concerning the parameters controlling the MOEA/D algorithm, the population size was set to k = 100 individuals (solutions) and the maximum number of generations (iterations) to t_max = 5000, as it was empirically verified that significant improvement is not observed beyond this point. The neighborhood size was set to s = 10 and the maximum number of individuals replaced by offspring to τ_max = 1. As for the feasible solution space Ω, values of G_i and λ_i were bounded such that log10(G_i) ∈ [−4, 10] and log10(λ_i) ∈ [−6, 6] for i = 1, ..., n, respectively. A minor customization of Algorithm 1 was implemented in which, with probability 1 − δ, the offspring of x_j is generated from the entire population P instead of from its neighbor solutions (B_j is substituted with {1, 2, ..., k} in line 12, and s with k in line 14); in the reported experiment, δ was set to 0.9. All these values were empirically tuned so as to speed up convergence, and should not interfere with the overall numerical result.
8 Project repository: https://gitlab.uspdigital.usp.br/research/moeadpolymer.
9 The experimental data was obtained . . . .
10 To produce random sequences out of deterministic computational procedures, programs usually rely on pseudo-random functions which compute different results upon successive calls, depending on the current value held in an internal state variable that is automatically updated at every invocation. The so-generated sequences shall fulfill general mathematical properties of randomness. Distinct random sequences can be produced by initializing the algorithm's internal state with an initial value often referred to as the random seed.
In some experiments, we analyzed the results through the hypervolume indicator [30]. As outlined in Section 3, this metric measures the volume of the objective space that is dominated by an approximation set and is bounded by a reference point.
It can be defined as the Lebesgue measure Λ of the union of the hypercuboids in the objective space spanned between each point of the set and the reference point, where A = {a_1, ..., a_|A|} is an approximation set and r is an appropriately chosen reference point. The hypervolume indicator can measure both the convergence and diversity of A. Higher values of I_H are preferable. For the sake of clarity, normalized values of I_H will be presented. With the algorithm configured, the MOEA/D implementation was fed with the LDPE and PS experimental data and a set of computational experiments was carried out, as detailed in the sections that follow.
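For the bi-objective case used here, I_H can be computed by a simple sweep (a minimal sketch for minimization problems; the general m-objective case requires more elaborate algorithms [30]):

```python
def hypervolume_2d(A, r):
    """Hypervolume of a 2-objective approximation set A w.r.t. reference
    point r (minimization): area of the union of rectangles [a, r]."""
    # sort by the first objective; only points dominating r can contribute
    pts = sorted(p for p in A if p[0] <= r[0] and p[1] <= r[1])
    hv, prev_y = 0.0, r[1]
    for x, y in pts:
        if y < prev_y:                 # point is non-dominated in the sweep
            hv += (r[0] - x) * (prev_y - y)
            prev_y = y
    return hv
```

Normalizing the result by the total area between the ideal and reference points yields the normalized I_H values reported in the figures.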

Scalarization and variation operators
When applying evolutionary algorithms to solve real-world problems, a few algorithm design decisions must be addressed. Those include the choice of appropriate operators and parameter settings. It is known that there is no general optimization strategy that works best for all problems and, therefore, the issue is typically approached through experimentation. In this study, we investigated the impact of variation and scalarization schemes when applying MOEA/D to the running problem of determining the relaxation spectrum. We considered the three different variants for both schemes introduced in Section 5. The values of the parameters were based on those frequently used in the literature and set as follows.
With respect to variation operators, SBX was used with a crossover probability of p_c = 1.0 and a distribution index of η_c = 20; DE was used with a crossover probability of CR = 1.0 and a scaling parameter of F = 0.5; GM depends only on the size of the neighborhood defined in MOEA/D. All these operators are used in conjunction with PM, with a mutation probability of p_m = 1/n and a distribution index of η_m = 20.
Testing the several combinations of variation and scalarization operators, we have empirically observed that the choice does not drastically impact the result, with an apparent slight performance advantage for the combination of SBX and WSFM. For the sake of illustration, Fig. 5 depicts the evolution of the normalized hypervolume 11 during the course of optimization when using different variation schemes.
One possible hypothesis to explain the advantage of the SBX variant over the other operators is that the method treats each gene in the chromosome separately, which may be useful for the exploration of multimodal functions with weak or no interactions between the variables. This somewhat corresponds to the characteristics of the problem at hand, as the pairs in Eqs. (4) and (5) can be viewed as separable and independent from each other. On the other hand, DE generates an offspring by using the difference between the parent chromosomes. Such a mechanism can exhibit invariance properties with respect to linear transformations of the search space, as the interactions between the variables are taken into account. However, this can come at the cost of slowing the convergence. Lastly, GM builds a probability model using a set of population members and uses a random realization of that model to generate the offspring. It can exhibit similar properties to DE. Nevertheless, the convergence can be further slowed down by randomly sampling from the probability density built.
Regarding scalarization operators, although the choice does not explicitly define the way in which new solutions are produced, it does determine the direction of the search by specifying which solutions are more fit to the given problem. The distinct feature of WSFM, which arguably explains its performance, is that the contour lines which define the directions of search in MOEA/D are nonlinear [29]. This can be useful not only for appropriately distributing the solutions along the complex geometry of the Pareto front in the final step, but also during the whole optimization run. Table 1 lists the parameter values used in the experiments.
For statistical analysis, we used a nonparametric Wilcoxon rank-sum test at a significance level of α = 0.05. The results of the test confirmed a statistically significant difference in the performance of the considered variants of MOEA/D. The similar trends observed for both LDPE and PS provide support for generalization. Thus, based on the obtained results, we selected the configuration of MOEA/D with SBX and WSFM for the remainder of this paper.

On the number of model parameters
One parameter which does not belong to the method, but to the problem itself, is the number of pairs (G_i, λ_i) chosen to model the physical system under investigation. This corresponds to the parameter n in the solution codification described in Section 1.3.
Our first approach to this issue was to empirically evaluate this choice by performing successive runs of the algorithm, starting with n = 1 and incrementing this value at each execution.
For each experiment run (each run consisting of 25 replications with varying random sequences), we computed the average hypervolume of the resulting Pareto set. Fig. 6 shows a graph in which the hypervolume is plotted against n. As can be noted, the quality of the solution set increases rapidly up to around n = 8 to n = 10. After this point, no significant improvement is observed by adding more parameters to the relaxation spectrum model. This result is consistent with those reported in the aforementioned paper [17] on the monobjective version of the herein introduced MOEA approach. In the referred article, researchers similarly found that the regression quality of the G′ and G′′ curves (using an equally balanced weighted-sum scalarization) reached its best with no more than 8 to 10 pairs for LDPE, HDPE and polybutadiene experimental data.
The corresponding Pareto sets may be graphically visualized in Fig. 7. The plots show the optimal solutions with reference to the regression residuals ϵ_1 and ϵ_2 (Eqs. (7) and (8)) for the storage and loss moduli, respectively. The corresponding numerical values for the regression residuals may be read in Tables 2 (LDPE) and 3 (PS). The tables list three representative points of the Pareto front, namely the extremes and the central one: S_1 has the lowest ϵ_1, S_2 has the lowest ϵ_2, and S_m has evenly balanced ϵ_1 and ϵ_2.
The S_m point, referring to the solution in the Pareto front having the smallest distance to (0, 0) with respect to the L_1 norm, is of particular interest as it represents an intermediate trade-off between the regression errors. The estimated parameters p_i = (G_i, λ_i) for S_m are shown in Tables 4 and 5. The lists tabulate the values found for n = 4, 8, 10 and 20 parameters. The same values may be graphically visualized in the plots of Fig. 8.
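Selecting the three tabulated representatives from a computed Pareto front can be sketched as below (an illustrative helper of ours; S_m is taken as the point of smallest L_1 distance to the origin, as defined in the text):

```python
import numpy as np

def representative_points(front):
    """Return S1 (lowest eps_1), S2 (lowest eps_2) and Sm (smallest L1 norm)
    from an array of (eps_1, eps_2) Pareto-front points."""
    front = np.asarray(front, dtype=float)
    s1 = front[np.argmin(front[:, 0])]
    s2 = front[np.argmin(front[:, 1])]
    sm = front[np.argmin(np.abs(front).sum(axis=1))]
    return s1, s2, sm
```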

Optimal solution length
Up to this point, we have provided evidence that, for the studied materials, no significant gain in regression quality is observable when using more than n = 10 pairs (G_i, λ_i). This was observed by manually varying the model parameter n across successive experiment runs. Again, while effective, this expedient does not take full advantage of a multiobjective optimization method such as the one exploited in this study. Indeed, it makes sense to approach the number of parameters as another optimization objective by transforming the problem into a 3-dimensional MOP. In this case, we may have ϵ_1, ϵ_2 and n as our objective functions. To that end, we modified the solution codification described in Section 4 by converting the 2n-length array into a 3n-length one. In this scheme, positions whose indexes are multiples of 3 play the role of an indicator function 1_{R+}(x_i) to determine 12 whether the corresponding pair (G_i, λ_i) preceding it should be used or not in the fitness evaluation, i.e. to fit Eqs. (4) and (5). This is a simple yet effective encoding that allows for the application of the same variation operators we had already been using. No modification to the MOEA/D implementation was required.
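The modified decoding may be sketched as follows (assuming the triple layout (log10 G_i, log10 λ_i, indicator) per pair; the function name and the positivity test implementing 1_{R+} are our illustrative choices):

```python
import numpy as np

def decode_variable_length(x):
    """3n-length chromosome: each triple holds (log10(G_i), log10(lambda_i),
    indicator); a pair enters the fitness evaluation only if indicator > 0."""
    triples = np.asarray(x, dtype=float).reshape(-1, 3)
    active = triples[:, 2] > 0.0          # indicator gene 1_{R+}
    return 10.0 ** triples[active, 0], 10.0 ** triples[active, 1]
```

Because the indicator is just another real-valued gene, the existing SBX, DE, GM and PM operators can manipulate it without any change.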

Loose-n variable-length simulations
As a preliminary exploration of this modified solution encoding, we first reran the same simulation protocol as before, with only the two optimization objectives ϵ_1 and ϵ_2. We left n a free parameter and did not consider it as an objective in the scalarization function to be minimized. Again, we ran the experiment with chromosome lengths varying in the range from 1 to 20. Since some parameters might be disabled, this length is actually an upper bound n_max on the effective number of parameters utilized by the solution. Fig. 9 shows the result, comparing the evolution of the hypervolume along the simulation between the fixed-length (preset-n) and the variable-length (free-n) solution versions.
In the plot, the abscissa axis indicates the maximum n allowed in the experiment with the variable-length version, and the exact n value with the fixed-length version. While the curve for the PS material does not add much with respect to the expected results, the plot for the LDPE material may come across as somewhat surprising at first glance. As can be noticed, up to n = 8, the Pareto-optimal solutions of the free-n experiment turned out superior to those found by the preset-n variant, for the same allowed number of parameters, consistently across the 25 replications.
Albeit not an immediately intuitive outcome, one conjecture which might be evoked to explain it is that, while the fixed-length simulation was forced to optimize exactly n model parameters, i.e. pairs (G_i, λ_i), the variable-length variant was allowed to get rid of excess parameters (by disabling them through the indicator gene) if those did not effectively contribute to the regression quality. A similar hypothesis had already been brought up in the paper on the monobjective EA [17], when the authors noticed their genetic algorithm implementation striving to reduce the numeric contribution of excess terms of Eq. (2) when the program was constrained to deal with an unnecessarily large n.
In order to investigate this finding further, we counted the effective number of model parameters n in the solution sets found. What we observed (Table 6) is that when the solution was bounded by a length of up to n_max = 8, all parameters were used. On the other hand, when we set n_max to larger values, the obtained Pareto fronts, in different executions, were composed of solutions with varying lengths, ranging from n = 10 to n = 14.
The interpretation of this result evokes the observation that if a gene does not effectively add to the individual's competitiveness within the population, there is no bias acting to preserve it along the generations. The indicator gene provides a quicker mechanism to ''turn off'' useless genes, whereas the fixed-length version requires the algorithm to gradually diminish their numerical significance across a longer succession of iterations. Nevertheless, since there is no active selective pressure to shorten the solution length, this effect results solely from random chance, and there is a non-negligible possibility that the population converges to some stable configuration and remains drifting around it indefinitely.
This phenomenon may be illustrated by Fig. 10. The plot shows the (G_i, λ_i) resulting from one of the simulation experiments in this series. For the LDPE material, all Pareto-optimal solutions ended up having n = 12 parameters, while for PS this number was n = 15 (regression residuals were minimally higher but compatible with the ones listed in Tables 2 and 3). With a very small mutation probability, this result has little chance of changing solely by random fluctuations. Are these the shortest solutions with the given regression quality? Based on the plot, chances are that the answer is negative. Indeed, bearing in mind that each parameter (dot on the plot) corresponds to a summand in Eq. (2), it is easy to recognize that the two outliers at the extremes of the scales (bottom-left and top-right corners) have little numerical impact, since either a small G_i or a large λ_i minimizes the term. Furthermore, if λ_i ≈ λ_j, then the two corresponding addends may be approximated by (G_i + G_j) e^(−(t−t′)/λ_i).

Table 5: Model parameters obtained for PS (log10 scale).
Upon this reasoning, there appears to be room for improvement regarding the optimization of the number of parameters. Adding a selective pressure to minimize n is the goal of the experiment which follows.

Three-objective formulation
Having tested the modified solution encoding with the indicator gene, we then designed a three-dimensional MOP, with the objectives of minimizing ϵ_1, ϵ_2 and n simultaneously. With the same algorithm setup and input data, we reran the experiment for LDPE, obtaining the result depicted in Fig. 11.
At the top (Fig. 11(a)), the plot shows the tridimensional Pareto surface with the distribution of the optimum solutions in the objective-function space; for the sake of clarity, only a single replication of the experiment was plotted. In this simulation, the best solutions regarding regression quality (lower ϵ_1 and ϵ_2) use n = 6 parameters; they appear closely agglomerated, with residuals larger than what is possible with larger solution lengths (large n). Pareto fronts with lower regression errors, on the other hand, are shorter and better balanced with respect to the regression errors in both storage and loss moduli. The resulting Pareto sets on the regression-residuals plane may be better visually appreciated in Fig. 12.
In this experiment, we maintained the same algorithm setup as before, with a population of size k = 100 to be evolved along t_max = 5000 generations. It should be noted, therefore, that now, with three objectives, Pareto-optimal solutions which remain in the final set due to dominance on parameter length (lower n) compete with other solutions which would be preserved based on superior regression error (lower ϵ_1 or ϵ_2). Therefore, the ''chance of survival'' of some solutions that are good with respect to regression quality is ''stolen'' from the finite population by those with shorter lengths. Likewise, the survival of solutions with larger n may be compromised by dominance on ϵ_1 or ϵ_2. The consequence is that the same population size and number of generations have a diminished potential to explore the solution search space.
A straightforward countermeasure for this scenario is to increase those values. We then reran the experiment allowing k = 300 solutions in the population, optimized along t_max = 15,000 generations. With more freedom to explore the solution space, we obtained the best solutions with balanced regression quality with n = 10 parameters, and regression residuals of ϵ_1 = 0.480 and ϵ_2 = 0.492, very close to what was obtained in the fixed-length experiment with n = 10 parameters, shown in Table 4.

Assessment of the regression quality
Founded on the rationales and empirical evidence discussed in the previous sections, a reasonably adequate configuration to assess the quality of the Pareto-optimal solutions obtained by our implemented method is one which minimizes ϵ_1 and ϵ_2 with a parsimonious fixed-length solution encoding. This not only speeds up the convergence but also reduces the burden of optimizing excess parameters, thus requiring both a smaller population size and a smaller number of iterations. The residuals so obtained are comparable to those reported in the literature so far. In a subsequent run of the same experiment, for a much larger number of iterations (t_max = 50,000), we could obtain even lower regression residuals for n = 10, reaching ϵ_1 = 0.0429 and ϵ_2 = 0.0441, practically equivalent to those reported by the authors [17] of the monobjective version of the EA method. This result is summarized in Table 7.
As mentioned, the said monobjective version already outperformed several analytical techniques to which it was compared in the cited work [17], which indicates that our multiobjective evolutionary approach deserves appreciation as a promising tool in the modeling of polymer viscoelasticity. It is also relevant to bear in mind that, while the herein introduced multiobjective method achieves similar superiority, its key proposed contribution is not necessarily to reduce the regression residuals even further, but rather to extend the solution from a mono- to a multiobjective perspective. In the monobjective approach, the relative weights of the residuals (η_1, η_2) are set a priori, and one single solution which minimizes their linear combination is found. If, later on, another set of weights is to be considered, another full execution of the optimization algorithm is needed. In the multiobjective version, on the other hand, a whole set of solutions with varying tuples (w_1, w_2) is evolved at once, each corresponding to a different set of weights. The choice of which solution is more interesting is an a posteriori decision, to be addressed by the decision maker based on what is more important to prioritize in each case.
One noteworthy remark about this result is that, in any regression problem, the unrealistic oscillatory behavior of a forecast model due to overfitting is inherently associated with excess parameters. While some deterministic methods aim at improving smoothness using arbitrary assumptions such as regularization or constraints on the parameters, our evolutionary adaptive method provides the basis for automatically reducing excess parameters to an adequate bound, relying only on the data's intrinsic information, thereby naturally optimizing curve smoothness.

Conclusions
The process of deriving a causal model to describe a phenomenon of interest mathematically translates into an inverse problem. In both scientific and industrial research, the associated regression analysis frequently involves the conflicting goals of minimizing the residuals on each individual dependent variable, which is often approached as a monobjective optimization problem in which one wishes to minimize an aggregate of those errors as quantified by a scalarization function, and where the weights are chosen by a priori importance assignment. More often than not, the order of the model, given by the number of regression parameters, is not well known in advance from theory and must be set empirically. In addition, the deterministic numerical methods underlying those techniques often rely on assumptions extrinsic to the problem, such as regularization constants and restrictions on parameter distribution. All those non-inherent factors represent sources of model bias. This paper introduces a multiobjective optimization (MOP) method for nonlinear regression analysis which is capable of simultaneously minimizing the model order and estimating parameter values without the need for exogenous regularization constraints. The proposed non-deterministic strategy, based on evolutionary computational methods, aims at finding the Pareto front of all optimal solutions with respect not only to individual regression errors, but also to the number of parameters needed to fit the data, automatically reducing the model order. The approach does not require arbitrary constraints on objective weights, regularization parameters, or other exogenous assumptions to handle the ill-posed inverse problem. Whilst the main ideas are introduced through a case study, the method is general and applicable to other problem domains, as discussed in the sections explaining the rationales, implementation, simulation experiments, and comparison with other methods, with experimental evidence that it can
outperform state-of-the-art techniques. In the interest of a multidisciplinary audience, the article lays out a detailed introduction to the essential concepts and techniques.
The determination of the discrete linear relaxation spectrum of polymers is a relevant subject in both scientific and industrial settings, where small-amplitude oscillatory shear is an important experimental method for this purpose. The experiment produces a series of measurements quantifying the material's storage and loss moduli at varying frequencies. These experimental data are then used to determine a set of regression parameters that should simultaneously fit the two equations describing the material properties. What makes this problem particularly challenging is that it brings about an ill-posed, nonlinear problem with two conflicting objectives. Most state-of-the-art research results and also state-of-practice methods suffer from several limitations resulting from the constraints imposed by analytical methods. Some require the artificial addition of excess degrees of freedom, others include arbitrary parameters, and others still impose constraints that rule out viable solutions. In many cases, the solution is overly complex or approximated by heuristics, and the results exhibit unrealistic features not consistent with the real material.
As an alternative, [17] proposed a non-deterministic optimization method based on evolutionary algorithms (EAs) which does not rely on extrinsic knowledge or arbitrary constraints. EAs can handle ill-posed problems naturally, as they rely neither on linearity nor on continuity hypotheses concerning the solution or objective spaces. EAs are based on an educated trial-and-error iterative strategy, through which a collection of prospective solutions is optimized in parallel by means of recombination and probabilistic best-fit selection. This characteristic also allows EAs to overcome premature convergence to local optima and to speed up the search for the global optimum. In their experimental study, Monaco et al. [17] successfully applied a genetic algorithm to determine the relaxation spectrum of a few polymers. The EA implementation was capable of outperforming the regression quality obtained by other methods, and with a lower number of model parameters. The goals of minimizing the regression residuals of both the storage and loss moduli are nevertheless conflicting objectives and, not unlike most of the related works described in the literature, the referred EA implementation approached the issue as a monobjective problem, tackling it as the minimization of an aggregate function of the residuals weighted by an arbitrary set of weights.
In this paper, we go a step further, exploiting the generalizable EA metaheuristics to face the same challenge as a strictly multiobjective problem. In other words, we introduce a method to obtain several Pareto-optimal solutions, leaving the choice of objective weights to the decision-making analyst. We designed our method as a multiobjective evolutionary algorithm based on decomposition, due to its convenient capability of generating a fair distribution of representative Pareto-optimal solutions and the possibility of using existing, well-known monobjective variation operators. The rationales and implementation aspects are explained in detail, and a set of experiments is reported which demonstrates the capabilities and limitations of the proposed method.
Comparing the results of the proposed method with those of related works is not straightforward, since our method produces a set of solutions (rather than a single one) of varying qualities regarding the two analyzed objectives, and it is not immediately clear which solution of the set should be taken as a basis for comparison. For the sake of practical illustration, though, the solutions we obtained with nearly equal regression residuals for both storage and loss moduli are numerically close to those reported in the referenced literature. In particular, we were able to approximately reproduce the results obtained by the mentioned monobjective EA [17] for the LDPE material with respect to both regression errors and number of model parameters (indeed, the literature [31] suggests that multiobjective optimization methods can outperform monobjective analogs by virtue of their lower scale sensitivity and susceptibility to local-optima traps, although subject to the limitations of the intrinsic experimental errors in the data sample). As in the said experiment, the solutions produced by MOEA/D have similar or lower regression residuals than those yielded by deterministic methods reported in the literature (Section 1.2), and with a lower number of parameters.
The exact sets of (G i , λ i ) fitting the materials used as case studies are not the central contribution of this paper; rather, it is the introduction of a method to generate the Pareto-optimal solutions. To the best of our knowledge, there is no other method in the literature which approaches the determination of the linear discrete relaxation spectrum of polymers as a multiobjective problem, and allows addressing the compromise between the regression qualities of both storage and loss moduli as a posteriori decision making, at the specialist's discretion and according to the relevant criteria for the specific application. Nonetheless, for the purpose of comparison with other works, we base our experiments on data sets referred to in the literature [17]. Should more accurate data sets produced by more modern equipment be analyzed, this can easily be achieved by applying the outlined algorithm implementation to produce reference parameter tables for practical use. Moreover, it is worth noting that, unlike deterministic methods, which produce a unique result for a given data set, evolutionary algorithms are inherently non-deterministic and therefore may produce different results at each run, even for the same data set, as a result of exploring the solution search space through different paths. The convergence property, however, implies that solutions tend to be equivalent in terms of the optimization objectives. While we have limited our experiments by a maximum number of iterations, a bound on the maximum regression error, or on the incremental optimization gain at each iteration, may alternatively be used as a stopping criterion.
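The notion of Pareto optimality invoked above can be made concrete with a short sketch: a solution belongs to the front only if no other solution is at least as good in both objectives and strictly better in one. The (ϵ1, ϵ2) residual pairs below are hypothetical, chosen only to illustrate the dominance test.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (minimization assumed for all objectives)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep only the nondominated solutions."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t is not s)]

# Hypothetical (eps_1, eps_2) residual pairs for candidate spectra.
candidates = [(0.5, 0.05), (0.4, 0.10), (0.6, 0.04), (0.7, 0.20)]
front = pareto_front(candidates)  # (0.7, 0.20) is dominated by (0.5, 0.05)
```

The a posteriori decision then amounts to picking one element of `front` according to whichever trade-off between the two residuals the specialist considers appropriate.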
One noteworthy outcome of our experiments is that, under less constrained conditions, when the algorithm is left free to choose the solution's length, it naturally tends to shed excess parameters, simultaneously reducing regression residuals and improving curve smoothness, based solely on the data's intrinsic information and without the need for a priori assumptions. As already pointed out, it is thus tempting to ponder whether the number of model parameters needed to represent the material's behavior may have some inherent physical meaning. In our experiments the solutions' length converged to around 8 to 10 parameters, which is consistent with previous findings [17]. In our study we considered the regression residuals as the primary quality metric to assess the results. Yet, we remarked that this is not the only parameter by which to evaluate a solution for this problem, as overfitting may produce low-residual solutions with quite unrealistic relaxation spectra, such as those with pathological oscillatory characteristics. One important feature of the proposed method is that the solution quality measure is not intrinsic to it, as is the case with several analytical techniques. In the EA approach, by contrast, the role of evaluating solution quality is played by the fitness function, which is a generic black-box for implementing any quality criterion, wherefrom EAs are considered metaheuristics. Put shortly, this comes from the algorithm core being the repeated iteration of a sequence of steps: (a) evaluate all the solutions found so far in terms of how they fit the given criteria; (b) select (probabilistically) some best solutions; (c) produce new solutions by performing (small) random changes on the selected best-fitted; (d) if some of the newly created solutions are better than those in the original pool, replace the latter with the former. The procedure starts with a randomly generated solution pool and ends upon reaching some stopping criterion. In step (a), the criterion to evaluate a solution is any function. The implementation may leave it as a placeholder which the user later fills in with any function they desire; this modifies neither the algorithm nor the rest of the implementation. We have used Eq. (6) for fitness assignment; the domain analyst is free to choose other, possibly more sophisticated, criteria. The set of objectives of this MOP method is therefore user-defined, rather than inherent to the method, which yields great flexibility.
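The core loop in steps (a) through (d), with the fitness criterion left as a user-supplied black box, can be sketched as follows. This is a minimal illustrative elitist loop on a toy real-valued problem; the toy fitness function, population size, and mutation scale are hypothetical choices, not our actual implementation.

```python
import random

def evolve(fitness, n_pop=20, n_iter=200, sigma=0.1):
    """Minimal EA loop. `fitness` is any user-supplied callable to
    minimize; the algorithm itself is independent of the criterion."""
    pool = [random.uniform(-5, 5) for _ in range(n_pop)]  # random start pool
    for _ in range(n_iter):
        scored = sorted(pool, key=fitness)                 # (a) evaluate
        parents = scored[: n_pop // 2]                     # (b) select best
        children = [p + random.gauss(0, sigma)             # (c) small random
                    for p in parents]                      #     changes
        pool = sorted(parents + children, key=fitness)[:n_pop]  # (d) replace
    return min(pool, key=fitness)

# The fitness placeholder filled with a toy criterion (minimum at x = 2):
best = evolve(lambda x: (x - 2.0) ** 2)
```

Swapping the lambda for Eq. (6), or for any more sophisticated criterion, changes nothing else in the procedure, which is the flexibility described above.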
We believe that both the discussed rationale and the provided experimental evidence demonstrate evolutionary multiobjective optimization as a powerful resource for nonlinear regression analysis applied to the modeling of causal phenomena, useful not only for technique-unbiased estimation of model parameters, but also for optimizing the model order and investigating its structure, which may have implications for the theoretical understanding of its nature.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Pareto set as the solution of a multiobjective optimization problem.
In the equation, G′ and G′′ stand for the storage and loss moduli; d′ and d′′ account for the number of available experimental data points, respectively. The tilde symbol indicates experimental values and the corresponding tilde-less variables hold the values forecast by the model.
Figs. 11(b) and 11(c) show the projection of the Pareto surface on the ϵ1 × n and ϵ2 × n planes, respectively. This result shows that lower values of n produce lengthier Pareto fronts, i.e. a wider range of choices of regression parameters with varying relative combinations of ϵ1 and ϵ2.

Figs. 13(a) and 13(b) depict the result obtained for the LDPE material. The model was fit with n = 10 parameters (empirical lower bound of ϵ suggested by Fig. 8) from Table 4. The plots compare the experimental data with the corresponding values forecast by the fitted model. The regression residuals for the LDPE material are ϵ1 = 0.477 and ϵ2 = 0.0508, respectively, comparable to the best results

Fig. 11. Solutions of a free-n simulation in the objective space.

Table 1
Parameter values used in the experiments.

Table 2
Regression residuals for LDPE with varying number of parameters.

Table 3
Regression residuals for PS with varying number of parameters.

Table 4
Model parameters obtained for LDPE (log10 scale).

Table 6
Effective n for loose-n variable-length simulations.

Table 7
Comparison with other methods.