An Intelligent Decision Support System for Production Planning in Garments Industry

In this paper, we propose an Intelligent Decision Support System (IDSS) that combines prediction and optimization for production planning. We worked with a company that provides software for the garments industry and that had access to real-world data from a client that works with subcontractors. Using an Automated Machine Learning (AutoML) approach, we first target four predictive tasks that are crucial to estimate production planning indicators. Then, we use historical data and one of the predicted indicators to search for the best subcontractor allocation plan, which minimizes both the cost and production time, via an Evolutionary Multiobjective Optimization (EMO) algorithm (NSGA-II), achieving interesting results.


Introduction
Currently, industries are under pressure to increase efficiency (e.g., reduce operating costs and time) in order to compete in their markets. One way is to adopt an Intelligent Decision Support System (IDSS), which incorporates Artificial Intelligence techniques to provide actionable knowledge from raw data [2]. In this paper, we propose an IDSS for the garments industry that is based on the concept of Adaptive Business Intelligence (ABI) [11], which combines Machine Learning (ML), to predict relevant decision context variables, with Modern Optimization (MO) [7], to search for the best decision choices (according to one or more objectives).
There are some related works that employ MO methods to support production plans in the textile industry. For instance, in [1] Genetic Algorithms (GAs) were used to create production orders involving the spinning and weaving areas of the fabrication process. GAs were also adopted in [12] to optimize job orders of textile production lines. A combination of GA with Simulated Annealing was used by [13] to create energy-efficient production orders. In another study, NSGA-II was used by [10] to solve a multi-objective multi-site order scheduling problem in the production planning stage, considering multiple plants, multiple production departments and multiple production processes. More recently, [14] used a mathematical programming model to optimize textile production considering diverse "green" goals (e.g., waste reuse, energy recycling) and [3] optimized the master production scheduling using GA. To the best of our knowledge, none of these works adopted a data-driven ABI approach that combines predictive and prescriptive analytics, as provided by ML and MO algorithms. In this paper, we follow such an innovative ABI combination by using an Automated Machine Learning (AutoML) approach [9] to first predict four important garment subcontractor decision variables. Then, we adopt historical data and one of the predicted variables (production time) to feed an Evolutionary Multiobjective Optimization (EMO) algorithm that searches for the best subcontractor allocation plan, simultaneously minimizing the total allocation cost and time.

Garment Data
The data was provided by INFOS, which is a Portuguese software company that works with several textile industry clients. The company developed an Enterprise Resource Planning (ERP) system that supports the production of garments. The goal of this research is to develop an IDSS based on the ABI concept that will be integrated into the INFOS ERP system, allowing it to automatically design garment subcontractor plans regardless of the size of the company and the complexity of the production order. The subcontractor selection is a nontrivial task, since there is a large range of textile operations, each involving costs and delivery dates. We collected all company garment related records, including purchase and manufacturing orders, from 2016 to 2020. The data was then divided into three major groups: purchase of raw material, manufacturing and subcontractor. Next, we implemented an Extraction, Transformation, Load (ETL) process to select and clean the data (e.g., removal of missing features and records with wrong dates). All data processing procedures (including the ABI system) were implemented in the Python language by the authors. Table 1 describes the input features (Attribute), their description (Description), data Type, number of Levels and Domain values, separated by objective (four predictive targets and one optimization task). The final set of input features was obtained after several iterations of predictive task executions. The datasets for the predictive (regression) tasks include: Lead Time (3,315 records); Production Time (25,449 records); Production Waste (24,425 records); and Delivery Delays (6,016 records). Finally, the optimization objective (Production Plan) contains 5,500 records related to subcontractors.
Regarding the output target variables for the predictive tasks, the company does not keep records of them, so they had to be calculated: Ldtime was obtained by subtracting the order placement date from the order receiving date, and rows with a negative value were discarded; for Prod days, we created a function that subtracts the planned production start date from the production finish date and outputs the number of working days between the two dates, and rows with a negative number of days were discarded; for Waste ratio, we first subtracted the produced quantity from the quantity to produce and, if the resulting value was positive, changed it to zero, then divided the absolute result by the quantity to produce and multiplied the final result by 100; finally, for Delay days, we created a function that subtracts the scheduled delivery date from the actual delivery date and outputs the number of working days between the two dates, and negative values were changed to zero. Table 2 describes the four output target variables with their description (Description), data type (Type), number of levels (Levels) and domain values (Domain).
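The target calculations described above can be illustrated with a minimal sketch. It is not the code used in the IDSS: the function names are hypothetical, a working day is assumed to be Monday to Friday with no holiday calendar, the day count covers the start date but not the end date, and the lead time is assumed to be measured in calendar days. A returned None marks a row that would be discarded.

```python
from datetime import date, timedelta


def working_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days in [start, end); negative if end precedes start.

    Assumption: no holiday calendar is applied.
    """
    if end < start:
        return -working_days_between(end, start)
    days = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days += 1
        current += timedelta(days=1)
    return days


def lead_time(placed: date, received: date):
    """Ldtime: days from order placement to reception (None = discard the row)."""
    days = (received - placed).days
    return days if days >= 0 else None


def production_days(planned_start: date, finished: date):
    """Prod days: working days from planned start to finish (None = discard)."""
    days = working_days_between(planned_start, finished)
    return days if days >= 0 else None


def waste_ratio(qty_to_produce: float, qty_produced: float) -> float:
    """Waste ratio (%): only production above the requested quantity counts."""
    diff = qty_to_produce - qty_produced
    if diff > 0:  # under-production is not treated as waste
        diff = 0
    return abs(diff) / qty_to_produce * 100


def delay_days(scheduled: date, delivered: date) -> int:
    """Delay days: working days of delay, with early deliveries set to zero."""
    return max(0, working_days_between(scheduled, delivered))
```

For instance, under these assumptions an order planned to start on a Monday and finished on the following Monday yields five production days.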
In terms of preprocessing, since the String variables had a high cardinality, we employed a Label Encoder in order to transform each level into a distinct numeric value. This option provided better results when compared with the well-known One-Hot encoding, which created a very high number of input features. As for the Date features, we adopted the proleptic Gregorian ordinal of a date, which provides a simpler numeric value. Then, all numeric inputs were normalized by using a z-score standardization.
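The three preprocessing steps can be sketched with the Python standard library alone. This is an illustration rather than the pipeline used in the IDSS: the helper names are hypothetical, the label codes are assumed to follow the sorted order of the levels, and the z-score is assumed to use the population standard deviation.

```python
from datetime import date
from statistics import fmean, pstdev


def label_encode(values):
    """Map each distinct level to an integer code (levels sorted alphabetically)."""
    levels = sorted(set(values))
    mapping = {level: code for code, level in enumerate(levels)}
    return [mapping[v] for v in values], mapping


def date_to_ordinal(d: date) -> int:
    """Proleptic Gregorian ordinal: day 1 is 0001-01-01."""
    return d.toordinal()


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:  # constant feature: avoid a division by zero
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]
```

A label encoder keeps one numeric column per String attribute, whereas One-Hot encoding would add one column per level, which is why the latter inflates the input space for high-cardinality attributes.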

Intelligent Decision Support System
The proposed IDSS contains three main modules (Figure 1): data extraction and processing, prediction, and optimization. The first module is responsible for receiving the garment data, selecting the features for each objective and then creating the necessary input for prediction. The prediction module receives the data separated by predictive task and splits it into training and test sets (data separation, according to the adopted cross-validation method). Then, it trains the predictive models (model training), evaluates their performance (model evaluation), and selects and stores the best prediction model (model selection). Next, the user inserts the data related to the lead time, and the respective model predicts the number of days it will take to receive the raw materials; the user can also define a start and end date for production. Finally, the optimization module receives the subcontractor data (Table 1), filtered by the product to manufacture and the textile operations to execute, together with the quantity to produce and the maximum allowed dates (all provided by the user). Then, the MO algorithm uses this data and also one of the predicted indicators (production time) to search for the best subcontractor quantity allocation, aiming to reduce the total costs and time.
To reduce the modeling effort during the development of the prediction module, we adopted the H2O AutoML tool, which provided good results in a recent AutoML benchmark study [9]. The AutoML was configured to automatically select the best regression model and its hyperparameters based on the lowest Mean Absolute Error (MAE), using an internal 10-fold cross-validation applied over the training data. Five different ML algorithms were searched by the tool: Random Forest, Extremely Randomized Trees, Generalized Linear Models, Gradient Boosting Machine and two Stacked Ensembles, one with the best model of each family and the other with all trained models. An external 10-fold cross-validation was executed to evaluate the ML models, and the quality of the regression was assessed by using the MAE and Normalized MAE (NMAE) metrics. The lower the values, the better the predictions. The NMAE measure normalizes the MAE by the range of the output target on the test set; thus, it provides a percentage that is easy to interpret and that is scale independent.
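As a small illustration of the two metrics, the sketch below computes the MAE and the NMAE for one test fold. The function names are ours and the code is independent of the H2O tool; it only assumes that the NMAE is expressed as a percentage of the target range observed on the test set, as described above.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error between observed and predicted target values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def nmae(y_true, y_pred):
    """MAE divided by the range of the observed test targets, in percentage."""
    target_range = max(y_true) - min(y_true)
    return 100.0 * mae(y_true, y_pred) / target_range


# Example fold: absolute errors of 2, 2, 3 and 3 days over a 40-day target range.
observed = [10, 20, 30, 50]
predicted = [12, 18, 33, 47]
fold_mae = mae(observed, predicted)
fold_nmae = nmae(observed, predicted)
```

In an external 10-fold scheme, these values would be computed once per fold and then averaged.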
A production order can be defined as a composition of tasks that are executed sequentially. Each task can be represented by a set of candidate subcontractors offering similar services, where each service can have a different price and quantity per subcontractor. The subcontractor allocation is defined as a multi-objective task (i.e., reduce both cost and time), thus we employ a Pareto approach via an EMO algorithm, namely NSGA-II [7], as implemented in the pymoo Python module [4]. NSGA-II is a multi-objective optimization algorithm with three distinctive features: a fast non-dominated sorting approach, a fast crowded distance estimation procedure and the usage of a simple crowded comparison operator [8]. When compared with hypervolume-based algorithms (e.g., SMS-EMOA), the NSGA-II algorithm tends to obtain competitive results when only two or three objectives are optimized [6]. The algorithm returns a population of non-dominated solutions, each representing a different subcontractor allocation that is associated with a distinct cost-time trade-off. The full subcontractor optimization can be defined in terms of x sequential textile operations that need to be executed. For each operation, there are y candidates (subcontractors) with different price and capacity parameters. Each solution is naturally represented as a sequence of integer values q_i (0 ≤ q_i ≤ q_max), denoting the quantity assigned to each subcontractor i, where q_max denotes the total required quantity for the operation, i ∈ {1, ..., M}, and M represents the number of subcontractors available for that operation. We repair solutions by ignoring any excess of subcontractor allocation (the first allocated subcontractor is served first) or by randomly distributing the deficit allocation to any of the available subcontractors. Each solution is evaluated in terms of total production plan cost and allocation time.
To compute these two goals, the EMO algorithm uses the production time prediction (as shown in Figure 1). Once the Pareto curve is optimized, and for the trade-offs selected by the user, we compute the predictions of the remaining targets (e.g., production waste), such that the user can further inspect the quality of the obtained solutions. In order to obtain a single measure per Pareto curve, we selected the Hypervolume (HV) measure, which represents the volume of the objective space that is dominated by the Pareto curve when assuming a "worst" reference point [5]. The higher the HV value, the better the Pareto curve optimization.
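For two minimization objectives, the HV measure reduces to an area that can be computed with a simple sweep over the sorted points. The sketch below is an illustrative implementation with a hypothetical function name, not the routine used in our experiments; it assumes that both objectives are minimized and that points not strictly better than the reference point contribute nothing.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (minimization)."""
    ref_x, ref_y = reference
    # Keep only points that are strictly better than the reference point,
    # sorted by the first objective (ties broken by the second objective).
    inside = sorted((x, y) for x, y in points if x < ref_x and y < ref_y)
    area = 0.0
    best_y = ref_y
    for x, y in inside:
        if y < best_y:  # non-dominated so far: it adds a new rectangle
            area += (ref_x - x) * (best_y - y)
            best_y = y
    return area
```

Dominated points are skipped by the sweep, so the function can be applied directly to a whole population and not only to its non-dominated subset.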

Experiments and Results
The average predictive results of the external 10-fold iterations (in terms of MAE and NMAE) are presented in Table 3. The table also presents the best ML model. In general, low regression errors were achieved, with the NMAE values ranging from 3.6% to 9.2%. We particularly note that the best NMAE value was obtained for the target that is directly used by the NSGA-II MO (Prod days produces an average NMAE of just 3.6%). The selected ML algorithm was a stacked ensemble for three of the targets, while the Gradient Boosting Machine obtained the best results for the production waste prediction. For the optimization experiments, we analyzed a production order of 10,000 units of a product that requires three textile operations (cutting, tailoring and packaging) using one raw material. Using historical data, we then selected all the subcontractors that could execute these operations, along with the respective cost and production capacity, to create a subcontractor allocation case study to use in the experiments. In total, the case study includes 26 subcontractors (which corresponds to the number of integers searched by the NSGA-II algorithm): cutting (4 candidates), tailoring (8 candidates) and packaging (14 candidates), with 4+8+14=26. To compute the cost and time associated with each solution, we use four attributes from Table 1 (Subc cod, Capacity, Price and Oper desc) and also the predicted Prod days variable (see Table 2). We adopted some reasonable assumptions (defined by the INFOS company): one subcontractor cannot execute two or more tasks simultaneously, the subcontractor is always available and there is no shortage of raw materials.
The two objective functions that need to be minimized are the Total Cost (TC) and Total Production Time (TPT). The TC function is the sum, over all subcontractors, of the quantity assigned to a subcontractor multiplied by the price that the subcontractor charges for the respective operation. As for TPT, the function is the sum of the maximum days required by each sequential operation (cutting, tailoring and packaging). Since subcontractors can work simultaneously in the same operation (e.g., cutting), we consider the slowest subcontractor time (measured in number of days). Solutions that split the q_i quantities among different subcontractors for an operation will thus contribute to a lower TPT value. For each q_i, the lower bound is always zero and the upper bound was set to the quantity to be produced. When needed, a repair procedure is used to convert an infeasible solution into a feasible one, see Section 2.2.
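The repair procedure and the two objective functions can be sketched as follows. This is a simplified illustration and not the implementation integrated with pymoo: all names are hypothetical, a plan is assumed to be a list with one quantity list per sequential operation, and the number of days needed by a subcontractor is assumed here to be the assigned quantity divided by a daily capacity (rounded up), whereas the IDSS relies on the predicted Prod days.

```python
import math
import random


def repair_operation(quantities, q_max, rng=None):
    """Return a copy of one operation's allocation whose quantities sum to q_max."""
    rng = rng or random.Random()
    repaired = []
    remaining = q_max
    # Excess: serve subcontractors in order and ignore anything beyond q_max.
    for q in quantities:
        served = min(max(q, 0), remaining)
        repaired.append(served)
        remaining -= served
    # Deficit: hand the missing units to randomly chosen subcontractors.
    while remaining > 0:
        i = rng.randrange(len(repaired))
        extra = rng.randint(1, remaining)
        repaired[i] += extra
        remaining -= extra
    return repaired


def total_cost(plan, prices):
    """TC: sum of assigned quantity times unit price, over all operations."""
    return sum(q * p for qs, ps in zip(plan, prices) for q, p in zip(qs, ps))


def total_production_time(plan, daily_capacity):
    """TPT: sum over sequential operations of the slowest subcontractor's days.

    Assumption: days = ceil(quantity / daily capacity) for each subcontractor.
    """
    total = 0
    for qs, caps in zip(plan, daily_capacity):
        total += max(math.ceil(q / c) for q, c in zip(qs, caps))
    return total
```

With this representation, an allocation that concentrates a whole operation on one subcontractor typically has a higher TPT than one that splits the quantity, since only the slowest subcontractor of each operation counts towards the total time.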
The NSGA-II algorithm was configured with a check procedure that eliminates duplicates, making sure that the mating produces offspring that are different from themselves and from the existing population regarding their design space values. A grid search was used to set the NSGA-II hyperparameters (e.g., the population size was ranged within {50,100,150,...,500}), assuming the HV measure as the selection criterion and a reference point of (30 days, 20,000 EUR). The best obtained values correspond to a normalized HV (when each objective is divided by the respective reference point value) of 0.71, which requires 157 seconds of execution time on an Intel Xeon processor. The selected NSGA-II setup includes: a population size of 100, two-point crossover with a probability of 90%, polynomial mutation with a probability of 20% and a total of 200 generations.
The left of Figure 2 shows the Pareto front obtained after 200 generations when considering our case study. The Pareto front contains 100 solutions, with the TPT ranging from 12 to 30 working days and the TC ranging from 18,000 to 20,000 EUR. The right of Figure 2 shows the evolution of the NSGA-II algorithm, in terms of the full HV measure (y-axis), through the executed 200 generations. The graph shows the substantial improvement that is obtained by NSGA-II. In effect, in the first generation the HV measure is 4,700 (normalized value of 0.2). After 200 generations, the value increased to 16,391 (normalized value of 0.71), which corresponds to an improvement of 51 percentage points when considering the normalized HV scale. The results were shown to the INFOS company, which provided very positive feedback. In particular, the obtained TPT and TC ranges were considered realistic. Moreover, the company signaled that the obtained Pareto front provides a richer set of trade-off solutions, while also being faster to compute when compared with the currently adopted manual subcontractor allocation.

Conclusions
We propose an IDSS that creates a textile production plan to allocate subcontractors. The IDSS is based on the ABI concept that combines predictive (via ML) with prescriptive (via MO) analytics in order to provide actionable knowledge from raw data. The IDSS was designed to work with real-world data from a Portuguese software company (INFOS) that works with diverse textile clients. Firstly, an AutoML tool was adopted to automatically select the best ML model among five algorithms when targeting four relevant allocation decision context variables. Interesting results were achieved by the prediction models (NMAE values that range from 3.6% to 9.2%). Then, we designed an MO model that uses one of the predicted variables (production time) and historical data to automatically allocate subcontractors to execute sequential operations associated with a textile order. The MO model, based on the NSGA-II algorithm, assumes a Pareto approach and was designed to simultaneously minimize the cost and time to execute the order. To demonstrate the MO, we selected a case study that includes three operations and 26 potential textile subcontractors.
The obtained results were shown to the INFOS company, which considered them very positive. In future work, we intend to augment the IDSS by incorporating more problem-domain constraints, such as updated data about the current availability of subcontractors. Furthermore, we wish to deploy the designed IDSS into the INFOS ERP system, in order to get more valuable feedback from its usage in a real environment.