Inclusion of maintenance energy improves the intracellular flux predictions of CHO

Chinese hamster ovary (CHO) cells are the leading platform for the production of biopharmaceuticals with human-like glycosylation. The standard practice for cell line generation relies on trial and error approaches such as adaptive evolution and high-throughput screening, which typically take several months. Metabolic modeling could aid in designing better producer cell lines and thus shorten development times. The genome-scale metabolic model (GSMM) of CHO can accurately predict growth rates. However, in order to predict rational engineering strategies it also needs to accurately predict intracellular fluxes. In this work we evaluated the agreement between the fluxes predicted by parsimonious flux balance analysis (pFBA) using the CHO GSMM and a wide range of 13C metabolic flux data from literature. While glycolytic fluxes were predicted relatively well, the fluxes of tricarboxylic acid (TCA) cycle were vastly underestimated due to too low energy demand. Inclusion of computationally estimated maintenance energy significantly improved the overall accuracy of intracellular flux predictions. Maintenance energy was therefore determined experimentally by running continuous cultures at different growth rates and evaluating their respective energy consumption. The experimentally and computationally determined maintenance energy were in good agreement. Additionally, we compared alternative objective functions (minimization of uptake rates of seven nonessential metabolites) to the biomass objective. While the predictions of the uptake rates were quite inaccurate for most objectives, the predictions of the intracellular fluxes were comparable to the biomass objective function.


Introduction
To successfully use modeling for the rational design of new engineering strategies, 22 accurate predictions of cellular phenotypes are essential. Previously we showed that a 23 GSMM can accurately predict growth rates if supplied with accurate exchange rates of 24 (essential) amino acids (AAs) and correctly determined biomass composition [13][14][15]. 25 However, the model also needs to accurately predict intracellular fluxes. Previously it 26 was shown that biomass composition [16] as well as extracellular exchange rates [17] 27 have a big impact on the predicted intracellular fluxes. However, the validation of the 28 flux predictions by the CHO GSMM with experimental data has been done only in one 29 study so far [9]. 30 In this work, we compare fluxes predicted by parsimonious flux balance analysis 31 (pFBA), using the GSMM of CHO [8], against 20 13 C metabolic flux data sets across 32 producer and non-producer cell lines in different media and culture modes (batch, 33 fed-batch, semi-continuous) extracted from six publications [18][19][20][21][22][23]. We find that many 34 fluxes in central carbon metabolism can only be reliably estimated if non-growth 35 associated cellular maintenance is considered which was so far not included in the 36

37
Materials and methods 38 Data and code availability 39 All data and scripts for the simulations and visualisation are available in Mendeley Data 40 at https://data.mendeley.com/datasets/p973bk79ck/draft?a= 41 12eceafb-76be-4ab5-8a53-86788fda6210. 42 pFBA simulations 43 PFBA [24] was performed with the package COBRApy [25] using the solver Gurobi 44 9.1.0 [26] in python 3.7.9. The GSMM of CHO iCHO1766 [8] was used. Maximization 45 of biomass production was used as the objective function (reactions R biomass cho and 46 R biomass cho producing). Uptake and secretion rates of extracellular metabolites 47 and the recombinant product from 20 datasets (six publications) were used as 48 constraints [18][19][20][21][22][23] (see Table S1 for an overview). In several cases, the uptake rates of 49 tryptophan were not available. Assuming that it is the least abundant AA in the in the publications. If not available, the average mass of CHO (264 pg) was used [13]. In 53 some cases, the experimental data for the exchange rates was not provided, so the fitted 54 values from 13 C metabolic flux analysis (MFA) were used. If the data was provided only 55 as plots, it was extracted using WebPlotDigitizer (https://apps.automeris.io/wpd/). 56 Oxygen uptake was left unconstrained. 57 Mapping of 13 C models to i CHO1766 58 In order to compare predictions made by the GSMM of CHO to the results of 13 C MFA, 59 it was necessary to map the metabolites and reactions from all models used for 13 C 60 MFA (referred to as " 13 C models" in the further text) to the GSMM of CHO 61 iCHO1766 [8]. Metabolic flux data was extracted from six publications [18][19][20][21][22][23], see 62 Table S1 for an overview. 63 For most models, it was impossible to make a one to one mapping. Thus, the 64 following rules were applied.

65
• If one reaction in a 13 C model could be mapped to several reactions in i CHO1766, 66 the fluxes from these reactions were summed up or subtracted, depending on the 67 direction.

68
• In case of multiple equivalent reactions occurring in several compartments, their 69 individual contributions predicted by pFBA were summed up and only the total 70 was used for comparison. This approach disregards cellular compartments.

71
• In the 13 C models, several reactions are often lumped into one; therefore, the net 72 flux of the corresponding reactions in the GSMM was calculated and compared to 73 the flux of the lumped reaction. To estimate maintenance energy (mATP), pFBA was run for every individual dataset, 80 where growth rate was maximized and mATP hydrolysis (reaction "R DM atp ") was 81 constrained to a range of different values of mATP (0-40 mmol g −1 h −1 or until the 82 simulation was no longer feasible). For each value of mATP, the agreement between 83 experimental and predicted fluxes was evaluated and the value that lead to the lowest 84 median relative error was chosen as optimal. Reactions with the experimental fluxes less 85 than 1% of the maximum flux (|v i e / max v e | < 0.01) were omitted from this analysis, 86 because their absolute fluxes were very small (often close to zero) and consequently the 87 relative errors were very high (even though the absolute differences in fluxes were very 88 small). This often distorted the analysis and no clear minimum was observed.

89
Additionally, the analysis was performed with all datasets at the same time -the mATP 90 value was varied and the overall agreement between the predicted and the experimental 91 fluxes for all datasets was evaluated. The calculated median errors were divided by the 92 number of datasets for which a feasible solution was obtained (because the higher the 93 mATP value, the lower the amount of datasets with a feasible pFBA solution).

94
Uptake objective function 95 PFBA simulations were done as in [29]. First, growth rate and productivity were fixed 96 to the experimental values and the uptake of each essential AA was minimized one by 97 December 21, 2020 3/23 one to get an estimate for the minimum uptake rate that sustains the experimental 98 growth rate (nine AAs are essential in the iCHO1766 model). If the measured 99 experimental uptake rate was lower than the minimum required uptake by the model, we 100 fixed it to the computed uptake. Otherwise the experimental uptake rate was used. In a 101 few cases, the uptake rates of tyrosine and cysteine had to be adjusted in the same way 102 as the essential AAs, because the experimental uptake rates were insufficient to sustain 103 growth and no solution was obtained (three datasets for tyrosine and three for cysteine). 104 In the next step, we constrained the nonessential uptake rates, secretion rates, 105 growth rate and productivity to the experimental values (except for the nonessential 106 uptake rate that was set as the objective, which was left unconstrained) and the 107 essential uptake rates to the predicted or experimental values (see above). The flux  (1), where v i p are the fluxes predicted by pFBA and v i e are the experimentally determined  positioned at the height corresponding to 270 mL and the flow rate was set to a higher 145 value than flow-in to prevent overflow. The feed medium was kept at room temperature 146 and protected from light with aluminum foil. Due to the instability of glutamine, the 147 medium was exchanged every 5-7 days. During this time frame, the glutamine 148 degradation was shown to be negligible (see Fig S6).

149
After changing the dilution rate, the cultures were left to equilibrate for at least five 150 volume exchanges (except for one dilution rate (DR3, see Table 1) which was 151 interrupted because of contamination). To verify whether the cultures reached steady 152 state, linear fits were performed with viable cell density and Bioprofile data (glucose, 153 lactate and ammonium concentrations) using R function lm. If the 95% confidence 154 interval of the slope contained zero, the parameter was considered stable. For seven out 155 of eleven dilution rates, all parameters were stable, including a dilution rate where only 156 2.5 volume exchanges were reached due to contamination (DR3). For three dilution 157 rates 3 /4 parameters were stable, but the concentration changes of the unstable 158 parameters were within the measurement error of the measurement device. Overall, all 159 dilution rates were deemed stable enough and were used for further analysis (see Table 1 160 for an overview). Table 1. The dilution rates, calculated growth rates (Eq (3)) and the steady state concentrations of cells and metabolites.

161
. For the remaining dilution rates, AAs were analyzed using a high-performance liquid 178 chromatography method. Briefly, samples were diluted, internal standards 179 3-(2-thienyl)-DL-alanine (Fluka) and sarcosine (Sigma) were added and subsequently 180 filtered using a 0.2 µm filter unit (Sartorius). In an automated pre-column 181 derivatization method, free primary AAs reacted with ortho-phthalaldehyde (OPA, 182 Agilent) and proline and hydroxyproline with 9-fluorenyl-methyl chloroformate (FMOC, 183 Fluka) and were then separated on a ZORBAX™ Eclipse Plus C18 column (Agilent) at 184 40 • C using a flow rate of 0.64 mL/min. After gradient elution with 10 mM For all calculations, the average of all data points in each steady state was used.

198
Dilution rate D was calculated with Eq (2), where F is the flow rate into the bioreactors (calculated from the change of mass of the 200 fresh culture medium over time) and V = 270 mL is the volume of the medium in the 201 bioreactors. The growth rate µ at steady state was calculated with Eq (3), where N v /N t denotes fraction of viable cells. Steady state exchange rates q i of 203 extracellular metabolites were calculated with Eq (4), where C i in and C i out are concentrations of metabolites in the incoming medium and in 205 the bioreactor, respectively. To calculate standard deviation (SD) for an exchange rate, 206 the SDs were calculated for each variable from all available data points in steady state. 207 Then, the SD (σ) of the rate was calculated according to the mathematical rules of 208 manipulation with standard deviations -Eq (5) if values were multiplied or divided (e.g. 209 C = AB), and Eq (6) if they were summed or subtracted (e.g C = A+B) December 21, 2020 6/23 Determination of maintenance energy 212 The GSMM with cell line specific biomass composition from [13] was used 213 (iCHO K1par-8mMCD, BioModels ID: MODEL1907260016). The experimentally 214 determined growth rate and the exchange rates of glucose, lactate, ammonium and AAs 215 were used as constraints for flux balance analysis (FBA). The lower and upper bounds 216 of the exchange reactions and the biomass reaction ("R biomass specific") were fixed 217 to the experimental values ± SD. Due to high experimental noise, the uptake rates of 218 some essential AAs were too low to sustain the experimental growth rate. Therefore the 219 minimal uptake requirements were estimated as in [29] and, if necessary, the constraints 220 were adjusted. This had no impact on the calculations of the ATP consumption as these 221 AAs are solely used for biomass generation and not for ATP generation. In three cases, 222 the lower bounds of secretion rates were relaxed (DR1 -alanine secretion by 25%, DR5 223 glycine secretion by 40%, DR6 aspartate by 25%), otherwise the solutions would have underestimated, and overestimated in the remaining eleven. In one extreme case growth 238 was overestimated by 190%. Overall the median relative error was close to 60%. pFBA 239 performs even worse when compared to measured intracellular fluxes, hitting an overall 240 median relative error of 83.8% (Fig 1b). On average, fluxes in glycolysis and AA are 241 predicted better than fluxes in pentose phosphate pathway (PPP), pyruvate 242 metabolisms, and tricarboxylic acid cycle (TCA). More specifically, fluxes in PPP and 243 TCA were vastly underestimated (median error 86.9% and 94.5%, see Table 2). 244 We checked whether flux predictions can be improved when experimental growth 245 rates were used as additional constraints. Yet, no improvement was observed for those 246 datasets that returned a feasible solution (median relative error 91.5% vs. 83.8% 247 previously).

267
Enforcing a minimal mATP strongly decreases the prediction errors in the 268 intracellular fluxes for all but one dataset, see Fig 2b. More specifically, the overall 269 median relative error decreased from 83.8% to 24.6% (Fig 3b)  Not only intracellular fluxes, but also growth rates were better predicted (Fig 3a).

273
Now ten rather than previously only six out of 20 growth rates could be predicted with 274 an error of less than ±25%. Five were even predicted within an error band of ±10%.  Recently, minimizing uptake of non-essential nutrients (rather than maximizing growth) 283 was suggested to be a more suitable modeling objective for CHO [29]. Thus, we 284 repeated all previous simulations with uptake objective function (UOF), using the 285 exchanges of glucose, glutamine, serine, tyrosine, asparagine, aspartate and arginine as 286 objectives. Maintenance energy was estimated as before and similar mean mATP values 287 were obtained (5.6-6.4 and 5.8-6.7 for the biomass equations R biomass cho and 288 R biomass cho producing, respectively).

289
The prediction accuracy of the intracellular fluxes after the addition of the mATP

293
The predictions of the minimum uptake rates were best for glucose with R 2 = 0.92 294 and a median relative error of 4.1% (Fig 4a), followed by glutamine (R 2 = 0.75, median 295 error 50.4%) and asparagine (R 2 = 0.12, median error 24.2%). However, the uptake 296 rates of the remaining AAs were not predicted well (R 2 = 0.06 or less; median errors 297 within a range of 65.8-176%; Fig S4).  (Table 1). Cell viability was 302 above 95% for all steady states. The steady state viable cell densities were between 5.3 303 and 6.4 × 10 6 viable cells/mL for eight dilution rates; for the remaining three they 304 reached between 9.7-11.4 × 10 6 viable cells/mL (Fig S7). For each dilution rate, 305 extracellular exchange rates of glucose, AAs, lactate and ammonium were determined. 306 Uptake of glucose, and glutamine as well as secretion of lactate and ammonium 307 increased with increasing growth rate, see Fig 5. However, in case of waste product 308 secretion rates, the three dilution rates that had higher steady state cell concentrations 309 seem to separate from the remaining ones (indicated as magenta triangles in Fig 5).

310
Exchange rates of glucose, lactate, ammonium, all AA, and the growth rate were 311 used as constraints for FBA and the hydrolysis of ATP was maximized. The ATP 312 consumption was plotted against growth rate and a linear model was fitted (Fig 6). The 313 intercept represents the non-growth associated ATP consumption -the estimated mATP 314 and its standard error was determined to be 4 ± 1.6 mmol g −1 h −1 , which compares well 315 with the average mATP of 5.9/6.4 mmol g −1 h −1 determined computationally above.

317
An accurate determination of intracellular fluxes is key for understanding cellular 318 metabolism and applying methods that predict engineering strategies. Intracellular 319 fluxes can be experimentally determined with 13 C metabolic flux analysis [33]. However, 320 this method is very expensive due to the usage of labelled substrates and prone to recombinant proteins, it currently lacks a value for non-growth associated maintenance 336 energy -the energy needed for processes such as turnover and repair of macromolecules 337 or maintenance of concentration gradients (e.g. Na + /K + and Ca 2+ ATPases) [34]. As 338 no such value was available for CHO until now, we determined mATP computationally 339 and experimentally.

340
The variability of the computationally estimated mATP across cell lines and 341 conditions was quite high (relative SD 49% and 64% for R biomass cho and 342 R biomass cho producing, respectively). This might be the result of the experimental 343 errors of the metabolite exchange rates. As seen in Fig 1a, the growth rate predictions 344 had a high error, which we have shown previously to be sensitive to errors in the 345 exchange rates [13,14]. Another factor is the error of the 13 C flux measurements which 346 often had considerably big confidence intervals. However, the differences might also stem 347 from differences in the cell lines, cultivation conditions or productivities. Nevertheless, 348 the average mATP values were very similar when estimated with different biomass 349 equations (Fig 2a) and lead to a major improvement in the predicted intracellular  The shaded areas represent 95% confidence intervals. The triangle points in magenta color are the dilution rates that had unusually high cell concentration in steady state (see Fig S7).
of a mATP constraint, which points at alternative NADPH sources connected to the 352 TCA, e.g. malic enzyme. Indeed we found higher activity of malic enzyme in some 353 datasets, but not consistently in all. This points to a possible lack of actual NADPH 354 demand in the model -e.g. for protein folding or degradation of misfolded proteins [35]. 355 We also investigated the effect of alternative objective functions (nonessential uptake 356 rates) suggested by Chen et al. [29]. The estimated mATP values and the predictions of 357 intracellular fluxes were comparable to the predictions done with the "traditional" BOF 358 for all tested objectives. However, the predictions of the minimum uptake rates worked 359 much better for glucose uptake rate compared to the AA uptake rates.

360
The choice of the appropriate objective function might depend on the availability of 361 experimental data. In case of using BOF, highly accurate uptake and secretion rates are 362 needed in order to obtain accurate predictions, especially for essential AAs [13,14]. If 363 these are not available, using the UOF (glucose) might be a better choice than the use 364 of BOF with imprecise AA uptake rates as constraints.

365
The experimentally determined mATP was comparable to the computational 366 Fig 6. The rate of ATP consumption at different growth rates. The black line is a linear fit and the intercept represents the ATP consumption at zero growth rate. The magenta triangles are dilution rates that had unusually high cell concentration in steady state (see Fig S7). The shaded area represents 95% confidence interval. estimate, although the uncertainty was quite high due to the technical difficulty of 367 running continuous fermentation and the unstable nature of CHO cells. Long 368 cultivations lead to cell clumping, which complicated cell counting. It is also known that 369 CHO cells are unstable during long term cultivations [36,37]. Furthermore, the 370 physiological state of a culture during steady state might differ depending on how it was 371 reached and different properties (e.g. cell and metabolite concentration) can be 372 observed even if the same dilution rate and cultivation conditions are used [38][39][40][41][42][43][44][45]. Such 373 multiplicity of steady states is likely a consequence of toxic waste product accumulation. 374 Lower waste product secretion and higher cell densities indicate a metabolic switch to 375 an energetically more efficient metabolism. This phenomenon could explain the different 376 cell densities and exchange rates observed for three dilution rates (Fig S7 and Fig 5). 377 However, not enough data was available to investigate this phenomenon in more detail. 378 In literature there is only a small amount of data for mammalian maintenance 379 energy and no data for CHO. Mouse cells require a maintenance energy of 1.7 × 10 −11 380 mmol cell −1 d −1 (65% of the total energy produced at the highest growth rate) [34], 381 which corresponds to 1.1 mmol g −1 h −1 with a mouse cell dry mass (660 pg/cell) or 382 2.4-3.6 with a range of CHO dry masses [13]. However, the analysis in [34] was quite 383 simplified. Even though they cultivated the cells with a hydrolysate, they only 384 considered glucose as the energy source and calculated the generated ATP from the 385 secretion rates of lactate and CO 2 . However, mammalian cells in culture will also use 386 glutamine and other AAs as energy source [46].

387
Depending on the experimental/computational methods used, maintenance energy of 388 cancer cells was estimated to be within 1.6 and 3.7 mmol g −1 h −1 [47,48]. The values 389 were converted from the original publication with CHO specific volume and dry mass 390 values [13].

391
In other organisms, estimated/measured values widely vary and often depend on the 392 cultivation conditions. In bacteria, the reported values range between 393 3.15-18.5 [30,[49][50][51], in yeast between 0.44-2.81 mmol g −1 h −1 [32,[52][53][54]. However, not 394 only does mATP change across organisms and conditions but also during the batch as it 395 is influenced by stress responses [31]. 396 Few other studies also tried to improve the predictions by iCHO1766. For example, 397 Lularevic et al. [55] reduced variability in flux variability analysis by adding carbon 398 availability constraints. In another study, the predictions of intracellular fluxes were 399 improved by adding constraints based on enzyme kinetic information [9]. This also lead 400 to a correct prediction of the overflow metabolism (the secretion of lactate). Together 401 these studies, including the current one, show that adding more constraints to the 402 models is necessary to fully capture cellular metabolism and leads to better predictions. 403 Further developments and a combination of different approaches might lead to further 404 improvement.  Fig S2 . An example of the computational estimation of mATP. mATP was gradually increased and the agreement between experimental and predicted fluxes was evaluated at each step. The mATP value that lead to the smallest median relative error of the fluxes was chosen as the optimal value. Data is shown for the dataset SV-M3 from Templeton 2017 [22] for biomass equation R biomass cho.