

ORIGINAL ARTICLE 

Year : 2021  Volume : 10  Issue : 1  Page : 327-332


Is data mining approach a best fit formula for estimation of low-density lipoprotein cholesterol?
Rajlaxmi Sarangi^{1}, Jyotirmayee Bahinipati^{1}, Mona Pathak^{2}, Srikrushna Mahapatra^{1}
^{1} Department of Biochemistry, Kalinga Institute of Medical Sciences (KIMS), Bhubaneswar, Odisha, India
^{2} Department of Biostatistics, Kalinga Institute of Medical Sciences (KIMS), Bhubaneswar, Odisha, India
Date of Submission  25-Aug-2020
Date of Decision  27-Oct-2020
Date of Acceptance  01-Dec-2020
Date of Web Publication  30-Jan-2021
Correspondence Address: Dr. Jyotirmayee Bahinipati, Associate Professor, Department of Biochemistry, Kalinga Institute of Medical Sciences (KIMS), Bhubaneswar - 751 024, Odisha, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jfmpc.jfmpc_1734_20
Background: With the change in the National Cholesterol Education Program ATP III guidelines, assessment of the risk of developing atherosclerosis now focuses on total cholesterol and low-density lipoprotein (LDL) cholesterol levels, and different treatment modalities are targeted at lowering LDL cholesterol values. Hence, greater emphasis is now laid on the accurate and precise measurement of LDL cholesterol. Beta-quantification, though the reference method for LDL cholesterol estimation, is inconvenient in routine practice. The new-generation direct homogeneous assay is now the method of choice, but because it is more expensive, various calculated methods have been developed. This study attempts to compare different calculated formulas with direct LDL cholesterol measurement and to find the best one.
Materials and Methods: We compared LDL cholesterol measured by direct homogeneous assay with LDL cholesterol calculated by the data mining (DM) approach and by two other formulas [Friedewald's formula (FF) and the Anandaraja formula (AF)] in 266 samples from subjects older than 18 years. Enrolled participants were divided into seven groups based on their TG levels. The mean difference, percentage difference, and correlation coefficient were assessed between calculated and direct LDL. Bland–Altman analysis was done to assess the agreement between calculated and direct LDL. All formulas were compared with direct LDL across the TG groups by the Wilcoxon signed-rank test.
Results: The correlation between calculated and direct LDL was significant at the 1% level for TG <600 mg/dl. The mean difference and the percentage difference between direct and calculated LDL were lowest with the DM approach. The Bland–Altman plot showed the best agreement with direct LDL for the DM approach.
Conclusion: This study indicates that LDL calculated by the DM approach is closer to direct LDL than LDL calculated by FF or AF.
Keywords: Anandaraja formula, cardiovascular disease, data mining approach, Friedewald's formula, LDL-C
How to cite this article: Sarangi R, Bahinipati J, Pathak M, Mahapatra S. Is data mining approach a best fit formula for estimation of low-density lipoprotein cholesterol? J Family Med Prim Care 2021;10:327-32
How to cite this URL: Sarangi R, Bahinipati J, Pathak M, Mahapatra S. Is data mining approach a best fit formula for estimation of low-density lipoprotein cholesterol? J Family Med Prim Care [serial online] 2021 [cited 2021 Feb 25];10:327-32. Available from: https://www.jfmpc.com/text.asp?2021/10/1/327/307955
Introduction   
LDL cholesterol (LDL-C) is one of the key predictors of, and a well-established, modifiable risk factor for, cardiovascular disease.^{[1]} There is a strong positive association between increased LDL-C and atherosclerosis. LDL-C levels are used in clinical decision-making guidelines to reduce cardiovascular risk. It has been observed that a 1% reduction in LDL-C can reduce the risk of coronary artery disease (CAD) by about 1%.^{[2]} LDL-C therefore remains of utmost clinical importance: it is considered the treatment target and is emphasized in worldwide guidelines as the primary cholesterol target.^{[3]} Accurate and precise measurements are thus necessary to appropriately identify and monitor hypercholesterolemia.
Beta-quantification is the standard method for the estimation of LDL-C. It involves ultracentrifugation and chemical precipitation. As this method is costly, labor-intensive, and delays the turnaround time, it cannot be employed in routine clinical practice.^{[4],[5]} Automated methods are available for direct LDL (D-LDL) estimation; they have the advantages of being precise, being usable on non-fasting samples, and being less affected by triglyceride (TG) interference. Still, the direct method is not perfect, because the composition of lipoproteins influences the ability of a direct method to specifically measure the cholesterol content of one lipoprotein class in the presence of other lipoproteins.^{[6]} Some direct methods overestimate LDL cholesterol and others underestimate it. Besides, the direct method is expensive and requires significant time for analysis.^{[7],[8],[9]}
LDL cholesterol calculated using Friedewald's formula (FF) correlates well with LDL cholesterol measured by beta-quantification. However, this formula cannot be used when a subject is not fasting, because it does not account for the cholesterol formed postprandially in chylomicrons, intermediate-density lipoprotein, or lipoprotein(a). It also cannot be used when serum TG is >400 mg/dl or <100 mg/dl, or in patients with Type III or Type I hyperlipoproteinemia. FF assumes a fixed factor of five for the ratio of TG to VLDL cholesterol, but this ratio seems to vary significantly across the range of TG and cholesterol levels. The formula is also not recommended in patients with type 2 diabetes mellitus, nephrotic syndrome, or chronic alcoholism.^{[10],[11]} The Anandaraja formula (AF) uses only two analytes, TG and TC, for the calculation, which may decrease the total error compared with FF. It is also more economical, as it does not require HDL cholesterol results. Compared with FF, however, AF tends to give a higher percentage error and is less reliable in patients with low HDL and total cholesterol.^{[12],[13]} AF overestimates LDL cholesterol at TG ≤200 mg/dl and underestimates it at TG 201–400 mg/dl.^{[12]} Hence, accurate estimation of LDL cholesterol is still a challenge. A method for LDL-C calculation developed by the National Institutes of Health (NIH) has some advantages over traditional LDL-C calculation using FF.^{[14]}
The new interdisciplinary field of data mining (DM) makes it possible to deal with much larger and higher-dimensional data. The DM approach is also the cheapest and most efficient way to get an accurate report. A formula derived by the DM approach can therefore be validated for estimating LDL cholesterol if it correlates strongly with direct LDL cholesterol measurement. With this in view, the present study aimed to compare different calculated formulas with direct LDL values and to find the best-fit formula.
Materials and Methods   
The current cross-sectional study was conducted after obtaining clearance from the Institutional Ethics Committee (IEC no: KIIT/KIMS/IEC/285). Considering LDL-C by Friedewald's formula (mean ± SD = 72.16 ± 78.09 mg/dl) and by the newly developed DM approach (mean ± SD = 83.29 ± 77.68 mg/dl), with 80% power at the 5% level of significance, the calculated sample size was 266. Data were collected from lipid profile reports after analysis of serum samples received from patients who came for lipid profile investigation to the central lab (Biochemistry section) of our tertiary care hospital. The study was conducted over a period of 6 months (Oct 2019–March 2020). Informed consent was obtained from the participants, and the less-than-minimal risk involved was explained to them. All subjects above 18 years of age who came for lipid profile investigations were included in the study group. Pregnant women, patients with liver failure or end-stage renal disease, and those using lipid-lowering drugs were excluded. After overnight fasting, 5 ml of blood was collected in a red-topped vacutainer. The collected blood sample was centrifuged at 3,000 rpm for 10 min and the serum was separated. After the daily quality control had been run and checked as per the standard laboratory protocol, the serum sample was run on a fully automated analyzer (OCD 5600 Modular system) using commercially available kits as per the manufacturer's protocol.
Total cholesterol was measured by the cholesterol oxidase-peroxidase method; TG by the enzymatic colorimetric method with glycerol blank; HDL by the colorimetric, non-HDL precipitation method; and direct LDL (D-LDL) cholesterol by endpoint assay. The automated method used for quantitative estimation of direct LDL cholesterol is a two-step reaction. In the first step, non-LDL cholesterol (such as HDL, VLDL, and chylomicrons) is selectively eliminated by reaction with cholesterol esterase and cholesterol oxidase to form cholestenone and hydrogen peroxide. The peroxide generated is immediately scavenged by catalase. In the second step, specific measurement of LDL cholesterol occurs.
Friedewald's formula (FF) (FF C-LDL = Total cholesterol – HDL cholesterol – TG/5) was used for calculating LDL-C, along with the Anandaraja formula (AF) (AF C-LDL = 0.9 TC – 0.95 TG/5 – 28) and a new formula, that is, DM analysis (DM C-LDL = 0.99 TC – 0.98 HDL-C – 0.19 TG + 7.14).
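The three formulas can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the function names and the sample values are hypothetical, and all inputs are assumed to be in mg/dl. The TG coefficient in the Anandaraja function is taken as printed in this article (0.95); the commonly cited form of that formula uses 0.9, so it is exposed as a parameter.

```python
# Calculated LDL cholesterol (mg/dl) by the three formulas compared in this study.
# Names and example values are illustrative assumptions.

AF_TG_COEFF = 0.95  # as printed in this article; the commonly cited form uses 0.9


def ldl_friedewald(tc, hdl, tg):
    """Friedewald's formula: TC - HDL-C - TG/5."""
    return tc - hdl - tg / 5.0


def ldl_anandaraja(tc, tg, tg_coeff=AF_TG_COEFF):
    """Anandaraja formula: 0.9*TC - tg_coeff*TG/5 - 28 (no HDL-C needed)."""
    return 0.9 * tc - tg_coeff * tg / 5.0 - 28.0


def ldl_data_mining(tc, hdl, tg):
    """Data mining formula: 0.99*TC - 0.98*HDL-C - 0.19*TG + 7.14."""
    return 0.99 * tc - 0.98 * hdl - 0.19 * tg + 7.14


# Hypothetical lipid profile (mg/dl), not a study subject
tc, hdl, tg = 200.0, 40.0, 150.0
print("FF:", round(ldl_friedewald(tc, hdl, tg), 2))
print("AF:", round(ldl_anandaraja(tc, tg), 2))
print("DM:", round(ldl_data_mining(tc, hdl, tg), 2))
```

Keeping each formula as a separate function makes it straightforward to apply all three to the same lipid profile and compare the results with the direct measurement.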
Statistical analysis
Results were reported using mean and standard deviation, as well as median (interquartile range), for quantitative variables. Pearson's correlation coefficient was used to assess the linear relationship between LDL concentration calculated by the various formulas and direct LDL. Bland–Altman analysis was performed to assess the agreement between LDL calculated by the various formulas and direct LDL.^{[15]} The study subjects were divided into 7 groups based on serum TG level (Group I: TG <200 mg/dl, Gr II: TG 200–300 mg/dl, Gr III: TG 300–400 mg/dl, Gr IV: TG 400–500 mg/dl, Gr V: TG 500–600 mg/dl, Gr VI: TG 600–1000 mg/dl, Gr VII: TG >1000 mg/dl) [Figure 1]. The three formulas, that is, FF, AF, and DM analysis, were used for calculating LDL cholesterol, and the results were compared with direct LDL within the different TG groups. The performance of all three formulas was assessed across the TG groups by comparing calculated LDL with direct LDL using the Wilcoxon signed-rank test. Percentage difference was calculated as (Calculated LDL – D-LDL)/D-LDL × 100. All P values were considered significant at the 5% level of significance. Stata 15.1 (StataCorp, Texas, USA) was used for analysis.
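The TG grouping and the percentage difference described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and because the printed group ranges share their boundary values (for example, 200 mg/dl appears in both Group I's upper limit and Group II's lower limit), the sketch assumes that a boundary value belongs to the higher group, except 1000 mg/dl, which stays in Group VI.

```python
from bisect import bisect_right

# Cut-offs (mg/dl) separating Groups I to VI; boundary handling is an assumption.
TG_CUTS = [200, 300, 400, 500, 600]
TG_LABELS = ["I", "II", "III", "IV", "V", "VI", "VII"]


def tg_group(tg_mg_dl):
    """Return the TG group label (I to VII) for a serum TG value in mg/dl."""
    if tg_mg_dl > 1000:
        return "VII"
    # bisect_right counts how many cut-offs are <= the TG value
    return TG_LABELS[bisect_right(TG_CUTS, tg_mg_dl)]


def percentage_difference(calculated_ldl, direct_ldl):
    """(Calculated LDL - D-LDL) / D-LDL x 100."""
    if direct_ldl == 0:
        raise ValueError("direct LDL must be non-zero")
    return (calculated_ldl - direct_ldl) / direct_ldl * 100.0


# Hypothetical values, not study data
print(tg_group(250))                       # Group II
print(percentage_difference(90.0, 100.0))  # calculated LDL 10% below direct LDL
```

A negative percentage difference means the formula underestimates LDL relative to the direct assay.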
Results   
The present study included lipid profile data of 266 subjects. Among these, 193 were males (72.56%) and 73 were females (27.44%), with a mean age of 50.15 ± 14.84 years. There were 108, 59, 23, 32, 22, 12, and 10 subjects in groups I, II, III, IV, V, VI, and VII, respectively. [Table 1] shows the average values (mean ± SD) of the lipid profile parameters: TC (192.38 ± 72.25 mg/dl), TG (337.15 ± 492.68 mg/dl), HDL (40.47 ± 12.98 mg/dl), VLDL (65.43 ± 98.52 mg/dl), TC/HDL ratio (5.13 ± 2.48), and direct LDL (113.41 ± 56.07 mg/dl). The mean ± SD of LDL cholesterol calculated by FF, AF, and DM was 92.26 ± 57.09, 91.72 ± 55.75, and 102.7 ± 58.28 mg/dl, respectively.
Table 1: Average values of lipid profile and calculated LDL cholesterol levels
[Table 2] shows the correlation of direct LDL with calculated LDL obtained by the different formulas. A strong correlation was found between calculated LDL and direct LDL, significant at the 1% level up to a TG level of 600 mg/dl. When TG exceeded 600 mg/dl, the significance level dropped to 5%, and no significant correlation was found at TG >1000 mg/dl. The mean difference and percentage difference (PD) between direct LDL and calculated LDL were lowest for the DM formula [Table 3], whereas FF and AF showed almost similar PD and mean difference between calculated and direct LDL. The scatter plots show a strong correlation between direct LDL and LDL calculated by each formula [Figure 2], [Figure 3], [Figure 4]. Bland–Altman plots were prepared [Figure 5], [Figure 6], [Figure 7] to assess the agreement between direct and calculated LDL; no bias was observed between direct LDL and calculated LDL when TG was <400 mg/dl. Among the three formulas, agreement with direct LDL was greatest for the DM approach.
Table 3: Mean difference and percentage difference between direct LDL and calculated LDL obtained using different formulas
Figure 3: Scatter plot representing the correlation between direct LDL and Anandaraja LDL
Figure 4: Scatter plot representing the correlation between LDL calculated by the data mining approach and the direct method
Figure 5: Bland–Altman plot for direct LDL and LDL calculated by the Friedewald formula showing 95% agreement
Figure 6: Bland–Altman plot for direct LDL and LDL calculated by the Anandaraja formula showing no bias
Figure 7: Bland–Altman plot for LDL cholesterol estimated directly and by data mining analysis
Discussion   
According to the NCEP recommendations, emphasis is placed on the accuracy and analytical performance of LDL cholesterol measurement: the total analytical error should not exceed ±12% (<4% imprecision and ≤4% inaccuracy).^{[16]} Various formulas that may be comparable to D-LDL-C measurement are still under verification, but satisfactory results have not yet been achieved. Most developing countries do not opt for direct LDL estimation because of its cost.
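One common way to combine inaccuracy (bias) and imprecision (coefficient of variation, CV) into a total error is |bias| + 1.96 × CV. That this model underlies the 12% limit quoted above is an assumption here, since the text does not state it; the function name below is hypothetical. Under that assumption, a method at the 4% limits for both components stays just inside 12%:

```python
def total_analytical_error(bias_percent, cv_percent, z=1.96):
    """Total error (%) under the assumed model |bias| + z * CV."""
    return abs(bias_percent) + z * cv_percent


# 4% inaccuracy and 4% imprecision, the limits quoted in the text
print(round(total_analytical_error(4.0, 4.0), 2))
```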
This study was undertaken to compare different methods for calculating LDL cholesterol with direct LDL cholesterol measurement. On correlating direct LDL with calculated LDL obtained by the different formulas, we found significance at the 1% level for TG <600 mg/dl (FF r = 0.85–0.96, AF r = 0.85–0.93, DM r = 0.85–0.97). The significance level decreased to 5% with TG 600–1,000 mg/dl, and no significance was found with TG >1,000 mg/dl. Krishnaveni P et al. also found a good correlation between calculated (FF r = 0.93, AF r = 0.91) and directly measured LDL cholesterol.^{[11]} Other studies found similar correlations, ranging between 0.78 and 0.93.^{[10],[17]} In our study, for TG ≤600 mg/dl, the correlation was highest for the DM approach (r = 0.91), followed by FF (r = 0.90) and AF (r = 0.73). Similarly, Dansethakul P et al. found a correlation of r = 0.977 between their DM approach and D-LDL measurement.^{[18]}
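The correlation coefficients discussed here are Pearson's r between direct and calculated LDL. A minimal sketch of that computation is given below; the function name and the paired values are hypothetical and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired measurements (mg/dl)
direct_ldl = [100.0, 120.0, 140.0, 160.0]
calculated_ldl = [95.0, 118.0, 131.0, 158.0]
print(round(pearson_r(direct_ldl, calculated_ldl), 3))
```

A high r shows only that the two methods move together; it does not show that they agree in absolute value, which is why the study also uses Bland–Altman analysis.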
Kanani DN et al. found a correlation of r = 0.93 between FF and direct LDL and r = 0.92 between AF and direct LDL. This high correlation of FF and AF with direct LDL may be because they excluded samples with TG ≥400 mg/dl.^{[19]}
When we compared the mean difference (MD) and percentage difference (PD) between direct LDL and calculated LDL using the different formulas, we observed the smallest difference between the DM approach and direct LDL (PD = 14.20 ± 33.57%, MD = 10.7 ± 27.19 mg/dl). The mean percentage differences of FF and AF from direct LDL were almost equal. In the study by Sridevi et al., however, the mean percentage difference of FF was much lower than that of AF (1.93% vs 4.12%).^{[20]} The mean and percentage differences of calculated LDL in our study are higher than in other studies because those studies excluded TG ≥400 mg/dl, whereas we compared across various ranges of TG up to 1,000 mg/dl.^{[19]} When mean LDL was compared among the TG groups, DM gave values closer to direct LDL cholesterol for TG <600 mg/dl. None of the calculated values was comparable to direct LDL cholesterol with TG >1000 mg/dl.
Previous studies have reported significant underestimation of LDL by FF at higher TG levels, and this underestimation was more prevalent at LDL <70 mg/dl.^{[21],[22]} Similar to the study by Sudha K et al., our study also shows that LDL values calculated by FF and AF were lower than direct LDL.^{[23]} Kapoor et al. observed a 10.39% decrease in LDL cholesterol by FF compared with direct LDL estimation.^{[17]} Similar underestimation was also found in the studies by Martin et al. and Kannan et al.^{[10],[21]} Nanda et al. found no significant difference between direct LDL and FF LDL at TG levels <200, 200–300, and 301–400 mg/dl.^{[24]}
To evaluate whether the DM approach was better than the other calculated values, Bland–Altman (BA) plots of the difference between two methods against their mean were drawn. No bias was observed between direct LDL and calculated LDL when TG was <400 mg/dl. The maximum agreement was seen between LDL calculated by the DM approach and direct LDL. Other studies have shown a negative bias between direct LDL and calculated LDL, with the smallest negative bias for FF.^{[11],[25],[26]} As evidenced in the BA plots by Palmer MK et al., the difference between FF and D-LDL cholesterol increases as the TG value increases.^{[27]} The DM approach had a smaller deviation from the direct LDL cholesterol value. Hence, it is more reliable than the other calculated formulas (FF and AF), but it is best applied up to TG ≤400 mg/dl. Above this value, direct LDL cholesterol measurement is best. A previous study has advocated that direct estimation should be the method of choice for LDL cholesterol estimation, especially in a critical clinical setting.^{[28]}
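The Bland–Altman quantities behind such plots can be sketched as follows: the bias is the mean of the paired differences, and the 95% limits of agreement are conventionally taken as bias ± 1.96 × SD of the differences. The function name and the paired values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(calculated, direct):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(calculated) != len(direct) or len(calculated) < 2:
        raise ValueError("need two paired sequences of length >= 2")
    # Paired differences: calculated minus direct
    diffs = [c - d for c, d in zip(calculated, direct)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired LDL values (mg/dl)
calculated_ldl = [98.0, 102.0, 95.0, 105.0]
direct_ldl = [100.0, 100.0, 100.0, 100.0]
bias, lower, upper = bland_altman(calculated_ldl, direct_ldl)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

In the plot itself, each difference is drawn against the mean of the two methods for that sample, with horizontal lines at the bias and at the two limits.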
Limitations
Our study has certain limitations. We did not include all the calculated methods of LDL estimation for comparison with direct LDL measurement. Beta-quantification, which is the gold standard for LDL measurement, was also not used in our study, as it is expensive and inconvenient for daily measurement. We did not take any history of comorbid conditions, such as diabetes mellitus, nephropathy, or hepatopathy, from our participants.
As direct LDL-C measurement is analytically complex and more expensive, most laboratories use FF for estimation of LDL-C. But FF has limitations in clinical decision making. Patient classification into the correct diagnostic/prognostic categories for CVD risk management, with less misclassification, is more effective via the data mining approach, particularly when TG >400 mg/dl. Further, the DM approach can also be useful in primary care facilities where direct LDL-C measurements are not available and FF has its limitations.
Acknowledgements
We wish to thank the technicians of the Central Laboratory, Biochemistry section, of our hospital for their immense support in the successful completion of the study.
Declaration of patient consent
The authors certify that they have obtained all appropriate patient consent forms. In the form the patient(s) has/have given his/her/their consent for his/her/their images and other clinical information to be reported in the journal. The patients understand that their names and initials will not be published and due efforts will be made to conceal their identity, but anonymity cannot be guaranteed.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Helkin A, Stein JJ, Lin S, Siddiqui S, Maier KG, Gahtan V. Dyslipidemia part 1-review of lipid metabolism and vascular cell physiology. Vasc Endovas Surg 2016;50:107-18.
2.  Fawwad A, Sabir R, Riaz M, Moin H, Basit A. Measured versus calculated LDL-cholesterol in subjects with Type 2 Diabetes. Pak J Med Sci 2016;32:955-60.
3.  
4.  Nauck M, Warnick GR, Rifai N. Methods for measurement of LDL-cholesterol: A critical assessment of direct measurement by homogeneous assays versus calculation. Clin Chem 2002;48:236-54.
5.  Nakamura M, Kayamori Y, Iso H, Kitamura A, Kiyama M, Koyama I, et al. LDL cholesterol performance of beta quantification reference measurement procedure. Clin Chim Acta 2014;431:288-93.
6.  Miller WG, Myers GL, Sakurbayashi I, Bachmann LM, Caudill SP, Dziekonski A, et al. Seven direct methods for measuring HDL and LDL cholesterol compared with ultracentrifugation reference measurement procedure. Clin Chem 2010;56:977-86.
7.  Agrawal M, Spencer HJ, Faas FH. Method of LDL cholesterol measurement influences classification of LDL cholesterol treatment goals: Clinical research study. J Investig Med 2010;58:945-9.
8.  Sahu S, Chawla R, Uppal B. Comparison of two methods of estimation of low density lipoprotein cholesterol, the direct versus Friedewald estimation. Ind J Clin Biochem 2005;20:54-61.
9.  Nigam PK. Calculated low density lipoprotein-cholesterol: Friedewald's formula versus other modified formulas. Int J Life Sci Med Res 2014;4:25-31.
10.  Martin SS, Blaha MJ, Elshazly MB, Toth PP, Kwiterovich PO, Blumenthal RS, et al. Comparison of a novel method vs the Friedewald equation for estimating low-density lipoprotein cholesterol levels from the standard lipid profile. JAMA 2013;310:2061-8.
11.  Krishnaveni P, Gowda VM. Assessing the validity of Friedewald's formula and Anandraja's formula for serum LDL-cholesterol calculation. J Clin Diagn Res 2015;91:BC01-4.
12.  Gupta S, Verma M, Singh K. Does LDL-C estimation using Anandaraja's formula give a better agreement with direct LDL-C estimation than the Friedewald's formula? Int J Clin Bioche 2012;27:127-33.
13.  Singh N, Kumar BJP, Thimmaraju KV, Kumar KK, Sharma B, Kumar A. Anandaraja formula or Friedewald formula, which is a better formula for calculating LDL cholesterol in comparison with direct LDL-measurement by homogenous assay method. Int J Contemp Med Res 2017;4:229-31.
14.  Sampson M, Ling C, Sun Q, Harb R, Ashmaig M, Warnick R, et al. A new equation for calculation of low-density lipoprotein cholesterol in patients with normolipidemia and/or hypertriglyceridemia. JAMA Cardiol 2020;5:1-9.
15.  Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10.
16.  Bachorik PS, Ross JW. National cholesterol education program recommendations for measurement of low-density lipoprotein cholesterol: Executive summary. The National Cholesterol Education Program Working Group on Lipoprotein Measurement. Clin Chem 1995;41:1414-20.
17.  Kapoor R, Chakraborty M, Singh N. A leap above Friedewald formula for calculation of low-density lipoprotein-cholesterol. J Lab Phy 2015;7:11-6.
18.  Dansethakul P, Thapanathamchai L, Saichanma S, Worachartcheewan A, Pidetcha P. Determining a new formula for calculating low-density lipoprotein cholesterol: Data mining approach. EXCLI J 2015;14:478-83.
19.  Kanani DN, Mishra A. Comparison of different estimated formulas with direct estimation of low density lipoprotein cholesterol. Ind J Med Biochem 2017;21:151-6.
20.  Sridevi V, Anand V, Mahendrappa SK. Comparison of Friedewald's and Anandaraja's formula with direct estimation of low-density lipoprotein cholesterol in Shivamogga population. Int Arc Integra Med 2016;37:120-31.
21.  Kannan S, Mahadevan S, Ramji B, Jayapaul M, Kumaravel V. LDL-cholesterol: Friedewald calculated versus direct measurement-study from a large Indian laboratory database. Ind J Endocr Metab 2014;18:502-4.
22.  Knopfholz J, Disserol CC, Pierin AJ, Schirr FL, Streisky L, Takito LL, et al. Validation of the Friedewald formula in patients with metabolic syndrome. Cholesterol 2014;2014:261878. doi: 10.1155/2014/261878.
23.  Sudha K, Prabhu KA, Hegde A, Marathe A, Kumar KA. Effect of serum triglycerides on LDL estimation by Friedewald formula and direct assay: A laboratory based study. Int J Biomed Res 2015;6:189-91.
24.  Nanda SK, Bharathy M, Dinakaran A, Ray L, Ravichandran K. Correlation of Friedewald's calculated low-density lipoprotein cholesterol levels with direct low-density lipoprotein cholesterol levels in a tertiary care hospital. Int J App Basic Med Res 2017;7:57-62.
25.  Chen Y, Zhang X, Pan B, Jin X, Yao H, Chen B, et al. A modified formula for calculating low-density lipoprotein cholesterol values. Lipids Health Dis 2010;9:52.
26.  Pradhan S, Gautam K, Pyakurel D. Comparison of calculated LDL-cholesterol using the Friedewald formula and de Cordova formula with a directly measured LDL-cholesterol in Nepalese population. Pract Lab Med 2020;20:e00165.
27.  Palmer MK, Barter PJ, Lundman P, Nicholls SJ, Toth PP, Karlson BW. Comparing a novel equation for calculating low-density lipoprotein cholesterol with the Friedewald equation: A VOYAGER analysis. Clin Bio 2019;64:24-9.
28.  Huchegowda R, Kumawat R, Goswami B, Lali P, Gowda SH. Derivation of a new formula for the estimation of low-density lipoprotein cholesterol. Ind J Health Sci Biomed Res 2019;12:223-7.
