Keloids are protrusive claw-like scars that have a propensity to recur even after surgery, and its molecular etiology remains elusive. The goal of reverse engineering is to infer gene networks from observational data, thus providing insight into the inner workings of a cell. However, most attempts at modeling biological networks have been done using simulated data. This study aims to highlight some of the issues involved in working with experimental data, and at the same time gain some insights into the transcriptional regulatory mechanism present in keloid fibroblasts.
Microarray data from our previous study was combined with microarray data obtained from the literature as well as new microarray data generated by our group. For the physical approach, we used the fREDUCE algorithm for correlating expression values to binding motifs. For the influence approach, we compared the Bayesian algorithm BANJO with the information theoretic method ARACNE in terms of performance in recovering known influence networks obtained from the KEGG database. In addition, we also compared the performance of different normalization methods as well as different types of gene networks.
Using the physical approach, we found consensus sequences that were active in the keloid condition, as well as some sequences that were responsive to steroids, a commonly used treatment for keloids. From the influence approach, we found that BANJO was better at recovering the gene networks compared to ARACNE and that transcriptional networks were better suited for network recovery compared to cytokine-receptor interaction networks and intracellular signaling networks. We also found that the NFKB transcriptional network that was inferred from normal fibroblast data was more accurate compared to that inferred from keloid data, suggesting a more robust network in the keloid condition.
Consensus sequences that were found from this study are possible transcription factor binding sites and could be explored for developing future keloid treatments or for improving the efficacy of current steroid treatments. We also found that the combination of the Bayesian algorithm, RMA normalization and transcriptional networks gave the best reconstruction results and this could serve as a guide for future influence approaches dealing with experimental data.