Stem cells are undifferentiated biological cells that can develop or transform into specialized cells

Stem cells are undifferentiated biological cells that can develop or transform into specialized cells. The differentiation of these cells into more specialized units depends on many factors, some of which are still unknown or not fully understood. The differentiation process can be studied by applying statistical and data mining techniques to the RNA found within the cells.
Recent studies highlight the significance of RNA sequencing for identifying and explaining the factors behind the development of certain diseases and medical conditions, and they support further research on possible pathways or treatments for these conditions. RNA sequencing data from leukemia patients, together with data obtained from other sources, have been used successfully to identify endogenous control genes. These studies demonstrated the variability of the genes that serve as a standard for appropriate control, identified more specific control genes across different kinds of cancer and normal tissue types, which facilitates the accurate and rapid identification of cancer, and confirmed the effectiveness of a selection of genes in a technique used by laboratories 5. Gene expression data from microarray studies of brain tissue from patients with Parkinson’s disease have also been analyzed to demonstrate the complexity of chronic brain disease, and from these data the authors proposed potential pathways and a treatment strategy for Parkinson’s disease. Furthermore, Parkinson’s disease, schizophrenia and Alzheimer’s dementia appear to share some common genetic networks, which may be identified later through deeper analysis of their RNA data 6. Similarly, microarray samples and RNA sequencing data derived from the brain and blood of patients with Alzheimer’s disease have been analyzed, providing new insights into the underlying gene regulatory dynamics of the disease 7. These studies show how RNA sequencing and differential gene expression analysis can help explain a large number of clinical conditions. Studying the behavior and the stages of development and differentiation of cells into their different lineages can help prevent or treat a wide variety of illnesses.
The objective of this work is to analyze RNA sequences using statistical tools available for the R language and to understand the steps of this process, in order to reproduce previous research studies and later propose more detailed information that may be helpful for further clinical and research applications. In this research I intend to reach an accurate classification of cells according to the function and pathway they follow during the differentiation process, as well as an identification of the genes and factors involved in the differentiation of stem cells into the lymphoid lineage. To accomplish this goal, several tools and concepts will be studied and put into practice: FASTQ files, BAM files, RNA quality control (FastQC), read trimming, single-end and paired-end reads, the HTSeq package, BioStar tools and packages, and R coding.
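As an illustration of two of the preprocessing concepts listed above, FASTQ records and quality trimming, the following is a minimal sketch and not the pipeline used in this work. It is written in Python purely for illustration (the study itself relies on R and dedicated tools such as FastQC). It assumes the common four-line FASTQ layout with Phred+33 quality encoding; the function names, the quality threshold of 20, and the two example reads are hypothetical.

```python
import io

def parse_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:          # end of file (or a blank line) stops the parser
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()       # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual   # drop the leading '@' from the identifier

def phred_scores(qual, offset=33):
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]

def trim_3prime(seq, qual, min_q=20):
    """Remove bases from the 3' end while their quality is below min_q (assumed threshold)."""
    scores = phred_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]

# Hypothetical example reads, not real data.
EXAMPLE = "@read1\nACGTACGT\n+\nIIIIII#!\n@read2\nGGGTTT\n+\n!!!!!!\n"

records = list(parse_fastq(io.StringIO(EXAMPLE)))
trimmed = [(rid, *trim_3prime(s, q)) for rid, s, q in records]

for rid, s, q in trimmed:
    print(rid, s or "<empty after trimming>", q)
```

In this toy input the last two bases of `read1` have low quality and are removed, while `read2` consists only of low-quality bases and is trimmed to an empty read, which a real pipeline would normally discard.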
The importance of research on this topic lies in the many applications and solutions that could be developed for pathologies and diseases in practically every field of medicine (cardiology, neurology, orthopedics, ophthalmology, hematology, etc.). Once we understand the behavior of the factors and proteins involved in the differentiation process, the conditions it requires, and the pathways followed through the variety of lineages, these processes could be mastered and controlled in order to prevent, treat or cure diseases such as Parkinson’s, Alzheimer’s, leukemia, cancer and others. There is currently a large body of research on RNA analysis, and new tools, techniques and methods for studying the development and pathways followed by stem cells become available every day. As a matter of good practice, and in order to achieve good results, this research takes into consideration all the steps that should be present in any data mining process. To obtain knowledge from data, such a process should cover the following stages: Data Cleaning, Data Integration, Data Selection, Data Transformation, Data Mining, Pattern Evaluation, and Knowledge Presentation.
Based on the results of other studies, a specific procedure is planned for this research. First, information should be obtained from trustworthy online sources and databases. This raw information should then be processed with high-throughput sequencing analysis tools and converted into a form more convenient for subsequent analysis with R and related software tools 8.
To find the corresponding pathways and understand how cell differentiation works, it is possible to analyze the genome of stem cells. Therefore, the objective of this research is to follow the process used to analyze RNA sequencing data and to identify or propose possible novel genes related to the differentiation of stem cells into the different lymphoid cells. This could be achieved through statistical analysis of the information contained in the sequencing data of a given species. Clustering and heat maps are the graphical data mining tools expected to be used in this study to interpret the data and reach the final results. However, before reaching that point, many other tools must be used to preprocess and prepare the data.
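To make the clustering step mentioned above more concrete, the following is a minimal sketch of agglomerative (hierarchical) clustering of gene expression profiles. It is written in Python for illustration only, whereas the study itself uses R. The choice of Pearson correlation distance and average linkage is an assumption, since the text does not specify a method, and the gene names and expression values are invented.

```python
from math import sqrt

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length profiles.

    Profiles with zero variance are not handled in this sketch.
    """
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / sqrt(va * vb)

def average_linkage(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        pairs = [(g1, g2) for g1 in c1 for g2 in c2]
        total = sum(pearson_distance(profiles[g1], profiles[g2]) for g1, g2 in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]   # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical expression values across four samples (not real data).
TOY = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.2],
}

result = sorted(average_linkage(TOY, 2))
print(result)
```

Genes whose expression rises together across samples end up in one cluster and genes whose expression falls end up in the other. In a heat map, rows would typically be reordered according to such clusters so that similar profiles appear next to each other.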