We know that there are trillions of microbes in the environment around us, and even inside our bodies. These microscopic communities form very diverse ecosystems, and by studying their composition and behaviour we can learn a lot about them. If you have come across my previous article, "Metagenomics — Who is there and what are they doing?", then you know that binning is an important step in metagenomics analysis.
Have you ever wondered how life formed from the primordial soup and evolved into the different life forms we see at present? How did different species evolve from their ancestors, and what relationships do they have with each other? These questions can be answered through the study of phylogenetics.
Have you ever wondered how scientists identify regions of similarity in three or more biological sequences? As described in my previous article, sequence alignment is a method of arranging sequences of DNA, RNA, or protein to identify regions of similarity. In my latest article on bioinformatics, I discussed pairwise sequence alignment. Make sure to check them out as well. Multiple sequence alignment is quite similar to pairwise sequence alignment, but it uses three or more sequences instead of only two.
With the development of various methods to obtain data from living beings, there has been an explosion of readily available biological data. However, such vast amounts of data are of little use without a proper way to run the series of steps that transforms the data into the desired results. This is where Workflow Management Systems come in handy.
Yesterday I was returning home from university via the expressway, and the oil refinery at Sapugaskanda caught my eye. The refinery towers were sending huge flames and smoke into the sky. The sight reminded me of the pipelines used in many manufacturing and transportation industries to transport materials and transform them into outputs at the end. One common example is an oil pipeline, which carries oil over long distances while intermediate units refine it into various petroleum products.
Similarly, genomic data can be passed through special software pipelines that refine and analyze the data as required, producing the desired visualizations and interpretations.
In my first article, where I introduced bioinformatics, I mentioned that we will be learning a lot about DNA, RNA and protein sequences. Since I’m new to all this DNA/RNA jargon, I decided to learn about it first and then try out some coding problems. All the sequencing problems seem to contain some words related to genetics. So first things first, let’s get started. 😊
The word Bioinformatics is making quite a stir in today’s world of science. The word seems to be made up of two parts related to two different fields, biology and computer science. One or two decades ago, people saw biology and computer science as two entirely different fields. One studies living beings and their functions, whereas the other studies computers and the theories underlying them.