Publishing Python packages related to bioinformatics on bioconda

Have you ever struggled to install a package, running a gazillion commands just to get its dependencies in place? If you are lucky (which most of the time you won’t be), you will end up installing the package without any dependency issues or version conflicts. Working in interdisciplinary sciences has made me aware of how hard it is to get these tools to run unless you know what is actually happening from a programming point of view. You wish these tools came bundled with all their dependencies and could be installed and run without conflicting with what you already have installed.


Marker Genes and Gene Prediction of Bacteria

When we think of the word "marker", the first thing that comes to mind is something used to indicate a place. For example, it can be your current location on Google Maps, or the spot where you planted some seeds in your garden. Similarly, in genomics studies, we can find marker genes in bacterial genomes. In this article, I will introduce you to the marker genes used in metagenomics analysis, explain how they are used, and walk you through an example with a commonly used gene prediction tool.


Assessing the Quality of Genome Assemblies using QUAST

The assembly algorithms developed so far aim to produce better assemblies, as judged under different criteria. Hence, depending on the scenario, the assembly process may produce better results if we choose the most appropriate assembler. Even when a complete contiguous genome cannot be produced, existing assembly methods can still recover segments of the reference genomes. We therefore need a way to evaluate the quality of assemblies. These evaluations help researchers pick the right assembler for each scenario.


Software Tools for Reference-free Binning of Metagenomes

We know that there are trillions of microbes in the environment surrounding us, and even in our bodies. These microscopic communities form very diverse ecosystems, and by studying their composition and behaviour we can learn a lot about them. If you have come across my previous article, "Metagenomics — Who is there and what are they doing?", then you know that binning is an important step in metagenomics analysis.


Metagenomics — Who is there and what are they doing?

Did you know that your body houses about 100 trillion bacteria? Estimates show that a human has approximately a pound or two of bacteria living in their gut (digestive tract) [1]. (Now, don’t go and take every antibiotic you know of to kill those bacteria; in fact, they play an important role in our metabolism and immune system.) The same goes for the backyard of your house. Many species of bacteria can live in the soil, and they help to enrich it (e.g., nitrifying bacteria produce nitrates, which are essential for plants). These microscopic communities form very diverse ecosystems, and studying their composition and behaviour can provide us with valuable insights. In this article, I will provide a basic introduction to metagenomics, which is the study of genetic material obtained from microbial communities.


Molecular Phylogenetics using Bio.Phylo

Have you ever wondered how life formed from the primordial soup and evolved into the different life forms we see at present? How did different species evolve from their ancestors, and what relationships do they have with each other? These questions can be answered through the study of phylogenetics.


Multiple Sequence Alignment using Clustal Omega and T-Coffee

Have you wondered how scientists identify regions of similarity in three or more biological sequences? As described in my previous article, sequence alignment is a method of arranging sequences of DNA, RNA, or protein to identify regions of similarity. In my latest article on bioinformatics, I discussed pairwise sequence alignment. Make sure to check those articles out as well. Multiple sequence alignment is quite similar to pairwise sequence alignment, but it uses three or more sequences instead of only two.


Bioinformatics Workflow Management Systems: Introducing Unipro UGENE to model Bioinformatics Workflows

With the development of various methods to obtain data from living beings, there has been an explosion of readily available biological data. However, such vast amounts of data are of no use without a proper way to run the series of steps that manipulates the data and produces the desired results. This is where Workflow Management Systems come in handy.


Pipeline Frameworks for Genomic Data

Yesterday I was returning home from university via the expressway, and the oil refinery at Sapugaskanda caught my eye. The refinery towers send huge flames and smoke into the sky as they operate. The sight reminded me of the pipelines used in many manufacturing and transportation industries to transport materials and transform them into outputs along the way. One common example is an oil pipeline, which carries oil over long distances while intermediate units refine it into various petroleum products.

Similarly, genomic data can be passed through special software pipelines that refine and analyze it as required, producing the desired visualizations and interpretations.


Starting off in Bioinformatics — DNA Nucleotides and Strands

In my first article, where I introduced bioinformatics, I mentioned that we will be learning a lot about DNA, RNA and protein sequences. Since I’m new to all this DNA/RNA jargon, I decided to learn about it first and then try out some coding problems. All the sequencing problems seem to use terms from genetics. So first things first, let’s get started. 😊
