
Posts

DNA Sequence Alignment and Visualization with "SequenceAlignment" Package

In bioinformatics, sequence alignment plays a crucial role in comparing biological sequences, especially DNA sequences. It helps in identifying similarities, differences, and evolutionary relationships between sequences. In this blog, we’ll explore how to use the SequenceAlignment R package for performing sequence alignments, visualizing the results with plots like barplots and heatmaps, and analyzing DNA sequences against multiple reference sequences stored in FASTA files.

What is Sequence Alignment?

Sequence alignment is the process of comparing two or more biological sequences (e.g., DNA, RNA, or proteins) to identify regions of similarity or difference. In DNA sequence alignment, the sequences are compared to see how closely they match, which can provide insights into genetic similarities, mutations, or evolutionary trends.

The SequenceAlignment Package

The SequenceAlignment package is a powerf...
Recent posts

Week 12: Creating an R Markdown File for Alignment Score Visualization

Creating the R Markdown file "Alignment Score Visualization" was a great learning experience. It allowed me to combine R code, text, and visuals in one document, which is both easy to read and share. Here’s what I learned and how it went:

Getting Started

At first, R Markdown felt a bit confusing because it combines text and code in one file. But once I understood the structure, it became clear how powerful it is. The key was using code chunks, which are sections of the file that run the R code. Each chunk starts with ```{r} and ends with ```. For my project, I wanted to display alignment scores for V and J regions in a table and a bar chart. Writing and organizing the code into chunks made it easy to run and test each part.

Challenges I Faced

Code Showing as Plain Text: When I first tried knitting the file, the HTML output only showed the code as plain text. I realized this was because I had included...

Week 11: Debugging Journey - Fixing the tukey_multiple Function

Introduction

In this post, I’ll explain how I fixed a bug in the tukey_multiple function in R. The goal of this function is to find rows in a dataset where all the values are outliers. Initially, I ran into errors, but I eventually found the problem and fixed it. Here’s how I did it.

Understanding the tukey_multiple Function

The function tukey_multiple is supposed to (1) check each column in a dataset to see if values are outliers, and (2) return a list that tells us whether each row has only outliers across all columns. Here’s the original code I started with:

Step 1: Initial Problem - Missing Function

When I tried to run the code, I got an error saying tukey.outlier wasn’t found. After checking, I realized that tukey.outlier isn’t a built-in function in R and wasn’t defined anywhere else in the code. This function is supposed to check if values in a column are outliers, so I knew I needed to write my own function to do that. Wh...

Week 10: "VDJ_Analysis: Package for Immune Receptor Alignment and Functional Junction Analysis"

Introduction to the VDJ_Analysis Package

In bioinformatics and immunology, studying immune receptor sequences, such as T-cell receptors (TCRs) and antibodies, is key to understanding how our immune system detects different pathogens. The immune system creates diversity through recombination of gene segments called V (variable), D (diversity), and J (joining) regions. Analyzing these segments and identifying functional (productive) junctions helps us better understand immune responses and diseases. To make this analysis easier in R, I am proposing the VDJ_Analysis package. This package would align immune receptor sequences, match them to known V and J regions, and evaluate junction productivity. It aims to provide researchers with an R-based tool that consolidates alignment scores, matched regions, and productivity assessments, simplifying immune receptor analysis.

Objectiv...

Week 9: Exploring Cancer Survival Data Visualization in R

In this assignment, I explored ways to visualize cancer survival data across different organs using a variety of R plotting methods, including base R’s barplot(), ggplot2, and an xyplot() with lattice. Here’s a breakdown of the journey, the challenges faced, and what I learned along the way.

The Data: Mean Survival Time by Organ

The dataset I worked with contains cancer survival times across different organs. To understand the average survival time for each organ, I first calculated the mean survival time by using the following code: Once I had the mean survival times, I set out to visualize the data using four different approaches, each with its unique set of functionalities and aesthetics.

1. Basic Bar Plot with Base R

My first plot used a simple barplot() to display the mean survival times. This method provided a quick and straightforward way t...

Week 8: Tackling Data Handling Challenges and Finding Solutions

This time I had the opportunity to dive deeper into R by using the plyr package to compute the mean of grades split by gender and export the results to a file. The task seemed straightforward: import a dataset, perform some basic operations, and output the result. However, as with most programming journeys, I encountered a few hurdles along the way, leading to a wealth of learning.

Step 1: Importing the Dataset

The first task was to import a dataset into R. I used the read.table() function, which reads the file in a tabular format. Initially, the command worked well, but I did face a minor challenge when choosing the right separator for the CSV file (sep=","). This was an easy fix once I realized the file used commas to separate values. Here's the command that worked:

Lesson learned: Always double-check the file format and ensure the separator used in the file is correctly specified.

Step 2: Calculatin...

Object-Oriented Programming in R: Challenges and Insights from the "Murders" Dataset

As part of my recent assignment on Object-Oriented Programming (OOP) in R, I delved into applying both the S3 and S4 object systems to the "murders" dataset from the dslabs package. Through this experience, I encountered some interesting challenges, particularly with S4 objects, and learned a lot about the flexibility and formal structure that R's object systems offer. The dataset provided data on murder rates across U.S. states, and my task was to determine how these object-oriented systems can be applied, test the use of generic functions, and explore key concepts like object classes, slots, and methods. Here’s a reflection on what I learned and the hurdles I faced along the way.

Assigning Generic Functions to the Murders Dataset

In R, generic functions like summary() and print() are widely used to extract basic information about objects, especially data frames. Since ...