Abstract: How does amino acid sequence encode stability? To predict protein thermostability, we used an approach called Direct Coupling Analysis (DCA). The basic idea behind DCA is that we collect protein sequences from different organisms and line them up so that the first column of the sequence alignment contains the first amino acid of each protein, the second column contains the second amino acid, and so on. This shows us how a protein has varied across its evolutionary history. During my fellowship program, I collected around 20,000 Dihydrofolate Reductase (DHFR) sequences. Dihydrofolate Reductase is an enzyme that plays an important role in the synthesis of tetrahydrofolate cofactors. If the first letter is “Q” throughout the sequence alignment, that suggests “Q” plays a crucial role in protein function. If the 10th amino acid can be any of the 20 amino acid letters, the 10th position is likely less important for function. By doing the same analysis across all protein positions and all pairs of positions, using statistical techniques and programming, we can construct a probabilistic model to see which sequences are most likely to encode a DHFR and which are not.
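The column-by-column idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the project: the four-letter sequences are made up (they are not real DHFR data), the function names are hypothetical, and mutual information is used here as a simple stand-in for pairwise statistics. Actual DCA goes further by fitting a global probabilistic model over all positions and pairs.

```python
import math
from collections import Counter

# Toy alignment (hypothetical sequences, NOT real DHFR data).
# Every row has the same length, so column i holds position i of each protein.
alignment = [
    "QAKL",
    "QGKV",
    "QSRL",
    "QTRV",
]

def column(alignment, i):
    """Return the amino acids found at position i across all sequences."""
    return [seq[i] for seq in alignment]

def column_entropy(alignment, i):
    """Shannon entropy (bits) of column i; 0 means fully conserved."""
    counts = Counter(column(alignment, i))
    total = len(alignment)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j.

    A simplified pairwise statistic, used here as an assumption-laden
    stand-in for the pair couplings that DCA actually infers.
    """
    total = len(alignment)
    ci = Counter(column(alignment, i))
    cj = Counter(column(alignment, j))
    cij = Counter(zip(column(alignment, i), column(alignment, j)))
    mi = 0.0
    for (a, b), n in cij.items():
        p_ab = n / total
        mi += p_ab * math.log2(p_ab / ((ci[a] / total) * (cj[b] / total)))
    return mi

# Per-position variability: low entropy suggests a conserved position.
entropies = [column_entropy(alignment, i) for i in range(len(alignment[0]))]
for i, h in enumerate(entropies):
    print(f"position {i + 1}: entropy = {h:.2f} bits")
```

In this toy alignment the first column is always "Q", so its entropy is zero, while the second column has a different letter in every sequence and therefore the highest entropy. Columns 2 and 3 co-vary (the letter in column 3 is determined by the letter in column 2), which shows up as non-zero mutual information between them.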
What Does Research Mean To You?
Research to me is discovering new things while disproving falsehoods and advocating for the truth. Research to me means improving the quality of life, which is also a way of giving back to society. Without research, we could not say that we are close to finding a cure for cancer; without research, we wouldn’t be traveling in planes or cars; and lastly, without research, we wouldn’t have a vaccine for COVID-19. The Green Fellowship program has taught me to be curious, to never hesitate to ask questions, and to completely immerse myself in discovering everything there is to know.
Tell Us About Your Journey
The healthcare/life sciences industry has always been my primary focus, as I was raised in a household of scientists, doctors and professors. The healthcare sector generates huge amounts of data that need to be analyzed, such as patient details and health insurance records. Through data analytics, we can analyze how customers shop and stock the right items accordingly, at a level that decreases the organization’s accession expenses. Incorporating my statistics knowledge into the healthcare field is something I've always wanted to do. Hence, I chose to work in Dr. Reynolds's lab, which is a computational biology lab, and I learned how to combine programming and statistical models to predict the thermostability of proteins. I did not come in with a strong biology background, as I majored in mathematics; however, with the support of my mentor and fellow lab mates, I was able to grasp the domain knowledge and complete my project.
How Did the Pandemic Affect Me?
Program-wise, the pandemic didn’t affect me much, as most of my work was computational. However, not being able to go to the lab and having to stay in my room all day was sad. I missed interacting with my mentor and lab mates, and especially our lab lunches.
Where am I now?
I am currently working as a Business Analyst at Viatris, a pharmaceutical company. I am going to Carnegie Mellon to pursue a Master's in Healthcare Analytics (Fall 21).
Advice for Future Green Fellows
Do not hesitate to apply. You aren’t going to lose out on anything. There’s no application fee. I know interviews can be nerve-racking; however, just be confident and speak your mind.
Once you get in, the most important thing is to NETWORK, NETWORK & NETWORK. I can’t stress how important it is. Plus, do not hesitate to ask questions. Also, do not get overwhelmed by the number of research papers you’ll have to read. Try understanding 10% of a paper and you are there. Take it step by step.
Explore the campus; it's beautiful. Feel free to reach out to any of us. Good luck and make the most of the program.