Wednesday, August 3, 2016

New Ventures: Babies, Stats and Python

When I created this blog, I set out to have a place online where I could share what I was learning, and I didn't feel the need to post unless I had something worthwhile to offer my audience. That's why you are seeing a new post 8 mos after the previous one, and why the previous one was a newsletter of what I find awesome in the field rather than a written blog post.

Since my last post a lot has changed and I am back after a very long hiatus to have a child, change jobs, move states and attempt to figure out the trajectory of my career from this point on. By way of updates:
  • My daughter is almost 11 mos old, awesome, adorable and sleeps through the night - which is great for my sanity. I don't talk personal life on this blog much if at all but if you are the nosy type - head over to my website. This blog is a professional journey; the website is a personal journey and is updated even less than this one - just saying. I'm pretty busy. Last post on that site was January 2016, same as this one.
  • My family has moved to accommodate a job my husband got. I mean seriously, why not? The man moved countries and states for me for nearly 6 years, it was only fair. And while I am still in adjustment mode it was a good choice for both of us.
  • I still work remotely for the research institute in Maryland doing data analysis and consulting. It's really a great setup for me right now. Gives me time with my daughter. At least after the adjustment period of having NOWHERE to go on a Monday morning and attempting to be productive at home, where household things that need to get done make me twitch as I try to focus and do analysis on the days tiny minion goes to daycare. I think I am getting better *twitch twitch*.
  • I am dipping my toes into the academic job market, seeing what's out there and developing my philosophies on research and teaching - so real deep soul wrenching stuff. Not really, but reflection is a great practice to have moving forward.
So where do I go from here?

I was looking forward to this 'sort of' break because I knew it would give me a chance to explore my field and how I want my career to play out. If you aren't constantly learning you are falling behind.
"Anyone who stops learning is old, whether at twenty or eighty. Anyone who keeps learning stays young" - Henry Ford
I am determined not to let the gray hairs that are starting to tease me in the mirror get the better of me.

So what does that mean?

I had two main professional goals - to get better at programming (in Python) and to refresh my abilities with stats, because I think the last time I seriously did any stats was at Montana State University in 200*-does it matter?! My knowledge is outdated. I also wanted to learn an open source stats program (I've used SPSS and SAS in the past).

Goal 1 - Python programming. I'm not too proud to admit I attempted a formal course around 2013 via edX administered through MIT...and I failed in fantastic form. I didn't fail the course, I ended up dropping it. At the time (i) I underestimated the challenge level of the course. I liked the challenge but with challenge comes the need to 'meet the challenge' which translates into time! (ii) Time was something I did not have. I was in the middle of building a viral genetics and bioinformatics section at my institute in Maryland, which was daunting to say the least, especially in light of a million regulations. I made it halfway through the course and had to drop it, as the demands of the job allowed me little time after work to do anything except go home, maybe eat, zone out for a half hour then pass out - I'd work runs into the mix when I could. So I conceded defeat, determined to conquer the course at a later date, because up until that point I was actually really loving the course. It was (and still is) a well-designed online course. I did my epidemiology and biostats certificate completely online; I learn well through an online format. So - now that I find myself with some free time I have re-enrolled in an attempt to actually conquer the course.

So I'm out to start improving my programming from the basics up. If you cannot embrace and learn from defeat you have no business heralding your successes.


Goal 2 - Stats. I actually really like stats despite the heavy math component that can get scary when visions of my high school AP calculus homework start invading my dreams and my stats textbooks. Am I familiar with stats? Yes. Am I a 'pro' with stats? No, but I used to be and I'd like to be again. I'd also like to update my stats programming from the ever so expensive SAS and SPSS to the open source (i.e. awesome as hell) R. You can do some beautiful stuff in R.

Oh wait...this is the part where my Python programmer/software engineer of a husband serves me divorce papers for not using Python pandas/matplotlib and other Python visualization tools - doh!

Who says I cannot learn both? Maybe I'll move exclusively to Python in the future but for right now in my field R really is the language of choice for data visualization...No! I won't sign those divorce papers. Just kidding, he really is a lovely supportive man who will tolerate my 'hobby' of R.

To this end I could've decided to just do another online course - meh. I think it's much more personal and awesome to instead try out a friend/colleague's textbook published in 2014. I'm really excited about it, and that's what you'll be hearing about in the coming blogs on this site - a walk through:

Foundational and Applied Statistics for Biologists Using R by Ken A Aho

I think I'm the perfect candidate to do a self study on this book and see how it goes. I have stats background and to be frank I really appreciated a sentence in his preface:
"Statistical texts and classes within biology curricula generally ignore or fail to instill foundational concepts...Unfortunately, this problem has been exacerbated by advances in statistical software. These tools do not require any knowledge of foundational principles. However, a poor understanding of the theory underlying the algorithms often leads to misapplication of analyses, misunderstanding of results, and invalid inferences."
How often have I 'preached' about using bioinformatics tools without knowledge to back up why you are using the tool you are using and are you using the best tool (especially in phylogenetics)? Do you even know what a model is? Do you know why you are using that model? You are using Geneious to do WHAT?!! Etc.

Bioinformatics is not about pressing buttons and black boxes. It is not about getting results and taking them without a grain of salt. If you are a researcher I would hope you want to know the fundamentals of a tool: what it is doing, what it can tell you when you run your data through it, and the caveats associated with it.

I am guilty of pressing buttons in stats, especially of late - time to back up and practice more of what I preach.

Disclaimer: This will not be a fast process. Clocking in at 554 pages, Ken's book is long. But I am looking forward to it. I cannot guarantee daily posts but I'll get through it! Tune into this blog if you are interested in the journey.

What about the newsletters?

Right - so I spent 3 years running newsletters around my institute via email then started posting some on this blog. I got generally favorable feedback. My newsletters will continue to appear on and I encourage you to head there. Though given my goals above I will not be updating them as often as I did before.


Also - is a pay-for service if you want multiple newsletters so I may reduce in the future as readership has been light for over a year now and the service isn't exactly cheap. Just a heads up.

I think that about sums it up.

One final note before I leave you - for those that are in a continual pursuit to improve their abilities in the world of bioinformatics. There are some great reads out there - and here's where I shamelessly plug my bioinformatics justice league - leaders in the field, they will make you think and they will hold you accountable. It makes us all better scientists.

For those that read this blog - thanks for your readership, I do hope it's helpful or interesting or both!

Next up - A Walk Through: Foundational and Applied Statistics for Biologists Using R (FASBuR) - and it's a cool acronym too!