Follow up on “Proper upbringing vs negligence of computational students in biomedical research environments”

So it turns out my previous post struck a chord with quite a few readers. Some of them contacted me directly to comment and share their stories. I decided to include three. The first is an update I previously posted shortly after a senior PI misinterpreted my post. The other two came later and reflect personal experiences which I think are important to share, both for the PIs out there who think there is no problem and, of course, for the aspiring compbio students/postdocs who read this. I think those stories speak for themselves. It’s also interesting to see comments and feedback coming from basically all over the world, so this seems to be somewhat of a universal problem in our field. Of note, those stories are not about evil PIs exploiting students (the wrong impression the senior PI from the first update was worried about), but rather about various forms of negligence and nonsupportive environments, which is what I was actually describing. For obvious reasons, I removed all details that may identify the people involved (and got prior authorization to post the stories here).

PREVIOUS UPDATE – ORIGINALLY POSTED 1/16/2017:

It seems this post got a lot of views but was also misinterpreted by some who got back to me with legitimate concerns and criticism. Specifically, a senior PI wrote me that they read this as “data generation labs are exploiting the students”. That was never my intention. Let me clarify, and I’ll use Penn’s GCB graduate group to make the point. GCB stands for “Genomics and Computational Biology”. I think the creators of GCB were wise to define it as such. It means GCB caters to a wide range of students who want to get exposed to “real life data/problems”. Some are more into methods development to derive hypotheses (hence “Computational Biology”), others are more into actually generating the data and analyzing it themselves (hence “Genomics”). These are crude distinctions of course, but the point is that not every student is interested in methods development and not every student requires co-advising. And sometimes a student may need co-advising/collaboration for a specific project/problem and that’s all. As the PI rightfully wrote me, “there is no one size fits all”. Indeed. And students who are becoming experts in a certain field while using/producing genomic data are not “exploited.” As that PI wrote me: “I’d be better off hiring a good bioinformatician than taking on an untrained grad student who typically needs close supervision and mentorship.” That’s a fair point. My worry, and what sparked this blog in the first place, is with students who want to do more “methods development” at some level and do not get to do that because (a) they haven’t realized that’s what they actually want to do, (b) they did not articulate it (see my suggestions above), or (c) the system/lab they are in does not support it.

PERSONAL STORY 1:

Your recent blog post strikes a deep personal chord with me because, during grad school, I was one of the “computationally oriented students [that were] basically used as in-house bioinformaticians to solve the bioinformatics needs of a data generating lab”.
Before we go on, I should say that my grad advisor is a very nice person, excellent scientist, absolutely looks out for me, and we have a great ongoing relationship. So, this is definitely a classic case of asking the student (me) to “go explore and tell me what you may want to do” “[w]ith all the good intentions.”
I joined a genetics lab with a lot of interest in computational biology but, being a naive undergrad, I did not realize that, although the science was really cool, my advisor would not be able to advise me on the computational aspects of my work. After I started my work, this slowly dawned on me as problems were posed to me and I was asked to “solve” them without being given any starting point or subsequent guidance. This was still my first year and I found it very hard to cope with.
I struggled day and night to find relevant papers & reviews, read them end-to-end, read online tutorials, improve both my programming and analysis skills, and started working on the given problems. Then, I started seeking out other bioinformatics/computational-biology faculty on campus to interact with and attend journal clubs with, and I was also doing my best to identify one of them to be my co-advisor.
But, the latter – engaging with other computational faculty – was not easy at all due to complicated politics from all parties involved: my advisor only wanted to ‘collaborate’ and did not want me to be partially subsumed into another faculty’s group, distracting me from my main work; he also did not have good experience/relationship with a few bioinformatics faculty whom he wanted to work with, and so, he decided to “grow the expertise” in his own lab and liked to tout that he had in-house bioinformatics capabilities.
I survived by working very hard, making hundreds of mistakes, interacting with folks far and wide across the campus, and finding a couple of “shadow” mentors whom I could go to for general guidance when things really were not looking well. Along the way (just like you pointed out), I also managed to mildly enjoy the part of being in a lab that was generating data and interacting closely with experimental biologists, both helping me tremendously in my scientific development.
So, in spite of my survival and subsequent success, I couldn’t agree more with your post. Now, as a faculty member myself, I cannot emphasize enough the value of “advice”, “training”, “guidance”, and “well-rounded professional growth” for my students, and I’m committed to “improving the upbringing of our future generation of scientists”.
Thank you for your post. This is a super-important issue and I’m glad you brought it up.

PERSONAL STORY 2:

Your recent post on computational training has touched me deeply.
I have read this almost 50 times and this completely echoes my sentiments.
We all acknowledge the misuse of computational trainees as in-house bioinformaticians, but your post also talks about the “benign form of negligence”, i.e. not knowing what to do with a computational trainee.
I am currently in the same situation, figuring out what to do next. Unfortunately, most people never realize this problem until it is too late.
Thank you again for this post.

Proper upbringing vs negligence of computational students in biomedical research environments

So, today I want to write about a topic I feel strongly about which is how we raise the next generation of computational biologists.

To start with, I think that in many ways we have made great progress compared to the state of affairs when I started my graduate studies: There is a much better understanding of what it is students should actually know, and there are dedicated courses, books, online material, etc.

I also want to emphasize that I’m not advocating that computational students do not train/work in biomedical environments. Unless what you really want is to do only CS/Math, you may miss out *a lot* in terms of real life data/problems (domain specific data science if you will), how biologists think about problems (quite different I tell you, and there is a lot to learn there!), or thinking about the next set of problems/challenges to tackle. Not to mention that the cutting-edge biomedical research you get exposed to can be absolutely fascinating even when no computational problems are involved!

But I’m not here to discuss all that, but rather the not uncommon situation where computationally oriented students are basically used as in-house bioinformaticians to solve the bioinformatics needs of a data generating lab. And sure, I understand it’s not black and white: there is great value in getting your hands dirty with real data, and it’s important to help each other, be a good citizen, etc. That’s not what I’m talking about. I’m talking about students with computational aspirations who end up doing all the bioinformatics work in the lab because (a) it’s really needed, (b) they can, and (c) they are much cheaper and easier to get than a bioinformatician. Sure, these students may end up on great papers representing great science from great labs. But I argue that’s not enough, and that it cannot be an excuse. Why? Because they come to *train* and it’s our responsibility to train them. And if you think that just by making them solve your bioinformatics problems you are giving them proper computational training, you are *wrong*: They will not necessarily develop the technical skills in algorithms, proper coding, data analysis, thinking about computational modeling, and many more things they should be getting. And don’t tell me that the fact they are coming out to a market that will now snatch them up is enough. Because if they have the proper training they can easily grow, do something else entirely, etc. But if they don’t, then they are much more likely to get stuck at a lower level and not mature into the independent compbio researchers who are sought after in academia/industry.

I should also mention a “lighter form” of negligence: When a PI gets a highly computational student but does not necessarily know how to guide her. With all the good intentions this results in “go explore and tell me what you may want to do.” It sounds great in theory, but the problem is that (a) these students commonly lack a strong biomedical base and (b) even if they are computationally savvy they don’t know how to actually translate something they hear/read about to a computationally framed problem. They often don’t even know what questions to ask.

Naturally, I meet many researchers during my work, and some PIs acknowledge the problem. I talked to one such senior PI in a meeting last summer who told me: “you are right. They are desperately needed in the labs, we try to make the best of it, but I know it’s not always good for them”. But not all are like that. I had a quite different exchange with another senior PI. During a social event, the conversation drifted to this, and I said it’s a problem we need to deal with. She said it’s totally fine (using the argument above about having job offers). I reiterated our obligation to train them properly computationally and said that otherwise we are not doing it right. At which point she said, jokingly, “Well, you are lucky I’m not on your tenure committee.” I could not agree more, and joke or not, I don’t like that kind of humor [1]. Regardless, I see that “everything is fine” answer as representative of an all-too-common approach in biomedical research labs.

So what should we do? There are several things I can think of:

  1. As an institution/graduate program: Make the effort to have computational students be advised properly. So if the PI is not up to it/interested, get a co-advisor [2] and make sure computational skills development is on the student’s to-do list.
  2. As a student:
    1. Same as (1) above regarding skill development and/or co-advising.
    2. Before that, think carefully about which institute/program/lab you want to spend your time in. Think about what it is you actually want, ask questions, shop around. Maybe do research in a lab for a year to get the hang of it and see for yourself before you commit for 5 or so years.
    3. Be Proactive – do not just count on your program/mentor/whatever to take good care of you/your interests. Maybe your interests are not theirs, or not high enough on their priority list, or they are just too busy or do not know any better. We are brought up in a system where we follow what the teachers tell us, get good grades, and constantly look for their approval. Ph.D. students are in a period where they are still training but also transitioning towards independence, the job market, etc. You should still focus on doing a great job, but don’t blindly follow everything else.

The above points also relate to some of my previous posts about finding yourself a good mentor (or Sensei…). At the very least if we all become more aware of this issue I think there is a good chance of improving the upbringing of our future generation of scientists.

UPDATE 1/16/2017:

So, it seems this post got a lot of views but was also misinterpreted by some who got back to me with legitimate concerns and criticism. Specifically, a senior PI wrote me that they read this as “data generation labs are exploiting the students”. That was never my intention. Let me clarify, and I’ll use Penn’s GCB graduate group to make the point. GCB stands for “Genomics and Computational Biology”. I think the creators of GCB were wise to define it as such. It means GCB caters to a wide range of students who want to get exposed to “real life data/problems”. Some are more into methods development to derive hypotheses (hence “Computational Biology”), others are more into actually generating the data and analyzing it themselves (hence “Genomics”). These are crude distinctions of course, but the point is that not every student is interested in methods development and not every student requires co-advising. And sometimes a student may need co-advising/collaboration for a specific project/problem and that’s all. As the PI rightfully wrote me, “there is no one size fits all”. Indeed. And students who are becoming experts in a certain field while using/producing genomic data are not “exploited.” As that PI wrote me: “I’d be better off hiring a good bioinformatician than taking on an untrained grad student who typically needs close supervision and mentorship.” That’s a fair point. My worry, and what sparked this blog in the first place, is with students who want to do more “methods development” at some level and do not get to do that because (a) they haven’t realized that’s what they actually want to do, (b) they did not articulate it (see my suggestions above), or (c) the system/lab they are in does not support it.


[1] This reminded me of a joke my father always liked to tell when I was little: Two guys pass each other on the street. The big guy suddenly slaps the little guy out of nowhere. The little guy looks at him intensely and says: “What was that? Was that a joke or something?” To which the big guy replies: “No, I was serious.” “Oh, you’re lucky then,” says the little guy, “because I really don’t like that kind of humor.”

[2] Co-advising is a solution used in GCB [Genomics and Computational Biology] here at Penn. I was fortunate to be co-mentored through some of my PhD and it was instrumental during my postdoc years.