What does it take to get a PhD?

“What does it mean for you to be/earn a PhD?” “What does it take?” I often pose these questions to students at various stages. I think this is an important question to ask yourself, before and during training (or when you train others), as the answer will affect the place/environment you choose and what you do during that time.

Today I want to focus on my answer to this question, in the hope that it will help current and future students assess their own way. I note that my answers are focused on research-heavy environments, naturally in the computational/CS/biomedical area, but some general principles apply nonetheless.

Anyone who has followed my blog even a little would know that, as its name implies, I tend to find connections between the Martial Arts and everyday life. In this case, it so happens that I earned my PhD about the same time I earned my black belt and could not help but see the similarities. For one, a black belt can take 5-10 years to get if you train seriously. Also, people from the “outside” think that if you have a black belt you are “a master”, but any serious practitioner knows that is hardly the case. You get your black belt when you finish your basic training, and then you set out on your way to *start* becoming a master. It’s a lifelong journey, and what actually matters is the Way, not the target. A PhD is pretty much the same, but what you are training for is to be an independent researcher.

But what does it actually mean to finish basic training to become an independent researcher? It means that if you are given a research question you can formulate it, develop methods to tackle it, assess their results/performance, iterate, and finally converge to a solution which you are then able to reproduce, write up in a scientific paper, and present. In order to do that you should establish knowledge of the field you work in (what has been done, the main problems, etc.) and of the techniques used (experimental, computational – it does not matter), and develop the ability to identify what you are missing and learn those missing components as part of the process. You should be able to critique your own work and others’, and express this clearly. This means, for example, that you should learn how to write reviews as a guided process with your PI, so that by the end of your PhD you can do it well without them. The observable outputs of this process are talks, papers, reviews, etc., but these are byproducts, not the goal – which is why I don’t like it when PhDs are defined by those.

Why is this important? Because if you adopt this view you can take a step back, look at your PhD and think: OK, what am I missing? What haven’t I done yet? What am I relatively weak at? Then start working towards these, so by the end of your PhD you have gone through the entire “checklist” and feel you can go through the research process by yourself. Too many students seem to be thinking instead along the lines of “I need to get X papers”. They look at the end-point (the black belt, the paper), not the Way, and therefore actually miss the point.

You might be asking yourselves at this point: what about other things? Working in a team? Managing/teaching/mentoring?

I think these are great to have, and I try to give my students practice in them, as I think it better prepares them for modern jobs wherever they choose to go. As a student, you can be proactive about it: look for opportunities, emphasize this to your PI/thesis committee to maximize exposure, etc. But strictly speaking, a PhD at its core is about the ability to execute independent research, not these.

So, what about “knowing the landscape of open problems” that you can consider going after? Learning how to define the questions in the first place? Getting funding?

My view is that these are all good things to learn/know, but again, they are not part of the PhD definition. After you finish your basic training you go on to a postdoc. As a postdoc, you keep learning more techniques/methods (possibly in new areas), and practice your ability to do independent research. By the end of that, you have proved yourself fully capable of independent research, gained knowledge and experience, built a track record, and developed the elements mentioned above (mentoring, a view of interesting problems, etc.), so now you are ready to become a PI. Networking and gaining experience in managing projects/people (and your time…;) are all a big plus. You should strive to build those capabilities to make your life easier regardless of whether you stay in Academia or not.

In practice, some come out of their PhD much more mature than others [1]. A good example is my friend and colleague, John Calarco. John was a PhD student when I was a postdoc at the Blencowe lab. He came out of his PhD with a dozen papers, including first-author papers in obscure journals such as Cell and Nature. Yes, I know it’s not a great metric, but the point is he came out of his PhD with a view, a vision of what he wants to do. He became a Bauer Fellow at Harvard, which meant running his own small lab for a limited time (4 years) with guaranteed funding, and recently got a PI position back at UofT. I see John as a great example of rare maturity at the end of a PhD. I, for one, was not even close to that when I finished my PhD. I did not even know what I wanted to do when I grow up (still working on that one… ;).  All I knew at the time was that *if* I ever wanted to become a PI (and that was a very big if), then I should do a postdoc. So I went looking for a good place for a postdoc (that’s a topic for another post). However, I did get good basic training at the dojo of my Sensei – I mean advisor – Nir Friedman. And by the way, as I noted in a previous post, “Sensei” does not mean “Master”; it literally translates as “the one who has come before”, pointing to the endless chain we create by training the next generation.

So, my view is that if at the end of your PhD you don’t have a comprehensive plan/view of research problems, how to get funding, etc., that’s OK. You should not stress about that. But make sure you get your basic training right. Generally speaking, you will *not* get that chance as a postdoc, where you are expected to deliver. Getting your base right is intimately linked to choosing an advisor who is a good fit – see a recent post about that from the great Eleazar Eskin here. Regardless, and even if you do not agree with some/all of my points, I hope the above discussion got you thinking and helps you on your “Way” of PhD training or of training others!

 


[1] My impression is that students in biomedical fields are generally more mature in that respect than the typical CS ones. Not sure why – maybe it has to do with the time CS students spend on technical abilities, maybe it’s their nature, maybe it’s the culture – I really don’t know, but that has been my general impression through the years.

Follow up on “Proper upbringing vs negligence of computational students in biomedical research environments”

So it turns out my previous post struck a chord with quite a few readers. Some of them contacted me directly to comment and share their stories. I decided to include three of them here. The first is an update I posted shortly after a senior PI misinterpreted my post. The other two came later and reflect personal experiences which I thought were important to share, both for the PIs out there who think there is no problem and, of course, for the aspiring compbio students/postdocs who read this. I think those stories speak for themselves, and it’s interesting to see the comments and feedback coming from basically all over the world, so this seems to be somewhat of a universal problem in our field. Of note, those stories are not about evil PIs exploiting students (the wrong impression the senior PI from the first update was worried about), but rather various forms of negligence and unsupportive environments, which is what I was actually describing. For obvious reasons, I removed all details that may identify the people involved (and got prior authorization to post these here).

PREVIOUS UPDATE – ORIGINALLY POSTED 1/16/2017:

It seems this post got a lot of views but was also misinterpreted by some who got back to me with legitimate concerns and criticism. Specifically, a senior PI wrote me that they read this as “data generation labs are exploiting the students”. That was never my intention. Let me clarify, and I’ll use Penn’s GCB graduate group to make the point. GCB stands for “Genomics and Computational Biology”. I think the creators of GCB were wise to define it as such. It means GCB caters to a wide range of students who want to get exposed to “real life data/problems”. Some are more into methods development to derive hypotheses (hence “Computational Biology”), others are more into actually generating the data and analyzing it themselves (hence “Genomics”). These are crude distinctions of course, but the point is that not every student is interested in methods development, and not every student requires co-advising. And sometimes a student may need co-advising/collaboration for a specific project/problem and that’s all. As the PI rightfully wrote me, “there is no one size fits all”. Indeed. And students who are becoming experts in a certain field while using/producing Genomic data are not “exploited.” As that PI wrote me: “I’d be better off hiring a good bioinformatician than taking on an untrained grad student who typically needs close supervision and mentorship.” That’s a fair point. My worry, and what sparked this blog in the first place, is with students who want to do more “methods development” at some level and do not get to do that because (a) they haven’t realized that’s what they actually want to do, (b) they did not articulate it (see my suggestions above), or (c) the system/lab they are in does not support it.

PERSONAL STORY 1:

Your recent blog post strikes a deep personal chord with me because, during grad school, I was one of the “computationally oriented students [that were] basically used as in-house bioinformaticians to solve the bioinformatics needs of a data generating lab”.
Before we go on, I should say that my grad advisor is a very nice person, excellent scientist, absolutely looks out for me, and we have a great ongoing relationship. So, this is definitely a classic case of asking the student (me) to “go explore and tell me what you may want to do” “[w]ith all the good intentions.”
I joined a genetics lab with a lot of interest in computational biology but, being a naive undergrad, I did not realize that, although the science was really cool, my advisor would not be able to advise me on the computational aspects of my work. After I started my work this slowly dawned on me, as problems were posed to me and I was asked to “solve” them without being given any starting point or subsequent guidance. This was still my first year and I found it very hard to cope with.
I struggled day and night to find relevant papers & reviews, read them end-to-end, read online tutorials, improve both my programming and analysis skills, and started working on the given problems. Then, I started seeking out other bioinformatics/computational-biology faculty on campus to interact with and attend journal clubs with, and I was also doing my best to identify one of them to be my co-advisor.
But the latter – engaging with other computational faculty – was not easy at all due to complicated politics from all parties involved: my advisor only wanted to ‘collaborate’ and did not want me to be partially subsumed into another faculty member’s group, distracting me from my main work; he also did not have a good experience/relationship with a few of the bioinformatics faculty he wanted to work with, and so he decided to “grow the expertise” in his own lab and liked to tout that he had in-house bioinformatics capabilities.
I survived by working very hard, making hundreds of mistakes, interacting with folks far and wide across the campus, and finding a couple of “shadow” mentors whom I could go to for general guidance when things really were not looking good. Along the way (just like you pointed out), I also managed to mildly enjoy being in a lab that was generating data and interacting closely with experimental biologists, both of which helped me tremendously in my scientific development.
So, in spite of my survival and subsequent success, I couldn’t agree more with your post. Now as a faculty myself, I cannot emphasize enough the value of “advice”, “training”, “guidance”, and “well-rounded professional growth” of my students and I’m committed to “improving the upbringing of our future generation of scientists”.
Thank you for your post. This is a super-important issue and I’m glad you brought it up.
PERSONAL STORY 2:
Your recent post on computational training has touched me deeply.
I have read this almost 50 times and this completely echoes my sentiments.
We all acknowledge the misuse of computational trainees as in-house bioinformaticians, but your post also talks about the “benign form of negligence”, i.e. not knowing what to do with a computational trainee.
I am currently in the same situation, figuring out what to do next. Unfortunately, most people never realize this problem until it is too late.
Thank you again for this post.

Proper upbringing vs negligence of computational students in biomedical research environments

So, today I want to write about a topic I feel strongly about, which is how we raise the next generation of computational biologists.

To start with, I think that in many ways we have made great progress compared to the state of affairs when I started my graduate studies: there is a much better understanding of what it is students should actually know, and there are dedicated courses, books, online material, etc.

I also want to emphasize that I’m not advocating that computational students should not train/work in biomedical environments. Unless what you really want is to do only CS/Math, you may miss out on *a lot* in terms of real-life data/problems (domain-specific data science if you will), how biologists think about problems (quite different, I tell you, and there is a lot to learn there!), or thinking about the next set of problems/challenges to tackle. Not to mention that the cutting-edge biomedical research you get exposed to can be absolutely fascinating, even when no computational problems are involved!

But I’m not here to discuss all that, but rather the not uncommon situation where computationally oriented students are basically used as in-house bioinformaticians to solve the bioinformatics needs of a data generating lab. And sure, I understand it’s not black or white, and there is great value in getting your hands dirty with real data, and it’s important to help each other, be a good citizen, etc. That’s not what I’m talking about. I’m talking about students with computational aspirations who end up doing all the bioinformatics work in the lab because (a) it’s really needed, (b) they can, and (c) they are much cheaper and easier to get than a bioinformatician. Sure, these students may end up on great papers representing great science from great labs. But I argue that’s not enough, and that cannot be an excuse. Why? Because they come to *train* and it’s our responsibility to train them. And if you think that just by making them solve your bioinformatics problems you are giving them proper computational training, you are *wrong*: they will not necessarily develop the technical skills in algorithms, proper coding, data analysis, thinking about computational modeling, and many more things they should be getting. And don’t tell me that the fact that they come out to a market that will snatch them up is enough. Because if they have the proper training they can easily grow, do something else entirely, etc. But if they don’t, then they are much more likely to get stuck at a lower level and not mature into the independent compbio researchers who are sought after in Academia/Industry.

I should also mention a “lighter form” of negligence: when a PI gets a highly computational student but does not necessarily know how to guide her. With all the good intentions, this results in “go explore and tell me what you may want to do.” It sounds great in theory, but the problem is that (a) these students commonly lack a strong biomedical base and (b) even if they are computationally savvy, they don’t know how to actually translate something they hear/read about into a computationally framed problem. They often don’t even know what questions to ask.

Naturally, I meet many researchers through my work, and some PIs acknowledge the problem. I talked to one such senior PI at a meeting last summer who told me: “you are right. They are desperately needed in the labs, we try to make the best of it, but I know it’s not always good for them”. But not all are like that. I had quite a different exchange with another senior PI. During a social event, the conversation drifted to this, and I said it’s a problem we need to deal with. She said it’s totally fine (using the argument above about having job offers). I reiterated our obligation to train them properly computationally and that otherwise we are not doing it right. At which point she said, jokingly, “Well, you are lucky I’m not on your tenure committee.” I could not agree more, and joke or not, I don’t like that kind of humor [1]. Regardless, I see that “everything is fine” answer as representative of an all-too-common approach in biomedical research labs.

So what should we do? There are several things I can think of:

  1. As an institution/graduate program: Make the effort to have computational students advised properly. So if the PI is not up to it or not interested, get a co-advisor [2] and make sure computational skills development is on the student’s to-do list.
  2. As a student:
    1. Same as (1) above regarding skill development and/or co-advising.
    2. Before that, think carefully about which institute/program/lab you want to spend your time in. Think about what it is you actually want, ask questions, shop around. Maybe do research in a lab for a year to get the hang of it and see for yourself before you commit for 5 or so years.
    3. Be proactive – do not just count on your program/mentor/whatever to take good care of you and your interests. Maybe your interests are not theirs, or are not high enough on their priority list, or they are just too busy or do not know any better. We are brought up in a system where we follow what the teachers tell us, get good grades, and constantly look for their approval. PhD students are in a period where they are still training but also transitioning towards independence, the job market, etc. You should still focus on doing a great job, but don’t blindly follow everything else.

The above points also relate to some of my previous posts about finding yourself a good mentor (or Sensei…). At the very least if we all become more aware of this issue I think there is a good chance of improving the upbringing of our future generation of scientists.

UPDATE 1/16/2017:

So, it seems this post got a lot of views but was also misinterpreted by some who got back to me with legitimate concerns and criticism. Specifically, a senior PI wrote me that they read this as “data generation labs are exploiting the students”. That was never my intention. Let me clarify, and I’ll use Penn’s GCB graduate group to make the point. GCB stands for “Genomics and Computational Biology”. I think the creators of GCB were wise to define it as such. It means GCB caters to a wide range of students who want to get exposed to “real life data/problems”. Some are more into methods development to derive hypotheses (hence “Computational Biology”), others are more into actually generating the data and analyzing it themselves (hence “Genomics”). These are crude distinctions of course, but the point is that not every student is interested in methods development, and not every student requires co-advising. And sometimes a student may need co-advising/collaboration for a specific project/problem and that’s all. As the PI rightfully wrote me, “there is no one size fits all”. Indeed. And students who are becoming experts in a certain field while using/producing Genomic data are not “exploited.” As that PI wrote me: “I’d be better off hiring a good bioinformatician than taking on an untrained grad student who typically needs close supervision and mentorship.” That’s a fair point. My worry, and what sparked this blog in the first place, is with students who want to do more “methods development” at some level and do not get to do that because (a) they haven’t realized that’s what they actually want to do, (b) they did not articulate it (see my suggestions above), or (c) the system/lab they are in does not support it.


[1] This reminded me of a joke my father always liked to tell when I was little: Two guys cross each other on the street. The big guy suddenly slaps the little guy out of nowhere. The little guy looks at him intensely and says: “What was that? Was that a joke or something?” To which the big guy replies: “No, I was serious.” “Oh, you’re lucky then,” says the little guy, “because I really don’t like that kind of humor.”

[2] Co-advising is a solution used in GCB [Genomics and Computational Biology] here at Penn. I was fortunate to be co-mentored during parts of my PhD, and it was instrumental during my postdoc years.

 

Training DNN and Training BJJ – What’s the Connection??

Brazilian Jiu Jitsu (BJJ) is a grappling martial art focused on submitting your opponent with chokes and joint locks (arm bar, knee bar, etc.). Deep Neural Networks (DNNs) are a class of machine learning models which you train on data for specific tasks (e.g. object recognition in images). Is there any connection between the two??

In today’s post I want to get back to one of my favorite themes, drawing connections between Martial Arts and research or everyday life. I usually identify some principle/idea in my Martial Arts, then find a parallel in everyday life/research (see for example here). But this time it started with our current efforts to train deep neural networks. Taming the DNN beasts is somewhat of an art, but that’s not the point I want to make today. Instead, I want to focus on what we optimize, how we optimize, and for what. Sure, you can be technical at this point and talk about the details of your SGD, the learning rate, momentum, dropout, Adam and whatnot. But I want to look at it more, well, philosophically: what you really want to achieve is *generalization*, “mastery of the domain” if you like. And how do you go about it? You try to optimize some sort of temporal cost function. I call it temporal because in typical DNN training the sample set is huge and your SGD step is based on your model’s current experience (the sample/mini-batch) and its current state (the model parameters), with the hope that when the model sees something new in the future (a test sample) it will be able to handle it well. The reaction to the current sample is based on the function you set out to optimize, i.e. the cost function. While in some cases the cost function arises naturally, it is generally something we *make up*. Sometimes it’s a reasonable approximation of what we are truly after (think of #mistakes vs., say, cross-entropy or hinge loss), sometimes it’s rather crude (e.g. maximum likelihood of a simplistic probabilistic model for a biological process). Another point is that in typical DNN training we don’t even try to achieve the magical “global optimum”, settling for a “good enough” reduction in test error. And if you try to “rush” your learning, going too fast on your temporal loss function (e.g. too high a learning rate), you do not do as well.
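To make the “temporal cost function” point a bit more concrete, here is a minimal sketch (toy data, a simple logistic model, plain numpy – not a real DNN pipeline, and all numbers are made up for illustration): we run mini-batch SGD on a surrogate loss we chose (cross-entropy), while the thing we actually care about – mistakes on unseen samples – is only measured on the side.

```python
# Minimal sketch: optimize a made-up surrogate (cross-entropy) on mini-batches,
# while the real goal -- few mistakes on unseen data -- is only monitored.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data; the held-out split stands in for "future" samples.
X = rng.normal(size=(2000, 20))
true_w = rng.normal(size=20)
y = (X @ true_w + 0.5 * rng.normal(size=2000) > 0).astype(float)
X_train, y_train, X_test, y_test = X[:1500], y[:1500], X[1500:], y[1500:]

w = np.zeros(20)
lr, batch_size = 0.1, 32   # "rushing" the learning (much larger lr) destabilizes training

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(20):
    idx = rng.permutation(len(X_train))
    for start in range(0, len(idx), batch_size):
        b = idx[start:start + batch_size]                 # the model's "current experience"
        p = sigmoid(X_train[b] @ w)
        grad = X_train[b].T @ (p - y_train[b]) / len(b)   # gradient of the surrogate loss
        w -= lr * grad                                    # SGD step from the current state
    # The surrogate we optimize vs. the quantity we actually care about:
    p_train = sigmoid(X_train @ w)
    xent = -np.mean(y_train * np.log(p_train + 1e-12) +
                    (1 - y_train) * np.log(1 - p_train + 1e-12))
    test_err = np.mean((sigmoid(X_test @ w) > 0.5) != y_test)
    print(f"epoch {epoch:2d}  train cross-entropy {xent:.3f}  test error {test_err:.3f}")
```

In this toy setting you can also see the “rushing” effect mentioned above: crank the learning rate way up and the surrogate loss tends to bounce around instead of decreasing smoothly, and the test error suffers with it.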

And what happens when you train BJJ? Your temporal/local optimization function is winning/losing when you spar with someone. But that’s not your *real* goal. Just like your DNN, you want to achieve mastery of your domain (BJJ). And you may have other goals as well: good health/shape, self-defense, fun, etc. But if you focus too much on the local function (winning) you are going to miss out. How? For one, if all you care about is winning/losing you will not push yourself into hard situations (which may cost you the fight), limiting your exploration and therefore slowing yourself on your way to mastering the vast space of positions/states within the art. If you think of every time you get submitted/choked as merely a (negative) loss, you will likely miss out not just on good lessons, but also on much of the fun. But too much focus on winning/losing can have more subtle effects. Instead of emphasizing good technique and accepting a loss when you fail to execute, you will insist on muscling your way out of bad situations. This eventually leads to injuries which again slow you down, or completely stop you. And the funny thing is it does not need to involve a heroic move/submission or an opponent who goes crazy and breaks your arm. It can be as simple as you exploding out of a bad position for a split second, when your body is already tired after, say, two hours of training with a bunch of opponents who are all better than you. So you (partially) tear your knee’s MCL, have to recover for three months, and have to go to your conference (ISMB 2015, Dublin) on crutches. Now you could argue that’s a good way to build name recognition (“I remember you, you are the guy with the crutches from last year” – I got this at ISMB 2016 in Orlando this year…) but I would strongly advise against it*.

In summary, it turns out that good practices in training DNNs translate to good practices in training BJJ, which can in turn have a great impact on your mood, happiness, and health…. Who knew? So just keep in mind, in your own DNN/BJJ/whatever training, not to let your local optimization function make you lose sight of your true goals….. and good luck!


*When on crutches at a conference, try to, say, make it on time for a talk, or squeeze into a row for an empty seat when you finally arrive late. After the talk, you can stand in line for coffee, then get “here is your coffee, sir”, then realize you can’t do anything with it and have to rely on the kind help of others. The good news is that it can be temporary and hopefully goes away. Being on crutches is a humbling experience, but that’s a topic for another post….

 

 

A window into Research

In my previous post I discussed how scientific meetings/conferences can be viewed as a window into research. Keeping with the light summertime blog post theme, I wanted to present another interpretation of “Window into Research”: an “Arts & Science” project I have been working on over the last year.

Almost a year ago our lab moved to its new location in the newly renovated Richards Building. The building, designed by the famous architect Louis Kahn, is a designated National Historic Landmark and (as we found out) a constant tourist attraction. Because it’s a historic site, we were not allowed to modify the walls – but no one said anything about the windows….. And so, shortly after we moved in, we turned the big windows into another whiteboard for the lab. The colorful result brought further attention (we recently had the dean visit with donors, admiring our new art form…;) and even Michael Schade, the architect responsible for the renovation, was pleasantly surprised by it (I told him that’s what happens if you let people into your plans…;)

At some point I decided to turn the lab’s corner window into my own mini “Arts & Science” project: A view into our lab’s research/daily life via stills of the window. The original idea was to constantly capture the window area across times of the day/week/month/year. While I failed to systematically do that (for now…), I did manage to collect quite a few images through the year which I turned into the clip below. I plan to keep updating this as the story of our lab’s research unfolds….

A few things worth noting:

  1. The theme song was deliberately chosen. First, it’s a nice song by Mark Knopfler. Second, its name, “Sailing to Philadelphia”, matches the lab’s location at UPenn well. But the third reason is maybe the most intriguing one: it revolves around Charles Mason and Jeremiah Dixon. Many people know about their contribution: mapping the Mason-Dixon line between Maryland, Pennsylvania, and Delaware. This line later became famous as the border between the South and the North, thus representing much more than just a line. But what people do not know is that Mason and Dixon were English scientists (astronomers, cartographers) who came to America to help map the then-uncharted territories of the colonies with their new tools/methods. A few centuries after Mason & Dixon, we too are in Philadelphia, mapping uncharted territories – but in Genomics and Genetics….
  2. The clip gives a biased view of our research, as it more frequently depicts the people actually sitting beside the window (Jordi, Anu). Other people have the misfortune of using plain whiteboards…..
  3. Finally, based on the information contained in the clip:
    1. Can you tell what type of data/problems we commonly work on? (hint: molecules like DNA, but cooler, and a process that rhymes with “rhyming”  😉)
    2. Can you guess what our family pet is??

 

And now, for the actual clip… Enjoy!

Making the most out of conferences

I’m finally back from literally going around the globe (Philly->Japan->Singapore-> back the other side to Orlando via San Francisco….). With all the depressing news recently coming in from what seems like everywhere in the world, I thought I’d get back to blogging with something lighter, appropriate for the summertime…

My travels involved a string of meetings (RNA2016, SINGARNA, IRB-SIG16, ISMB2016) which I attended/organized/gave talks at with many of my lab members. Going to meetings is always fun but also a lot of work. As a senior lab member, and even more so as a PI, you are responsible for getting your lab members ready (talk/poster preparation/practice) and then getting them acquainted with subject areas, other labs’ work, and specific people. Here is a funny depiction of this by research-in-progress:


Meeting people with my professor at a conference (and in case you are wondering, no, I’m not as famous as Obama…)

More seriously, meetings serve as a window into research: you see what’s out there, you establish seeds for future collaborations, and maybe most importantly – you also get to step out of your daily life to look again (as if through a window) at your research projects and priorities. I highly recommend preparing for a conference – think ahead about what you want to learn, who you want to meet, and maybe set up these meetings in advance if needed. Another great piece of advice I got a few years ago from my postdoc mentor, Ben Blencowe, is to make sure that every year or so you attend a meeting outside your list of obvious ones – basically, use it to push yourself to learn about new areas. And when you finally come back, make sure to hold a summary meeting highlighting works of interest as well as general trends. This will help not only the poor souls who missed the meeting, but also yourself…

Last but not least – be sure to have some fun. This will help keep you energized and motivated, as meetings can sometimes run long and simply fry your brain…. Here is one way of doing it, which involves a lot of Japanese karaoke + free drinks (seen here are Kristen Lynch and I, working hard to defend the PIs’ honor…)


 

And yes, there are audio/video recordings as well – but you probably don’t want to hear those…. 😉

cheers!

 

 

Where are we going? ML hype in BioMedical research

I haven’t written in a little while and was planning to write something else. But somehow the recent bombardment of news, interviews, and blog posts about the future of ML, AI, and its role in biomedical research led to a growing feeling of discomfort, which in turn led to this blog post. I definitely have much to learn myself about all of these – I find developing ML algorithms for biomedical problems a continuous and humbling process of constant learning. Still, based on my experience I felt the following points should be voiced and could possibly spark a discussion.

I’ll try to keep it brief, so here goes:

  1. I don’t think there is a question about the huge untapped potential in ML/AI in biomedical research. As the old saying goes, we have only seen the tip of the iceberg.
  2. There is huge hype around ML and specifically deep learning (DL). The ML community, like any (scientific) community, gets excited about new things, whether it’s fashion or the promise of progress on difficult problems on which it was stuck.
  3. The boom in deep learning can be attributed to some algorithmic progress (even more so now, with all the gained interest) but mostly to a combination of big data availability (to train on) + computing power (harnessing GPUs etc.).
  4. In the field of bio-medical research, a similar explosive growth is already underway for similar reasons: bio-medical records across the globe are all becoming connected, amenable to searches and model training, with personalized genomic/genetic data becoming cheaper. There are many dangers/issues involved (privacy, data accessibility, common ontologies, etc.) but there is clearly great promise, which brings me to:
  5. We don’t really know where the ML/DL boom will end, in terms of new capabilities, or when. This makes it all the more exciting. Suddenly, it has become fashionable (or maybe “legitimate”) for serious ML researchers to discuss at length the essence of intelligence, creativity, etc. However,
  6. Prominent ML researchers have already pointed out some of the deficiencies or limitations of current DL technology. These include extremely slow learning, which requires vast amounts of data and/or the computing power to store/process/analyze that data [1]. Think how many games of Go the computer had to play, or how many millions of images a DL algorithm needs for training to identify a single concept (e.g. cats). Another issue is that advancement was mostly achieved in problems we understand relatively well, translated into representations of basic “features” which are then fed into the DL algorithm. In Genomics, prominent groups have used DL to achieve new heights in tasks such as predicting hypersensitivity sites or transcription factor binding sites. These are good examples of where we stand: in these works researchers used convolutional networks which, at the end of the day, are like scanning PSSM motifs across a large set of labeled sequences – only now they can scan X motifs in parallel, each spanning Y positions, with many smart tweaks to the learning process like ReLU units, dropout, etc. X and Y in this example are hyperparameters which, like the entire set of PSSMs, are blasted through GPUs to optimize performance (a toy sketch of this appears right after this list). The end result is improvement in those prediction tasks, but I would say these results by themselves (a) are built with the same building blocks as before and (b) are not a “revolution” in themselves. The domain plays a role here: if you can predict 2% better which ad to display, that’s already huge money for Google. In Genomics, pushing up the ROC performance of a PSSM-like model is great, but if we still cannot get it accurate enough for medical applications or translate it into novel biology, then the impact is limited.
  7. In general, we as CS people tend to have a “problem solver” attitude. Think about it: it’s very common to get from a CS person some version of “Tell me what your problem is and I’ll tell you how to solve it”. It may be phrased differently but it will boil down to that. Which is great: I think it reflects the usefulness of CS and our pragmatic/interdisciplinary approach (borrowing from engineering, math, physics – anything to get the job done). It also contributes to the relevance of CS to much of modern life/research. However, this attitude can also come across as arrogant, ignorant, or naive. I’ll give three examples:
    1. I attended a talk by a CS guy at our medical school who advocates for the adoption of functional programming (Scala) and big data tools from industry (e.g. Spark, Kafka) for Genomics. While his intentions were good, he showed a promotional video (probably made to recruit CS students) that came across as “come cure cancer with us”. Some people got really upset by this naive representation of the complex problems involved. Some just left. I think the issues with this example are pretty clear, so I’ll just go on to the next.
    2. I recently read an interview with a prominent ML researcher. The researcher explained that humans are pretty bad at understanding bio-medical/genomics problems and therefore we need “super human intelligence”. So where is the problem with this? I am the first to agree that humans are not good at looking at many numbers and identifying patterns – that’s why we use ML. But “super human intelligence” is a very murky term (for example, see an excellent discussion about it in Neil Lawrence’s recent post [2] and also this post by Luciano Floridi about AI’s future [3]). Especially with respect to DL, this approach takes away from us, humans, both the role of defining the input features and that of understanding the output. This issue becomes evident in the following example.
    3. I was talking recently with a smart, capable, young researcher who visited Penn. He told me how he saw “great promise” in the recent DL in Genomics research since it may be possible to just “put it all into the network and not worry about the feature definition”. Unfortunately for us in CS/ML, ML Expert knowledge ≠ Domain Expert knowledge. We need the latter to guide the former, and in many cases we also want the ML (results, model) to expand/deepen our expert understanding. Well, at least until we get some real “super human intelligence” around here…. 😉 More seriously though, to really make a dent in biomedical research one needs to invest in gaining expert domain knowledge and work closely together with domain experts. For that you need good collaborators/mentors/trainees, a good environment, and a good mindset (maybe more on that in a future post).
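To make the convolutional-networks-as-PSSM-scanning point from (6) concrete, here is a deliberately naive toy sketch (random filters, made-up X and Y, plain numpy – not code from any of the published models mentioned above): a convolutional layer over one-hot encoded DNA is, at heart, sliding X learned, PSSM-like motifs of length Y across the sequence, followed by a ReLU and max-pooling.

```python
# Toy sketch: the core of a "genomics DNN" convolutional layer is a motif scan.
import numpy as np

rng = np.random.default_rng(1)
ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length x 4) one-hot matrix."""
    mat = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        mat[i, ALPHABET.index(base)] = 1.0
    return mat

X_FILTERS, Y_LEN = 8, 6                              # the "hyperparameters" X and Y
filters = rng.normal(size=(X_FILTERS, Y_LEN, 4))     # each filter is a PSSM-like weight matrix

def conv_scan(seq):
    """Slide every filter over the sequence (a PSSM-style scan), apply ReLU, then max-pool."""
    x = one_hot(seq)
    n_pos = len(seq) - Y_LEN + 1
    scores = np.empty((X_FILTERS, n_pos))
    for f in range(X_FILTERS):
        for p in range(n_pos):
            scores[f, p] = np.sum(filters[f] * x[p:p + Y_LEN])   # motif score at this position
    scores = np.maximum(scores, 0.0)                 # ReLU
    return scores.max(axis=1)                        # max over positions: one score per motif

print(conv_scan("ACGTAGCTAGGTACGTTACG"))             # X_FILTERS scores for this toy sequence
```

Real models learn the filters by backpropagation, stack more layers, and add the training tweaks mentioned above, but the basic building block is this same motif scan – which is exactly the point about old building blocks pushed through new hardware.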

In summary, I think statements such as the ones illustrated above may cause all of us, researchers engaging in ML for bio-medical research, more harm (in how we or our field are perceived by other researchers) than good (getting mentioned in the media or maybe even a grant). It’s already exciting and promising as it is, so let’s just be a bit more humble and focus on getting a few more things done….

 

[1] http://inverseprobability.com/2016/03/04/deep-learning-and-uncertainty

[2] http://inverseprobability.com/2016/05/09/machine-learning-futures-6

[3] https://aeon.co/essays/true-ai-is-both-logically-possible-and-utterly-implausible