Making the most out of conferences

I’m finally back from literally going around the globe (Philly -> Japan -> Singapore -> back the other side to Orlando via San Francisco…). With all the depressing news recently coming in from what seems like everywhere in the world, I thought I’d get back to blogging with something lighter, appropriate for the summertime…

My travels involved a string of meetings (RNA2016, SINGARNA, IRB-SIG16, ISMB2016) which I attended/organized/gave talks at with many of my lab members. Going to meetings is always fun but also a lot of work. As a senior lab member, and even more so as a PI, you are responsible for getting your lab members ready (talk/poster preparation/practice) and then getting them acquainted with subject areas, other labs’ work, and specific people. Here is a funny depiction of this by research-in-progress:

PhD student supervises a master’s student

Meeting people with my professor at a conference (and in case you are wondering, no, I’m not as famous as Obama…)

More seriously, meetings serve as a window into research: you see what’s out there, you establish seeds for future collaborations, and maybe most importantly – you also get to step out of your daily life to look again (as if through a window) at your research projects and priorities. I highly recommend preparing for a conference: think ahead about what you want to learn, whom you want to meet, and maybe set up these meetings in advance if needed. Another great piece of advice I got a few years ago from my postdoc mentor, Ben Blencowe, is to make sure that every year or so you attend a meeting outside your list of obvious ones – basically, use it to push yourself to learn about new areas. And when you finally come back, make sure to have a summary meeting which highlights works of interest as well as general trends. This will help not only the poor souls that missed the meeting, but also yourself…

Last but not least – be sure to have some fun. This will help keep you energized and motivated, as meetings can sometimes run long and simply fry your brain…. Here is one way of doing it, which involves a lot of Japanese karaoke + free drinks (seen here are Kristen Lynch and I, working hard to defend the PI honor…)



And yes, there are audio/video recordings as well – but you probably don’t want to hear those…. 😉

cheers!



Where are we going? ML hype in BioMedical research

I haven’t written in a little while and was planning to write something else. But somehow the recent bombardment of news, interviews, and blog posts about the future of ML and AI and their role in biomedical research led to a growing feeling of discomfort, which in turn led to this blog post. I definitely have much to learn myself about all of these – I find developing ML algorithms for biomedical problems a continuous and humbling process of constant learning. Still, based on my experience I felt the following points should be voiced, and possibly spark a discussion.

I’ll try to keep it brief, so here goes:

  1. I don’t think there is a question about the huge untapped potential in ML/AI in biomedical research. As the old saying goes, we have only seen the tip of the iceberg.
  2. There is a huge hype around ML, and specifically deep learning (DL). The ML community, like any (scientific) community, gets excited about new things, whether it’s fashion or the great promise of advancing on difficult problems on which it was stuck.
  3. The boom in deep learning can be attributed to some algorithmic progress (even more so now, with all the gained interest) but mostly to a combination of big data availability (to train on) + computing power (harnessing GPUs etc.).
  4. In the field of bio-medical research, a similar explosive growth is already underway for similar reasons: bio-medical records across the globe are becoming connected, amenable to searches and to training models, with personalized genomic/genetic data becoming cheaper. There are many dangers/issues involved (privacy, data accessibility, common ontologies, etc.) but there is clearly great promise. Which brings me to:
  5. We don’t really know where the ML/DL boom will end, in terms of new capabilities, or when. This makes it all the more exciting. Suddenly, it became fashionable (or maybe “legitimate”) for serious ML researchers to discuss at length the essence of intelligence, creativity etc. However,
  6. Prominent ML researchers have already pointed out some of the deficiencies or limitations of current DL technology. These include an extremely slow learning rate, which requires vast amounts of data and/or computing power to store/process/analyze [1]. Think how many games of Go the computer had to play, or how many millions of images a DL algorithm needs for training to identify a single concept (e.g. cats). Another issue is that advancement was mostly achieved in problems for which we have a relatively good understanding, translated into representations of basic “features” which are then fed into the DL algorithm. In Genomics, prominent groups used DL to achieve new heights in tasks such as predicting hypersensitivity sites or transcription factor binding sites. These are good examples of where we stand: in these works researchers used convolutional networks which, at the end of the day, are like scanning PSSM motifs across a large set of labeled sequences. But now they can scan X motifs in parallel, with Y positions each, and with many smart tweaks to the learning process like ReLU units, dropout, etc. X and Y in this example would be hyperparameters which, like the entire set of PSSMs, would be blasted through GPUs to optimize performance. The end result is improvement in those prediction tasks, but I would say these results by themselves (a) are built with the same building blocks as before and (b) do not constitute a “revolution”. The domain plays a role here: if you can predict 2% better which ad to display, that’s already huge money for Google. In Genomics, pushing up the ROC performance of a PSSM so it’s more accurate is great, but if we still cannot get it to be accurate enough for medical applications, or translate it into novel biology, then the impact is limited.
  7. In general, we as CS people tend to have a “problem solver” attitude. Think of it: it’s very common to get from a CS person some version of “Tell me what your problem is and I’ll tell you how to solve it”. It may be phrased differently but it will boil down to that. Which is great: I think it reflects the usefulness of CS and our pragmatic/interdisciplinary approach (borrowing from engineering, math, physics – anything to get the job done). It also contributes to the relevance of CS to much of modern life/research. However, this attitude can also come across as arrogant, ignorant, or naive. I’ll give three examples:
    1. I attended a talk by a CS guy at our medical school who advocates for the adoption of functional programming (Scala) and big data tools from industry (e.g. Spark, Kafka) for Genomics. While his intentions were good, he showed a promotional video (probably meant to recruit CS students) that came across as “come cure cancer with us”. Some people got really upset by this naive representation of the complex problems involved. Some just left. I think the issues with this example are pretty clear, so I’ll just move on to the next.
    2. I recently read an interview with a prominent ML researcher. The researcher explained that humans are pretty bad at understanding bio-medical/genomics problems and that therefore we need “super human intelligence”. So where is the problem with this? I am the first to agree that humans are not good at looking at many numbers and identifying patterns – that’s why we use ML. But “super human intelligence” is a very murky term (for example, see an excellent discussion of it in Neil Lawrence’s recent post [2] and also this post by Luciano Floridi about the future of AI [3]). Especially with respect to DL, this approach takes away from us, humans, both the role of defining the input features and that of understanding the output. This issue becomes evident in the following example.
    3. I was talking recently with a smart, capable, young researcher who visited Penn. He told me how he saw “great promise” in the recent DL-in-Genomics research, since it may be possible to just “put it all into the network and not worry about feature definition”. Unfortunately for us in CS/ML, ML expert knowledge ≠ domain expert knowledge. We need the latter to guide the former, and in many cases we also want the ML (results, model) to expand/deepen our expert understanding. Well, at least until we get some real “super human intelligence” around here…. 😉 More seriously though, to really make a dent in biomedical research one needs to invest in gaining expert domain knowledge and to work closely with domain experts. For that you need good collaborators/mentors/trainees, a good environment, and a good mindset (maybe more on that in a future post).
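The PSSM analogy from point 6 above can be made concrete with a few lines of code: scoring every window of a one-hot-encoded DNA sequence against a score matrix is exactly the operation a 1D convolutional filter performs. The following is my own toy sketch (the motif and scores are made up, and this is not any published model):

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) one-hot matrix."""
    m = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        m[i, BASES.index(b)] = 1.0
    return m

def scan_pssm(seq, pssm):
    """Slide a PSSM (motif_len x 4 score matrix) across the sequence
    and return the score of every window -- the same computation a 1D
    convolutional filter performs on one-hot-encoded DNA."""
    x = one_hot(seq)
    k = pssm.shape[0]
    return np.array([np.sum(x[i:i + k] * pssm)
                     for i in range(len(seq) - k + 1)])

# A toy PSSM rewarding the motif "ACGT" (one row per motif position,
# columns ordered A, C, G, T).
pssm = np.array([[1., 0., 0., 0.],
                 [0., 1., 0., 0.],
                 [0., 0., 1., 0.],
                 [0., 0., 0., 1.]])

scores = scan_pssm("TTACGTTT", pssm)
best = int(np.argmax(scores))  # the motif starts at position 2
```

A DL model simply learns many such matrices (the filters) jointly, instead of taking them from a curated motif database, and stacks non-linearities on top.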

In summary, I think statements like the ones illustrated above may cause all of us, researchers engaging in ML for bio-medical research, more harm (in how we or our field are perceived by other researchers) than good (getting mentioned in the media or maybe even landing a grant). It’s already exciting and promising as it is, so let’s just be a bit more humble and focus on getting a few more things done….


[1] http://inverseprobability.com/2016/03/04/deep-learning-and-uncertainty

[2] http://inverseprobability.com/2016/05/09/machine-learning-futures-6

[3] https://aeon.co/essays/true-ai-is-both-logically-possible-and-utterly-implausible

Should I Fire Her?

Some time ago I was sitting over dinner with a couple of other PIs when one of them told the following story:

A student was doing a project in the lab as part of a Master’s degree. Before joining the lab for the project, the student discussed with the PI which courses to take in order to make good progress in both courses and research. Specifically, they agreed that a specific course the student was considering would add too much load, and so the student should avoid it. As time passed the student was falling behind, with the project getting stuck. The PI asked the student time and time again whether everything was OK, what was going on, etc. Only toward the end of the semester did the student admit to taking that course, hiding it, and then failing to manage the courses and the research. The student still wanted to keep doing research in the lab, though. The PI was left with a dilemma: fire the student, yes or no?

Of course, the description above is missing many details that could influence your decision: How badly was the project delayed? What were the ramifications of that? Why did the student insist on taking that course? What would be the consequences of “firing” that student from the research project? And so on. Notice I tried to remove any mention of gender, field, or rank to minimize the effect of confounding factors. Still, I think it’s a good base for discussion.

The opinions at dinner were split. The PI seemed to lean towards allowing the student to stay. The PI was also the only person who actually knew the student. My view leaned towards letting the student go. I explained that I am all for training and teaching our students, and I spend a lot of thought and effort on that, trying to make them grow. But I make it clear to anyone who joins that integrity and trust are never to be doubted or meddled with. I explain to new lab members that while we all hate to F** up, we need to come clean about it, as by far our most valuable asset as scientists is our integrity. Where I grew up (i.e. in the army) you literally trusted someone with your life, and therefore doubting them was not an option. Soldiers were continuously put into extreme situations, and those who lacked integrity were quickly kicked out. I think I have grown more patient and accommodating over the years (you meet a lot of people in both the army and academia, and in both environments you need to learn to work with them….) but I guess the low tolerance for lapses of integrity stuck. Another point I raised is that it’s important to make the student understand they do not live in a vacuum and that their actions have consequences. In many cases young students lack awareness of the system around them; they just assume it’s there to serve them. But that system is made of people. If a project gets stuck, maybe a paper deadline won’t be met, maybe a grant won’t be renewed. It’s not just about them or, in this case, their courses. A more senior PI brought up another argument to support firing the student: he/she explained that by kicking that student out the PI would be doing the student a favor, giving the student a valuable life lesson for cheap about not meddling with trust/integrity in a work environment. Another PI was more forgiving: young students, he/she explained, can easily find themselves tangled up beyond what they planned, but still deserve a second chance.

I’m not sure this is a clear cut case, and we are definitely missing details (e.g. maybe the student was over-motivated but with good intentions, not a slacker; what exactly was the student told beforehand; etc.), but I thought it made a good base for a worthy discussion in a post. Some points that came to my mind when thinking about this story were:

  1. Make it very clear right from the start what the rules/expectations are, so that if a student does something like that they also understand how it is viewed and what they can expect. (One way to do this, btw, is to write a blog post about it… 😉 )
  2. Try to see the whole story from all sides before making a final decision. That goes for both the PI and the student.
  3. Even if you decide to keep the student, you need to send a very clear signal to the whole lab about this issue. Otherwise you are facing a very slippery slope which would be very hard to get off of. This is not about theory and ideals: there is much more than just this student/project at stake for you too.
  4. Make your decision for the right reasons. The point here is that the most consistent advice I got when I was interviewing for a tenure track position was: do *not* compromise on the people you take into the lab. As a young PI you badly want people to get things done. Since you just started, maybe students/postdocs do not line up in front of your door. So you interview someone and think “I can make it work”. No you can’t. And you are likely to spend a lot of time/energy/money learning that. This does not mean you should wait for an Olympic champion, because that’s not likely to come your way in the beginning either. But you need to find someone who you strongly feel can be a good fit, not someone you take against your gut feeling. So, keeping to the same line of logic, if you decide to keep a student like the one in the story, then do it because of the reasons listed above and not because you badly want people and think you can “make it work”…

So, what would you do??

Do Not Seek Your Goal

You do not seek your goal, you encounter it.

In today’s post I want to get back to connecting principles from the martial arts to everyday life and research. The above quote is from a Brazilian Jiu Jitsu (BJJ) seminar by Ryron Gracie which I attended a while ago. Ryron attributed it to someone attending his seminar in Spain.

The point Ryron was trying to make is that at high levels of BJJ you should not look for a submission, trying to “get it”. Instead, you just observe and seize the opportunity when it arises. I thought it was a great point, something to strive for in my training. I also thought it was a manifestation of a much older and more general principle in the martial arts by which “technique will occur in the absence of conscious thought”. This quote (Te wa ku ni to sunawachi hairu) is from the Kempo Hakku, a passage in the Bubishi [1]. I relate this principle to ultimately achieving the mental state of Mushin. Mushin, or mushin no shin, is a Zen term translated into English as “no mind”, or “the mind without mind”. The term refers to a state of “no-mindness” where, in combat but in everyday life as well, your mind is free from thoughts. You do not think about what the next move should be – it occurs naturally, without hesitation and without disturbance from any thoughts/anger/fear/ego. In modern studies of combat you can find descriptions of high-level performance on “autopilot”, where different parts of the brain “take control”, with high levels of awareness and heart rates of 175 bpm [2]. But what does all that have to do with our work as researchers? Ah, I’m glad you asked.

In research, and life in general, we also need to keep our mind, eyes, and ears open in all directions. Then serendipity will come knocking. Instead of being married to an idea, trying to “get it”, you “listen to what the data tells you” (see a previous post on that here). Moreover, you are likely to start working towards idea A, encounter B, and end up doing C [3]. But that’s OK – because you were never married to idea A, and you knew to let it go, noticing C when you “encountered it”. You will not “encounter it”, though, if you are too busy seeking your goal, being married to A or bogged down by B. A good example from our work is how we ended up developing MAJIQ, described here.

Seemingly, this “mindless” approach is almost contradictory to the rational way we are brought up to approach problems, especially in the life sciences – formulate a hypothesis, test it, etc. There is no real contradiction though: I would say that the “classical” formulation of the scientific method is the local tool that moves you forward in your research, allowing you to test/verify things, while the “no mind” approach is the more general rule by which you strive to live, fight, or direct your science.

Possibly related to that, Francis Crick said [4]:

“…It is amateurs who have one big bright beautiful idea that they can never abandon. Professionals know that they have to produce theory after theory before they are likely to hit the jackpot. The very process of abandoning one theory for another gives them a degree of critical detachment that is almost essential if they are to succeed”

To be fair though, I don’t think Crick was thinking about Mushin, and the above quote has to do more with a specific confounding factor in Scientific research – our egos. But that’s maybe something for another post.

[1] https://en.wikipedia.org/wiki/Wubei_Zhi
[2] On Combat: The Psychology and Physiology of Deadly Conflict in War and in Peace, Lt. Col. Dave Grossman, 2004
[3] Related to that: see Uri Alon’s excellent talk about getting lost in the research “cloud” https://www.youtube.com/watch?v=RVoz_pEeV8I
[4] Crick F., What Mad Pursuit: A Personal View of Scientific Discovery, 1988, p. 142, Basic Books, NY

MAJIQ and The Naming of Things

In an earlier post I discussed how we might benefit from learning a term or concept we never had from other cultures/fields, and then implementing it in our everyday life. I also suggested a way we can actually hope to achieve that: importing the (native) term into our own vocabulary, then actively looking for opportunities to observe/implement it. The examples I gave came from Hindi, living in Canada, and my army service. In this post I want to expand on the idea of “the naming of things”, connecting it to ideas in linguistics and to our recent research work.

In linguistics, the Sapir–Whorf hypothesis [1] states that language determines, or at least influences, our thought. This idea was followed up in many works, from the study of cultures to programming languages. Franz Boas, for example, popularized the idea that the Eskimo languages have a rich vocabulary describing the many possible forms of snow. Kenneth Iverson, the Turing Award winner and developer of APL (the historical ancestor of Matlab and other math-operation-based coding languages), argued for this idea in the context of coding languages.

I know this all sounds theoretical and ancient, but I was thinking about Sapir–Whorf when we ran into some surprising findings during our research. Our lab is interested in RNA processing and post-transcriptional regulation. Historically, the study of RNA splicing variations has focused on two main approaches: studying whole transcripts or quantifying alternative splicing (AS) “events”. The latter have been identified in model systems and categorized into basic subtypes such as a cassette exon (including/skipping an exon), intron retention, and alternative 3’/5′ splice sites. The common wisdom, supported by high-throughput studies since 2008, was that these are the most common forms of splicing variations. And so the field’s “vocabulary” was set, and subsequent works either studied whole transcripts or these AS “events”.

When we started working on MAJIQ (Modeling Alternative Junction Inclusion Quantification) 2.5 years ago, we just wanted to better quantify alternative splicing “events” from RNA-Seq. But working on MAJIQ led us to define LSVs, or local splicing variations. LSVs can be thought of as splits in a gene’s splice graph coming from, or going into, a single exon (hence “local”). Here is a simple illustration of such LSVs on a splice graph:

[Illustration: LSVs on a splice graph]

Besides the intuitive definition, one nice thing about LSVs is that while they are graph based, they correspond directly to the biology (i.e. which RNA segments the spliceosome should splice together at any given point) as well as to direct experimental evidence (the junction-spanning reads). The second nice thing about LSVs is that, unlike full transcripts, they can be directly inferred from short RNA-Seq reads, yet still offer a much more expressive language than previously defined AS “events”. Specifically, previously defined AS “types” appear as special cases of binary LSVs, but much more complex variations can be captured. Here is a simple illustration of this:

[Illustration: binary vs. complex LSVs]

Now that we suddenly had this richer language at hand, we could actually study the full spectrum of LSVs – identify, quantify, and visualize them. We realized that complex, non-binary variations made up over a third of the variations in mouse and human and were highly enriched in regulated variations. In our recent paper published in eLife we started characterizing the spectrum of LSVs and how they are relevant to gene regulation, development, and disease. But this is really just the tip of the iceberg. We hope that with the tools we created (MAJIQ and the matching visualization package VOILA), the full effect of LSVs will be discovered by the greater scientific community. Or, to put it in linguistic terms: now that we have the ability to name those variations, we can bring them to the focus of our attention.
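For the computationally inclined, the “splits from a single exon” definition of an LSV can be sketched in a few lines. This is only my toy illustration of the concept (not MAJIQ’s actual data model or algorithm, and the exon numbers are made up): group splice-graph junctions by their source exon, and any exon with two or more outgoing junctions defines a single-source LSV. Single-target LSVs, grouping junctions by target exon, would be symmetric.

```python
from collections import defaultdict

def single_source_lsvs(junctions):
    """Group splice-graph junctions (src_exon, tgt_exon) by source exon;
    any exon with two or more outgoing junctions defines a single-source
    LSV. A conceptual sketch only."""
    by_source = defaultdict(list)
    for src, tgt in junctions:
        by_source[src].append(tgt)
    return {src: sorted(tgts) for src, tgts in by_source.items()
            if len(tgts) >= 2}

def is_binary(lsv_targets):
    """Classical AS 'events' correspond to binary LSVs (two junctions)."""
    return len(lsv_targets) == 2

# Hypothetical gene with exons 1..4: exon 1 can splice to exon 2, 3,
# or 4 (a complex, non-binary LSV); exon 2 splices only to exon 3.
junctions = [(1, 2), (1, 3), (1, 4), (2, 3)]
lsvs = single_source_lsvs(junctions)  # exon 1 defines the only LSV here
```

In this toy gene, the classical “vocabulary” would have to force exon 1’s three-way split into separate binary events, which is exactly the expressiveness gap the post describes.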

[1] https://en.wikipedia.org/wiki/Linguistic_relativity

“Nobody Understands Quantum Mechanics”

To start the new year with a smile I thought to follow up on my previous post about importing concepts from other languages/cultures with a short, light story.

When I was an undergrad in Physics and Computer Science at the Hebrew University, our TA gave us a short story as we learned about quantum physics. The humorous story, dealing with our trouble in truly grasping it, was written by a famous Israeli novelist named Etgar Keret, based on a quote from Feynman. While the pains of quantum physics are famously universal, Keret’s humor in that story is distinctly Israeli. A few years ago I decided to make a free translation of the story, aiming to maintain its spirit/humor for non-Israeli readers. I’m not sure how well it turned out, but since I really like the story I thought I’d post it here for posterity…

“I think I can safely say that nobody understands quantum mechanics.” ~Richard Feynman
(A free translation of a short story by Etgar Keret)

“It’s your family name”, the psychiatrist told Quantum. “You know, `Mechanics`, everybody sort of expects something simple that they can relate to, like a ball falling off a high building or a cannonball shooting through the sky”.

– “I can’t do much about that”, said Quantum.
– “Well then”, said the psychiatrist, “Maybe try to explain yourself better?”

What great advice that was. Quantum started by trying to make amends with his old friend from grad school, Albert. “Perfect timing,” he thought as he walked over to Albert’s house, “Yom Kippur* is coming up”.

– “I’m not home,” shouted Einstein from behind a closed door, “go play with your dice”.
– “But it is Yom Kippur” tried Quantum.
– “Not where I’m at,” came the answer, and Quantum knew better than to argue with Albert about relative points of view. Then, on the way back home, people kept stepping on him in the subway. That’s what you get for being so small.

But forget the size – no matter what he says, it seems people just don’t get him. He can go unnoticed in everyday life, but then he says something innocent like “wow, did you see that cat?” and right away there are news flashes that he’s making provocations again, and reporters rush to interview Schrodinger about it. The media has been awful in portraying him. It started when he was once interviewed in Science and said that the observer affects the observed event. All the journalists immediately thought he was talking politics, trying to avoid an objective discussion about the Middle East, the economy, or what not.

Probably worst of all, most people think Quantum is heartless, that he has no feelings, but that’s just not true. On Friday, after a documentary on Hiroshima, he was on the expert panel. And he couldn’t even speak. He just sat in front of the open mic and cried, and all the viewers at home, who don’t really know Quantum, couldn’t see that he was crying. They just thought he was avoiding the question. And the sad thing about it all is that even if Quantum writes dozens of letters to the editors of all the scientific journals in the world and proves beyond any doubt that in the whole atomic bomb affair he was just being used, and that he never thought it would end this way, it wouldn’t help him. Because nobody really understands Quantum Mechanics. Least of all the physicists. Ask Feynman.

 
*Yom Kippur: the Day of Atonement, the holiest day of the year for the Jewish people.

The Naming of Things

In my previous post I asked how we can teach ourselves, our kids, or our students the mindset associated with jugaad (aka Fruglavation) or “Knitting” to get the job done. With the hype that seems to be building around jugaad, I’ve noticed various entities offering courses around it, etc. Maybe they are great. But the solution I wanted to offer here is much simpler.

It starts with the naming of things. That’s why I actually don’t like “Fruglavation”, which is a mouthful and seems to take away the cultural context instead of celebrating it. Once you become aware of a concept and have a (nice, short) name for it, you can start consciously looking for instances where it can be applied. This requires sustained deliberation, which can eventually become a habit. It does not have to be filled with effort or frustration. I know this all sounds very theoretical, so I’ll give two examples.

When we moved to Canada we found many things that we really liked in Canadian society. One of those was the emphasis Canadians put on “team work”. From early kindergarten the term was drilled into the younglings in every activity they had. It even came with a nice sing-along tune (“What’s gonna work? Team work..”*). So, we started using the term (and the sing-along) whenever it was relevant to a situation we encountered with our kids or when operating as a family. I don’t know if that’s solid proof of concept (n=3) but I think it helped make us/our kids more collaborative and accommodating.

The second example has to do with a term I learned at the advanced driving school lessons I was forced to take** when we moved to Canada. You might be wondering about my sanity at this point, but the classes I had to take emphasized the benefits to society of “courteous driving”. They gave specific examples, including movie clips, etc. to illustrate this point. Now this may all sound trivial to some of you but, putting it mildly, I would say anyone coming from Israel will tell you “courteous driving” is not part of the curriculum there. I liked it so much that I happily incorporated it into my everyday driving practice: accommodating other drivers changing lanes, joining from side streets, etc. I can’t say I’m always patient and accommodating (and maybe I’m just getting old.. 😉 ) but I think it helped me become a better sharer of the road.

More generally, I think there is a lot to be learned by looking at other cultures or disciplines and bringing the lessons back to our own practice/domain. Many of the lessons/insights I find useful for everyday life come from the martial arts (as the title of my blog alludes). But it all starts with conscious awareness and the naming of things. I recently found an example of that in our lab’s research; hopefully I’ll get around to writing about it in another post soon. In the meantime – Happy New 2016!

*Based on the theme song of “Wonder Pets”, an American Emmy-award-winning TV series for young kids.

** Well, “forced” may be a bit of an exaggeration, but let’s say the insurance rates highly motivate you to take those classes when you arrive.