[PhD] Virtual machine for debugging high performance computing

The RMoD team at Inria Lille has an open position for a PhD student.

I. Context

Language virtual machines (such as the Java virtual machine) are deployed everywhere. They can be found on mobile phones, in web browsers, in server applications and on embedded devices. They implement many features and allow applications to be packaged once and deployed on many different platforms. They also provide debugging facilities for the programs they run.

In high performance computing, by contrast, a single program is deployed on many different machines with different hardware (CPUs, GPUs, TPUs, FPGAs, ...). Tracking down bugs in this context is tough: the machines communicate asynchronously, and the order of execution is nondeterministic because of effects such as latency. Today, debugging this kind of problem is close to impossible.

The RMoD team at Inria Lille has skills in virtual machine implementation and application debugging. We built a virtual machine with one of the most powerful debugging engines. So far, however, our virtual machine has no support for vector instructions, which are essential for high performance computing.

II. Problem & research directions

How can a language virtual machine be extended so that bugs occurring in high performance computing programs running on a heterogeneous cluster of machines can be tracked down easily and efficiently?

The PhD will be split into three phases:

  1. How can vector instructions be supported in a language virtual machine with full debugging support? The student will focus on Intel SIMD instructions (AVX, AVX-512) and will most likely design a debuggable DSL on top of the virtual machine using those instructions.
  2. How can vector instructions be integrated with the ecosystem (frameworks, development tools, virtual machine)? The student will need to implement a library or framework (for example a matrix library) to show that the DSL implemented in phase 1 is usable in practice. The development tools will need some enhancements to support vector instructions. On the VM side, several problems arise: How to align vector objects? How to deal with 256/512 bits of raw data on the stack? How to support vector instructions efficiently in JIT compilation?
  3. Can we efficiently debug high performance computing programs by running part of the execution on our virtual machine while keeping decent performance?

For this last phase, the student will need to adapt an existing high performance computing framework so that it compiles to our virtual machine, and build the debugging toolchain needed to debug the part of the program that runs on it. Execution will be slower than normal, but it has to stay practical: when hunting a bug, a one-hour computation may stretch to a night of computation, but not to months.
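
As a small illustration of the alignment question raised in phase 2, the sketch below computes the padding needed so that the payload of a vector object starts on a 32-byte (AVX) or 64-byte (AVX-512) boundary. It is written in Python for brevity and is not taken from the RMoD virtual machine: the function names, the 8-byte object header and the example addresses are assumptions made for illustration only.

```python
# Hypothetical sketch: aligning the payload of a heap object for vector loads.
# Nothing here reflects the actual object layout of the RMoD virtual machine.

AVX_BYTES = 32      # 256-bit registers
AVX512_BYTES = 64   # 512-bit registers


def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


def padding_for(address: int, header_size: int, alignment: int) -> int:
    """Bytes to skip before the object so that its payload
    (the bytes after the header) starts on an aligned address."""
    payload = address + header_size
    return align_up(payload, alignment) - payload


if __name__ == "__main__":
    # Assumed 8-byte header, allocation pointer at 0x1008.
    print(padding_for(0x1008, 8, AVX_BYTES))
    print(padding_for(0x1008, 8, AVX512_BYTES))
```

A real allocator would also have to keep this alignment when the garbage collector moves objects, which is part of what makes the question non-trivial.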

III. Application

To apply, please send us:

  • a CV,
  • a copy of your Master's diploma,
  • a copy of your Master's thesis,
  • two reference letters, with the contact details of the referees.

The application materials should be sent by email to C. Béra <bera.clement@inria.fr> and S. Ducasse <stephane.ducasse@inria.fr>.

The email subject must start with: [PhD-RMoD-VM-2018].

Posted by admin on 21 February 2018, 4:16 pm