What is it like to be an NLP research engineer at Textkernel?
19 January 2017 Blog Kim Pieschel
Tarek Mehrez started working at Textkernel in November 2016 as an NLP research engineer. Originally from Cairo, Egypt, he completed his Master’s in natural language processing in Germany and did internships at Amazon and Sony. In this interview he tells us a little more about his daily tasks and his career as a research engineer at Textkernel.
By Tarek Mehrez
How did you find out about Textkernel and how did you apply or get in touch with us?
I literally googled NLP companies in Europe, came across Textkernel’s career website and applied.
What does a typical day in your team look like? Do you usually know in advance what you will be working on or do you have a lot of ad-hoc tasks?
Kind of both. Mainly it’s predefined tasks that we plan together as a team; each team member then gets some tasks to work on for a two-week sprint. At the end of the sprint we evaluate the team’s performance and plan a new sprint. However, you also get some spontaneous daily tasks that you have to work on.
What technologies are your favourites and why?
Definitely Python. It has tons of open-source libraries for machine learning and natural language processing that are pretty handy. This makes all of our wild ideas easier to test and implement. I think it’s really important to be able to iterate fast enough, especially in machine learning when it’s all about trying a million different ways before getting the results you need.
Why did you choose to be a research engineer in industry and not in academia?
Academia has its benefits: you can pursue whatever you are interested in. But it’s also kind of a bubble where you get to do experiments in a controlled environment. Usually, the real world is slightly different, and getting nice research ideas to work at industry scale is far more challenging. This is where things get much more interesting. Textkernel is trying to bridge the gap between the two. The company has a great research philosophy: we can still try out new ideas and work on interesting engineering problems as well.
What is the biggest challenge when it comes to machines understanding CVs?
As humans, we unintentionally bring everything we know about the world to reading a CV, which makes it easier for us to deal with job titles or company names that we see for the first time. That’s not the case for machines: their entire awareness is limited to the few thousand CVs they were trained on, plus a few knowledge bases. It’s therefore really hard to give our algorithms enough real-world knowledge to handle all sorts of cases autonomously. I would say coming up with a good parser is not that hard, but making it perfect given the numerous special cases that we encounter, for all the languages we support, is the challenging part.
What do you like the most about your job? What’s your favourite task?
Teaching computers how to understand CVs :). Developing machine learning models that mimic how our brain reads and understands documents is the most fun part of it. If anyone is interested in cognition or AI, this is one of the most intriguing tasks you can work on.
What’s your least favourite task?
Dealing with really old parts of the code base and trying to understand why a colleague wrote the code that way 15 years ago, or doing tasks that you are not too familiar with. Sometimes things can get really interesting then.
What did you want to be when you grew up? Is your current job your dream job?
Growing up, I didn’t really have a dream job, but I developed a passion for software engineering and machine learning when I was 17, and I have been working on it ever since.
If you received your current salary anyway, what kind of job would you do? The same one or something completely different?
If travelling the world doing some freelance photography isn’t an option, then definitely working as a machine learning engineer would be my thing!