Research & Development (Neural Machine Translation) Internship in Mumbai at IIT Bombay
Research & Development (Neural Machine Translation)
Start Date
15 Jun - 30 Jun '20
Duration
6 Months
Stipend
3000-8000 /month
Apply By
31 May '20
Hiring for this internship will be conducted online, and the company will offer work from home or deferred joining until the current COVID-19 situation improves.
About IIT Bombay
The Indian Institute of Technology, Bombay (IITB) is one of the fifteen higher institutes of technology in the country set up with the objective of making facilities available for higher education, research, and training in various fields of science and technology. With the same mission and vision, Professor Ganesh Ramakrishnan is gearing up to take rural India a leap ahead. For his outstanding contributions, he received the IBM Faculty Award in 2011. IIT Bombay has also honored Professor Ramakrishnan's work on an "adaptive framework for end-to-end corrections in Indic OCR".
About the internship
Selected intern's day-to-day responsibilities include:

1. Work on data cleaning, pre-processing, and text parsing
2. Write well-designed, testable, efficient code using best software development practices
3. Work on implementing ML models for unsupervised tasks
4. Engage in server handling and API deployment
5. Stay plugged into emerging technologies and industry trends and apply them to operations and activities
6. Develop the next generation of core MT technology to allow our users to communicate across language barriers
Skill(s) required
Software Testing, Machine Learning, Python, Natural Language Processing (NLP), Deep Learning
Who can apply

Only those candidates can apply who:

1. are available for a full-time (in-office) internship

2. can start the internship between 15th Jun'20 and 30th Jun'20

3. are available for a duration of 6 months

4. have relevant skills and interests

Other requirements

1. Expertise in key language technologies including machine translation or natural language processing

2. Proven background in machine learning and deep learning including deep neural networks, sequence-to-sequence models, etc.

3. In-depth knowledge of architectures like Transformers, Encoder-Decoder, LSTMs, RNNs, etc.

4. Hands-on experience with deep learning toolkits including TensorFlow, PyTorch, Keras, etc.

5. Ability to formulate a research problem and to design, experiment with, and implement solutions in Python

6. Excellent spoken and written communication skills

7. Strong dedication and consistency towards long research projects

8. Good to have: experience working with standard MT/NLP toolkits, e.g. Sockeye, OpenNMT, etc.

Perks
Certificate, Letter of recommendation, Informal dress code
Additional Information

This is an in-office internship and will start in June.

The need to translate domain-specific content such as legal documents, technical and non-technical documents, educational materials, and government procedures and services is growing rapidly. Most of the documentation for tools manufactured in Russia is written in Russian. It needs to be translated efficiently into English to be of benefit to the Indian Navy.

This includes automated translation of Standards (GOST), Operating Documents, Repair Technical Documents (RTD), Technical Drawings, Contracts, Supplementary Agreements (SAs), Price Catalogues, Speeches, Minutes of Meetings, etc. These documents are available in formats such as Word, Excel, PDF, PowerPoint, and images. The translation process presently undertaken by RTC is manual.

Manual translation of documents is tedious and time-intensive. Hence, the goal is to build processes and models that lead to enhanced translation tools, enabling large-scale translation of domain-specific content into English. We propose an online framework for translating Russian to English.

Number of openings

