| Job ID: | #4968 |
|---|---|
| Title: | Artificial Intelligence Data Processing Specialist – Language Model Development |
| Location: | Iqaluit, Nunavut, Canada |
| Benefits: | Relocation Assistance, Travel and Accommodation |
| Work location: | Hybrid remote |
| Job type: | Independent Contractor |
| Estimated Pay: | CA$60.00 - CA$90.00 Per Hour |
| Experience required: | 3 years |
| Minimum Education: | Bachelor |
Location: Nunavut; hybrid remote and on-site, or remote within Canada
Employment Type: Full-time, contract, or term
Estimated Hourly Rate: $60 to $90 CAD
Position Overview
The Artificial Intelligence Data Processing Specialist supports the development of advanced multilingual artificial intelligence systems, including natural language processing (NLP), speech recognition, and large language models (LLMs). This role focuses on preparing high-quality text and audio datasets used for training, fine-tuning, and validating AI and machine learning models, with a strong emphasis on accuracy, consistency, and traceability throughout the data lifecycle.
Key Responsibilities
Multilingual AI Data Preparation
- Extract and curate multilingual text and audio data from approved and authorized sources
- Prepare datasets for AI, machine learning, NLP, and language model training and evaluation
- Ensure data handling aligns with defined AI data governance standards and documentation requirements
Translation Alignment and Validation
- Align translation pairs across multiple languages for machine translation and language model training
- Validate linguistic accuracy, semantic consistency, and correct language pair mapping
- Confirm dataset integrity for downstream AI and NLP workflows
Audio and Speech Data Processing
- Segment, clean, and normalize audio files for speech recognition and language model training
- Ensure audio quality meets technical specifications for ASR, TTS, and speech-based AI systems
- Support creation of aligned audio-text datasets for multimodal AI models
Dataset Formatting and Quality Control
- Format datasets to required schemas and structures for AI pipelines and training environments
- Perform quality assurance checks to confirm dataset completeness, usability, and accuracy
- Identify, document, and resolve data issues, anomalies, or inconsistencies
Tracking, Documentation, and Reporting
- Maintain detailed records of AI data sources, dataset volumes, and processing status
- Track progress across data processing stages to support reproducibility and auditability
- Produce clear technical documentation and reports for project oversight and compliance
Qualifications and Experience
- Post-secondary education in Artificial Intelligence, Data Science, Computer Science, Linguistics, or a related field; or equivalent hands-on experience
- Experience working with multilingual text corpora and/or speech datasets for AI or machine learning
- Strong attention to detail and commitment to high-quality AI training data
- Ability to follow structured workflows, technical standards, and documentation requirements
Technical Skills
- Experience handling large-scale text and audio datasets for AI or NLP projects
- Familiarity with data cleaning, annotation, segmentation, alignment, or labeling tools
- Basic scripting or data processing skills using Python, shell scripting, or similar tools
- Exposure to machine learning, natural language processing, speech recognition, or language model development
Assets (Nice to Have)
- Knowledge of language technologies such as NLP pipelines, ASR, TTS, or machine translation
- Experience working with Indigenous, minority, or low-resource languages in AI contexts
- Background in research, regulated, or highly controlled data environments
Why Work in Nunavut
- Meaningful impact: your AI work supports culturally grounded services and community wellbeing
- Culture first: Inuit knowledge, language, and lived experience inform how technology is built and used
- Unique experience: collaborate across remote communities in one of Canada’s most culturally rich regions
- Flexibility: contract, term, or full-time options with on-site, hybrid, or remote work models
- Purpose-driven AI: contribute to reconciliation and culturally safe, responsible technology