Who We Are
Innodata (NASDAQ: INOD) is a leading data engineering company. With more than 2,000 customers and operations in 13 cities around the world, we are the AI technology solutions provider of choice to four of the world’s five biggest technology companies, as well as leading companies across financial services, insurance, technology, law, and medicine.
By combining advanced machine learning and artificial intelligence (ML/AI) technologies, a global workforce of subject matter experts, and a high-security infrastructure, we’re helping bring the promise of clean, optimized digital data to all industries. Innodata offers a powerful combination of digital data solutions and easy-to-use, high-quality platforms.
Our global workforce includes over 3,000 employees in the United States, Canada, the United Kingdom, the Philippines, India, Sri Lanka, Israel, and Germany. We’re poised for a period of explosive growth over the next few years.
About the Role
We’re looking for curious, hands-on interns who want to build real-world GenAI systems — from prototyping LLM pipelines to scaling APIs and data infrastructure. You’ll work across engineering, research, and product teams to turn cutting-edge AI into production-ready solutions.
Key Responsibilities
Prototype LLM + retrieval pipelines with safety and filtering.
Operate knowledge graphs and vector DBs (e.g., Pinecone, Weaviate) and manage embeddings.
Build FastAPI services for search, recommendation (recsys), and memory.
Design resilient systems with caching, retries, and observability.
Run data pipelines for large-scale indexing and embeddings.
Capture personalization signals (search, chat, purchase).
Optimize for low-latency APIs and high-throughput pipelines.
Collaborate with research and product on evaluation and UX.