Tiger Analytics is looking for a skilled and innovative Machine Learning Engineer with hands-on experience in Google Cloud Platform (GCP) and Vertex AI to design, build, and deploy scalable ML solutions. You will play a key role in operationalizing machine learning models and driving the end-to-end ML lifecycle, from data ingestion to model serving and monitoring.
Key Responsibilities:
- Develop, train, and optimize ML models using Vertex AI, including Vertex AI Pipelines, AutoML, and custom model training.
- Design and build scalable ML pipelines for feature engineering, training, evaluation, and deployment.
- Deploy models to production using Vertex AI endpoints and integrate them with downstream applications or APIs.
- Collaborate with data scientists, data engineers, and MLOps teams to enable reproducible and reliable ML workflows.
- Monitor model performance and set up alerting, retraining triggers, and drift detection mechanisms.
- Use GCP services such as BigQuery, Dataflow, Cloud Functions, Pub/Sub, and GCS in ML workflows.
- Apply CI/CD principles to ML models using Vertex AI Pipelines, Cloud Build, and GitOps practices.
- Implement model governance, versioning, explainability, and security best practices within Vertex AI.
- Document architecture decisions, workflows, and the model lifecycle clearly for internal stakeholders.
Advanced Generative AI
- Advanced RAG, including graph-based hybrid retrieval
- Multimodal agents
- Deep knowledge of agentic frameworks such as ADK and LangChain
- Fine-tuning and distillation
Python Expertise
- Expert in Python with strong OOP and functional programming skills
- Proficient in ML/DL libraries: TensorFlow, PyTorch, scikit-learn, pandas, NumPy, PySpark
- Experience with production-grade code, testing, and performance optimization
GCP Cloud Architecture & Services
- Proficiency in GCP services such as:
  - Vertex AI
  - BigQuery
  - Cloud Storage
  - Cloud Run
  - Cloud Functions
  - Pub/Sub
  - Dataproc
  - Dataflow
- Understanding of IAM and VPC
API Development & Integration
- Design and build RESTful APIs using FastAPI or Flask
- Integrate ML models into APIs for real-time inference
- Implement authentication, logging, and performance optimization
System Design & Scalability
- Design end-to-end AI systems with scalability and fault tolerance in mind
- Hands-on experience developing distributed systems, microservices, and asynchronous processing
This position offers an excellent opportunity for significant career development in a fast-growing and challenging entrepreneurial environment with a high degree of individual responsibility.