Location: Must be based in Mexico or LatAm

We’re a dynamic team at one of the Best Places to Code companies based in Mexico. Our mission? To create fully-fledged platforms using a wide range of tools and technologies.

Keep reading if you’re passionate about clean, elegant code and love collaborating with experts!

About the role

This position is responsible for solving complex problems in the design, deployment, and continuous optimization of scalable machine learning platforms and production workflows. You will architect and scale our ML systems to support a growing number of machine learning models and an increasing volume of real-time predictions. You will also spearhead our Generative AI initiatives, designing and implementing systems that leverage Large Language Models (LLMs) to translate model predictions into powerful user-facing insights and agents. Your work will have a significant impact on the performance, reliability, and scalability of our machine learning and AI solutions.

The ideal candidate has a proven record of building and managing large-scale ML platforms and leveraging expertise in machine learning, software engineering, Generative AI, and cloud technologies to optimize performance while collaborating effectively across teams.

What you’ll do

• Architect and implement a scalable, high-performance machine learning platform to support model development, deployment, monitoring, and analysis for both predictive and Generative AI models.
• Lead the technical strategy and evaluation for our Generative AI infrastructure. This includes assessing the trade-offs between managed services and self-hosted open-source models, defining our LLM hosting strategy, and validating the end-to-end architectural approach for scalable, reliable AI features.
• Ensure the platform supports a wide range of ML use cases, including real-time prediction serving, batch processing, and model experimentation.
• Design and implement robust LLM orchestration for advanced applications, enabling the integration of our proprietary predictive models with LLMs to power new insights and workflows.
• Optimize system performance and model latency to ensure robust, low-latency inference across distributed systems, with a specific focus on the unique challenges of LLM serving.
• Identify bottlenecks, and evaluate and integrate new technologies and tools.
• Collaborate closely with data scientists to productionize both traditional predictive models and novel Generative AI solutions, focusing on systems that combine proprietary model outputs with LLMs to create actionable insights.
• Contribute to the overall quality of the codebase, ensuring maintainability and best practices.
• Drive ML/DS best practices and give technical recommendations on challenging problems.

Required Qualifications:

• Bachelor's degree in Computer Science, Software Engineering, or a related field.
• 4-5 years of experience building and managing production-level machine learning platforms and infrastructure, with a focus on model deployment, optimization, and scalability.
• Demonstrated ability to improve the performance, reliability, and cost-efficiency of ML systems.
• Strong experience with cloud-based ML infrastructure (AWS, GCP, Azure) and MLOps practices.
• Nice to have: a Master's degree or PhD in Computer Science, Data Science, Mathematics, Statistics, or a related quantitative field, or an equivalent combination of education and experience.

Must-have

• Core Programming & Machine Learning:

  • Proficiency in Python and deep experience with its data science and ML ecosystem (e.g., PyTorch, TensorFlow, scikit-learn, Pandas, NumPy).
  • Hands-on experience with Generative AI frameworks and libraries such as LangChain, LlamaIndex, or Hugging Face Transformers.

• MLOps & Infrastructure:

  • Expertise in building and maintaining MLOps infrastructure, including containerization (Docker), orchestration (Kubernetes), and CI/CD pipelines for both traditional ML models and LLM-based applications.
  • Proven skill in managing cloud resources using Infrastructure as Code (Terraform).

• Cloud Platforms & Services:

  • Extensive hands-on experience with cloud platforms, particularly AWS, including core services (S3, EC2, Lambda) and ML services (SageMaker).
  • Direct experience with, or deep knowledge of, managed Generative AI services such as AWS Bedrock, Amazon Titan, or equivalents (e.g., Google Vertex AI, Azure OpenAI Service).

• Data Systems & Storage:

  • Advanced proficiency in SQL for complex data extraction and transformation.
  • Experience with a variety of data storage solutions, including relational databases, NoSQL databases, and vector databases (e.g., Pinecone, Weaviate, ChromaDB).

The journey:

We know your time is valuable, so the whole process will take about two weeks. There will be four interviews in total: an initial one with Human Capital, a technical interview, one with an Account Manager, and likely a final one with the client. A technical test may also be required.

We will keep you regularly updated about your application, and you can always get in touch to ask about its status or anything else you might want to know. Just have fun! If you are a good match for Scio, we will send you a formal job offer and ask you to submit the pre-hiring requirements within 5 days at most, so being prepared is key.

How to Apply:

If this is the perfect fit for you, send your resume in English to humancapital@sciodev.com. We’ll keep you updated throughout the process.

Feel free to reach out if you have any questions or need further details!