RAPID EAGLE INC
Python Developer
Full Time • Hybrid - US
Benefits:
- Dental insurance
- Health insurance
- Paid time off
Onsite Role
Charlotte, NC
Key Responsibilities
• Build and maintain large-scale data processing pipelines using Apache Spark for batch and streaming data.
• Design and implement ML training and inference workflows using PyTorch and integrate them into production systems.
• Develop and orchestrate ETL and ML pipelines with Apache Airflow, ensuring reliability, scalability, and observability.
• Optimize performance of data pipelines and ML model training on distributed clusters.
• Collaborate with Data Scientists and ML Engineers to productize models and deploy them into production environments.
• Implement best practices for code quality, CI/CD, unit testing, and monitoring.
• Ensure data quality, integrity, and security across all pipelines.
• Troubleshoot performance bottlenecks and optimize resource utilization.
• Stay up to date with advancements in ML frameworks, distributed computing, and workflow orchestration tools.
Required Qualifications
• Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
• 5+ years of professional Python development experience, with strong object-oriented programming and software engineering fundamentals.
• Hands-on experience with PyTorch for model training and inference.
• Deep understanding of Apache Spark for distributed data processing (PySpark or Scala is a plus).
• Strong experience with Apache Airflow for workflow orchestration in production environments.
• Proficiency in SQL and experience working with relational and NoSQL databases.
• Experience with Docker, Kubernetes, and cloud platforms (AWS/GCP/Azure).
• Familiarity with data versioning and ML model lifecycle management (MLflow or similar).
• Strong problem-solving and debugging skills in distributed systems.
Preferred Skills
• Experience with real-time data processing frameworks (Kafka, Flink).
• Knowledge of feature stores, data lake architectures, and Delta Lake.
• Familiarity with MLOps practices (CI/CD for ML, model registry, automated retraining).
• Experience with GPU-accelerated ML training and performance optimization.
• Contributions to open-source ML or data engineering projects.
Flexible work from home options available.
Compensation: $55.00 - $75.00 per hour