Bangalore, Karnataka, India
Information Technology
Full-Time
Encardio Rite
Overview
Description: As a Data Scientist at Encardio, you will analyze complex time-series data from devices such as accelerometers, strain gauges, and tilt meters. Your responsibilities will span data preprocessing, feature engineering, machine learning model development, and integration with real-time systems. You'll collaborate closely with engineers and domain experts to translate physical behaviours into actionable insights. This role is ideal for someone with strong statistical skills, experience in time-series modeling, and a desire to understand the real-world impact of their models in civil and industrial monitoring.
Responsibilities
- Sensor Data Understanding & Preprocessing
  - Clean, denoise, and preprocess high-frequency time-series data from edge devices.
  - Handle missing, corrupted, or delayed telemetry from IoT sources.
  - Develop domain knowledge of physical sensors and their behaviour (e.g., vibration patterns, strain profiles).
- Exploratory & Statistical Analysis
  - Perform statistical and exploratory data analysis (EDA) on structured/unstructured sensor data.
  - Identify anomalies, patterns, and correlations in multi-sensor environments.
- Feature Engineering
  - Generate meaningful time-domain and frequency-domain features (e.g., FFT, wavelets).
  - Implement scalable feature extraction pipelines.
- Model Development
  - Build and validate ML models for:
    - Anomaly detection (e.g., vibration spikes)
    - Event classification (e.g., tilt angle breaches)
    - Predictive maintenance (e.g., time-to-failure)
  - Leverage traditional ML, deep learning, and LLMs.
- Deployment & Integration
  - Work with Data Engineers to integrate models into real-time data pipelines and edge/cloud platforms.
  - Package and containerize models (e.g., with Docker) for scalable deployment.
- Monitoring & Feedback
  - Track model performance post-deployment and retrain/update as needed.
  - Design feedback loops using human-in-the-loop or rule-based corrections.
- Collaboration & Communication
  - Collaborate with hardware, firmware, and data engineering teams.
  - Translate physical phenomena into data problems and insights.
  - Document approaches, models, and assumptions for reproducibility.
Deliverables
- Reusable preprocessing and feature extraction modules for sensor data.
- Accurate and explainable ML models for anomaly/event detection.
- Model deployment artifacts (Docker images, APIs) for cloud or edge execution.
- Jupyter notebooks and Streamlit dashboards for diagnostics, visualization, and insight generation.
- Model monitoring reports and performance metrics with retraining pipelines.
- Domain-specific data dictionaries and technical knowledge bases.
- Contribution to internal documentation and research discussions.
- A deep, documented understanding of sensor behavior and characteristics.
Languages & Libraries
- Python (NumPy, Pandas, SciPy, Scikit-learn, PyTorch/TensorFlow)
- Bash (for data ops & batch jobs)
- FFT, DWT, STFT (via SciPy, Librosa, tsfresh)
- Time-series modeling (sktime, statsmodels, Prophet)
- Scikit-learn (traditional ML)
- PyTorch / TensorFlow / Keras (deep learning)
- XGBoost / LightGBM (tabular modeling)
- Jupyter, Matplotlib, Seaborn, Plotly, Grafana (for dashboards)
- Docker (for containerizing ML models)
- FastAPI / Flask (for ML inference APIs)
- GitHub Actions (CI/CD for models)
- ONNX / TorchScript (for lightweight deployment)
- Kafka (real-time data ingestion)
- S3 (model/data storage)
- Trino / Athena (querying raw and processed data)
- Argo Workflows / Airflow (model training pipelines)
- Prometheus / Grafana (model & system monitoring)