About
As a Principal Data Scientist with deep expertise in geospatial data and transportation, I thrive at the intersection of complex problem-solving and innovative solutions. Over the past few years, I’ve honed my skills in spatial analysis, data engineering, and machine learning, building robust data pipelines and delivering impactful insights for mobility and transportation systems.
My background spans data science, engineering, and project management. I’ve worked extensively with Python, Kubernetes, Docker, and cloud platforms (AWS, Snowflake), creating scalable solutions that range from vehicle speed prediction models to real-time data processing for transportation systems using GTFS and GPS data.
In addition to my technical capabilities, I am passionate about creating user-centered products that translate complex data into actionable insights. My hands-on experience with tools like Plotly, KeplerGL, and Grafana allows me to build engaging visualizations that drive decision-making.
Work experience
- January 2024 – present
Via
Principal Data Scientist
Advanced in Python for Data Science. Vehicle speed prediction using Machine Learning models. Data pipeline creation involving AWS Lambda, orchestration tools, and different storage options (Snowflake, AWS Redshift, AWS Athena, S3, etc.). Extensive experience working with static GTFS and GTFS RT. GPS noise reduction and conflation to the street network using Valhalla and OSMX. Designed ready-to-deploy Machine Learning models using Docker and MLflow. Trained Machine Learning models using AWS Kubernetes clusters. Bayesian modeling applied to a variety of problems. Creation of insightful visualizations (plots, geographical maps, and dashboards) using different tools (Plotly, Holoviews, Grafana, KeplerGL, Leaflet, etc.). Maintainer of the Python package *gtfs_functions*, with 120 stars on GitHub. Survival analysis applied to finance forecasting. Work daily in the AWS environment, setting up EC2 instances, Lambda functions, ECR, and Elastic Kubernetes Service. Work daily in SQL with Snowflake and Postgres databases.
- December 2020 – present
Passion project
Author of "gtfs_functions" Python package
"gtfs-functions" is a Python package meant to make it easier to work with the GTFS data standard. It allows easy parsing into GeoDataFrames by time period, calculation of frequency by bus stop and by stop-to-stop segment, and calculation of the planned speed by stop-to-stop segment for different times of the day. Because the outputs are GeoDataFrames, they can all easily be plotted on a map.
- March 2021 – December 2023
Via
Data Scientist Associate Principal
Advanced in Python for Data Science. Vehicle speed prediction using Machine Learning models. Data pipeline creation involving AWS Lambda, orchestration tools, and different storage options (Snowflake, AWS Redshift, AWS Athena, S3, etc.). Extensive experience working with static GTFS and GTFS RT. GPS noise reduction and conflation to the street network using Valhalla and OSMX. Designed ready-to-deploy Machine Learning models using Docker and MLflow. Trained Machine Learning models using AWS Kubernetes clusters. Bayesian modeling applied to a variety of problems. Creation of insightful visualizations (plots, geographical maps, and dashboards) using different tools (Plotly, Holoviews, Grafana, KeplerGL, Leaflet, etc.). Maintainer of the Python package *gtfs_functions*, with 120 stars on GitHub. Survival analysis applied to finance forecasting. Work daily in the AWS environment, setting up EC2 instances, Lambda functions, ECR, and Elastic Kubernetes Service. Work daily in SQL with Snowflake and Postgres databases.
- July 2019 – March 2021
Remix
Senior Geospatial Data Engineer
Led prototype and pilot design to demonstrate the value of innovative features and data analyses our users need. Partnered with Product and Engineering teams to solve problems and identify trends and opportunities for new features. Developed the open source Python package *gtfs_functions* to easily parse the GTFS public transit standard into usable GeoDataFrames. Ran transit-related geospatial analyses in Python: passenger load calculation, speeds, and conflation of data to the street network. Developed passenger load analysis: worked with some of the biggest transit agencies in the US on conflating their APC ridership data to GTFS data to visualize and analyze bus occupancy on a map. Designed a predictive modeling experiment: estimated alightings at the stop level from boardings-only datasets. The model reached nearly 80% accuracy. Daily user of Pandas, GeoPandas, Folium, KeplerGL, Plotly, Streamlit, and many other Python libraries.
- July 2017 – March 2021
Remix
Project Manager
Helped launch our EU operations, growing from zero customers to 15 transit agencies in less than two years. Acted as a software expert linking our users with the Product team, identifying business needs and translating them into technical requirements. Led software implementation, training, and support for European transit authorities in Norway, Sweden, Denmark, Finland, Spain, the UK, Luxembourg, and the Netherlands.
- October 2016 – June 2017
AndSoft
IT Project Manager
Worked with customers to co-develop implementation plans and project scope. Led discussions between our customers and engineering team to understand business processes and co-develop requirements and new features. Applied Agile project management concepts to track progress and ensure projects were delivered on time, within scope, and within budget. Acted as an interface, communicating complex technical problems to executive teams and decision makers.
- November 2014 – October 2016
Goal Systems
Senior Business & Optimization Consultant
Implemented bus scheduling optimization software in France, Spain, Argentina, Chile, China, and Brazil. Co-developed extensive Software Requirements Specifications (SRS), interfacing between our customers and our engineering teams. Identified business needs and collaborated with our engineering team in product development.
- May 2013 – May 2014
Aluar
Junior Strategic Planning Analyst
Developed complex techno-economic studies and presented the results to the board of investors.
- February 2012 – July 2012
CRP Henri Tudor
Internship - Master's Thesis
Thesis on Genetic Algorithms for Job Shop Scheduling. The goal of the thesis was to create a genetic algorithm in VBA capable of solving operation planning problems while optimizing the selected criteria. This was my first contact with mathematical programming, and it introduced me to the machine learning (ML) world, which I keep exploring with other languages, mostly R and Python.