Data observability is so hot right now... but do you know what's also hot? Using some tried and tested ingredients like Apache Airflow and PyOD to perform painless anomaly detection on your key business metrics. You don't need to run off and buy an (expensive!) subscription for the latest hot data observability SaaS offering (there is … Continue reading Painless Anomaly Detection with Apache Airflow
Tag: python
Stripe Webhook + GCP Functions Framework (Python)
This took a couple of days of messing around, so I decided to make a post out of it. Here is a minimal enough example repo using Terraform and GCP Functions Framework to build a GCP Python function that will receive a Stripe webhook event, perform signature verification, and then just print the event. You can … Continue reading Stripe Webhook + GCP Functions Framework (Python)
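For a feel of what the signature verification step involves, here is a rough standard-library sketch. It assumes Stripe's documented manual scheme: an HMAC-SHA256 over `{timestamp}.{raw body}`, keyed with the endpoint secret and compared against the `v1` values in the `Stripe-Signature` header. The function name, the tolerance default and the `now` parameter are hypothetical, and the repo in the post may well lean on the official `stripe` library instead.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if sig_header carries a valid v1 signature for payload.

    Hypothetical helper, not part of any Stripe SDK.
    """
    # The header looks like "t=1700000000,v1=<hex>,v1=<hex>".
    timestamp = None
    signatures = []
    for item in sig_header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False

    # Reject events whose timestamp is too far from "now" (replay protection).
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False

    # Recompute the signature over "<timestamp>.<raw body>".
    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()

    # Constant-time comparison against every v1 signature in the header.
    return any(hmac.compare_digest(expected, s) for s in signatures)
```

The important detail is that `payload` has to be the raw request body, byte for byte; re-serialising parsed JSON will usually change the bytes and break the signature.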
Colab to just run some curl
Here is a little Google Colab notebook to just paste in some curl command and get the response back into a JSON dictionary. This can be handy when working with backend and frontend engineers who might be using a different language than you and just send you some curl commands that you want to explore … Continue reading Colab to just run some curl
Explaining kmeans clustering for unsupervised anomaly detection
Here is a video I did in work explaining how our anomaly detection works. https://www.youtube.com/watch?v=L1xleckyuDQ https://www.netdata.cloud/blog/how-netdatas-machine-learning-works
Hugging Face Text Classification Quickstart
I have been working a bit lately with some text classification stuff using Hugging Face - it's great 'n' all but their docs can actually be a bit overwhelming. So here is a minimal text classification example, using Hugging Face and either PyTorch or TensorFlow (you decide). Will try to update and maintain the Colab here: … Continue reading Hugging Face Text Classification Quickstart
Airflow “Trigger Dags” Python Script
You have some DAG that runs multiple times a day but you need to do a manual backfill of the last 30 days. It's 2022 and this is still surprisingly painful with Airflow. The "new" REST API helps and means all the building blocks are there but, as I found out today, there can often still … Continue reading Airflow “Trigger Dags” Python Script
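To make the backfill idea a bit more concrete, here is a minimal sketch of the building blocks. It assumes the Airflow 2 stable REST API endpoint `POST /api/v1/dags/{dag_id}/dagRuns` with a `logical_date` field in the body (Airflow 2.2+), and it assumes one run per day, which you would need to adjust for a DAG that runs several times a day. The function name, base URL, DAG id and run id format are all hypothetical, and this is not the script from the post.

```python
from datetime import datetime, timedelta, timezone


def backfill_requests(base_url, dag_id, end, days=30):
    """Build (url, json_payload) pairs, one per day, for the `days` before `end`.

    Hypothetical helper: it only prepares the requests and sends nothing.
    """
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    prepared = []
    # Oldest day first, so runs are triggered in chronological order.
    for offset in range(days, 0, -1):
        logical_date = (end - timedelta(days=offset)).isoformat()
        payload = {
            # Run ids must be unique per DAG; this naming scheme is an assumption.
            "dag_run_id": f"manual_backfill__{logical_date}",
            "logical_date": logical_date,
        }
        prepared.append((url, payload))
    return prepared


# Example values only; point these at your own deployment.
example = backfill_requests(
    "http://localhost:8080",
    "my_dag",
    end=datetime(2022, 1, 31, tzinfo=timezone.utc),
)
```

Each pair could then be sent with something like `requests.post(url, json=payload, auth=(user, password))`, assuming basic auth is enabled on the API; how you authenticate depends on your deployment.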
Time series anomaly detection using PCA
Here is a little recipe for using good old PCA to do some fast and efficient time series anomaly detection.
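One common way to do this, sketched below with plain NumPy, is to slice the series into overlapping lagged windows, fit PCA on those windows, and score each window by how badly the top components reconstruct it. This is an illustrative sketch under those assumptions rather than the recipe from the post; the function name, window length and number of components are hypothetical choices.

```python
import numpy as np


def pca_anomaly_scores(series, window=10, n_components=2):
    """Score each length-`window` slice by its PCA reconstruction error.

    Higher scores mean the slice is poorly explained by the top components.
    """
    x = np.asarray(series, dtype=float)
    # Each row is one overlapping window of the series (a lagged embedding).
    X = np.stack([x[i:i + window] for i in range(len(x) - window + 1)])
    X_centered = X - X.mean(axis=0)
    # PCA via SVD: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    components = vt[:n_components]
    # Project onto the top components and map back to window space.
    reconstructed = X_centered @ components.T @ components
    # Squared reconstruction error per window is the anomaly score.
    return np.sum((X_centered - reconstructed) ** 2, axis=1)


# Toy data: a clean sine wave with a single injected spike.
signal = np.sin(np.linspace(0, 20 * np.pi, 500))
signal[300] += 5.0
scores = pca_anomaly_scores(signal)
```

On this toy signal the windows that contain the spike get by far the largest scores. On real data you would still need to pick a threshold, for example a high percentile of the scores seen during training.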
streamlit multi-page app minimal example
too obvious? maybe. probably. Recently i had a need to assess streamlit for some internal DS/ML/Data apps i wanted to build in my job. By "i had a need" i mean i heard it was the new cool thing so i wanted to play with it and feel better about myself. Anyway, as part of … Continue reading streamlit multi-page app minimal example
Some asyncio fun/pain
Taken from this great Talk Python Training course - get the lifetime bundle if you can! You have a list of API endpoints you want to pull data from and collect results into some results list or dataframe for further processing. You could just loop over that list and make a load of requests.get() calls … Continue reading Some asyncio fun/pain
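The asyncio alternative to that loop looks roughly like the sketch below. To keep it self-contained, the `fetch` coroutine is a hypothetical stand-in that just sleeps instead of making a real HTTP call; in practice you would swap in an async HTTP client such as aiohttp or httpx. It is not the code from the course or the post.

```python
import asyncio


async def fetch(endpoint: str, delay: float = 0.1) -> dict:
    """Hypothetical stand-in for an async HTTP GET; sleeps instead of calling out."""
    await asyncio.sleep(delay)  # simulates waiting on network I/O
    return {"endpoint": endpoint, "status": 200}


async def fetch_all(endpoints):
    """Run all fetches concurrently and return results in input order."""
    # gather() schedules every coroutine at once and preserves argument order.
    return await asyncio.gather(*(fetch(e) for e in endpoints))


# Example endpoints are made up for illustration.
results = asyncio.run(fetch_all(["/users", "/orders", "/invoices"]))
```

Because the waits overlap, the total time is roughly that of the slowest call rather than the sum of all of them, which is the whole appeal over a plain loop of blocking calls. Note that `asyncio.run()` will complain inside an already running event loop (a notebook, for example), where you would `await fetch_all(...)` directly instead.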
Anomaly Detection using the Matrix Profile
I like an excuse to play with fancy things, so when i first learned about the Matrix Profile for time series analysis, particularly around anomaly detection, i was intrigued. When i learned there was a nice python package (STUMPY) i could just pip install i was outright excited, as one thing i like more than … Continue reading Anomaly Detection using the Matrix Profile