Visualizing code as a syntax tree is both fun and useful, as seen in impressive applications such as building SQL lineage, which helps to understand complex business queries. Abstract syntax trees are not only widely used in industry but also remain a subject of top academic research [1, 2]. This post demonstrates how to work …
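To give a flavour of the idea, here is a minimal sketch of lineage extracted from a syntax tree. It uses Python's standard `ast` module on a one-line Python statement rather than SQL, so it is an illustration of the principle and not the tooling from the post; the helper names `names_read` and `names_written` are hypothetical.

```python
import ast

# A tiny snippet to analyse; the variable names are made up for illustration.
source = "total = price * (1 + tax_rate)"
tree = ast.parse(source)


def names_read(tree: ast.AST) -> list[str]:
    """Return the sorted names whose values are read (Load context)."""
    return sorted(
        {
            node.id
            for node in ast.walk(tree)
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
        }
    )


def names_written(tree: ast.AST) -> list[str]:
    """Return the sorted names that are assigned to (Store context)."""
    return sorted(
        {
            node.id
            for node in ast.walk(tree)
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
        }
    )


# A crude "lineage": which inputs feed which outputs.
print(names_read(tree), "->", names_written(tree))
```

Walking the tree instead of matching text with regular expressions means that parentheses, whitespace and operator order do not affect the result.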
ML Prototyping Environment on Cloud
How to prototype with multiple ML libraries in the cloud? It is best to build on top of a rich pre-configured environment such as the Kaggle image, extending it with a local virtual environment.
Efficient Pre-Commit Hooks with GitHub Actions
How to run pre-commit checks on CI/CD effectively?
Free and robust Tweets extraction
As anticipated by many, Twitter stopped offering its (limited!) API for free [1]. Now, what options do you have to programmatically access the public content for free? In this context, it is worth mentioning the library snscrape, a tool (well-maintained as of now) for extracting the content from social media services such as Facebook, Instagram or …
Monitoring Azure Experiments
Azure Cloud is a popular work environment for many data scientists, yet many features remain poorly documented. This note shows how to monitor Azure experiments in a more convenient and detailed way than through the web or command-line interface. The trick is to create a dashboard of experiments and their respective runs, up to a desired …
Repo Passwords in Poetry
Poetry, a popular Python package manager, prefers to use keyring to manage passwords for private code repositories. Storing passwords in plain text is a secondary option, but may be needed in case of issues either with Poetry itself or with the keyring configuration (it may not be properly installed, may be locked, etc.). To disable the use of …
Marking Python Tests as Optional
Code tests often need to run only on demand rather than as part of CI/CD: for instance, they may be slow or work only in a local mode with protected data. This note shows how to declare code tests optional in pytest, the leading testing framework for Python. The article is inspired by the …
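One common way to achieve this, sketched here under assumptions and not necessarily the approach taken in the post, is a `conftest.py` that skips marked tests unless a command-line flag is given. The marker name `optional` and the flag `--run-optional` are hypothetical names chosen for illustration; the hooks themselves (`pytest_addoption`, `pytest_configure`, `pytest_collection_modifyitems`) are standard pytest hooks.

```python
# conftest.py
import pytest


def pytest_addoption(parser):
    # Register a hypothetical opt-in flag; it is off by default.
    parser.addoption(
        "--run-optional",
        action="store_true",
        default=False,
        help="also run tests marked as optional",
    )


def pytest_configure(config):
    # Declare the custom marker so pytest does not warn about it.
    config.addinivalue_line(
        "markers", "optional: run only when --run-optional is given"
    )


def pytest_collection_modifyitems(config, items):
    # With the flag set, leave the collected tests untouched.
    if config.getoption("--run-optional"):
        return
    # Otherwise attach a skip marker to every test marked as optional.
    skip_optional = pytest.mark.skip(reason="needs --run-optional to run")
    for item in items:
        if "optional" in item.keywords:
            item.add_marker(skip_optional)
```

With this in place, a test decorated with `@pytest.mark.optional` is reported as skipped by a plain `pytest` call and runs only under `pytest --run-optional`.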
Debug CI/CD with SSH
What to do when CircleCI reports are not informative enough on errors? Debug the failing environment live with SSH!
Robust Azure ETLs with Python
Microsoft Azure faces criticism for being poorly explained, but remains a popular cloud computing platform for many companies. How can data engineers build robust extract-transform-load processes on top of it using Python?
Recruiters nowadays use online timed tests when screening developers. I recently looked at the Python & Algorithms Hard questions at TestDome. While the timing and hints seem to push towards implementing tricks from scratch, for long-term quality it is better to structure the problem and use established solutions (divide & conquer). The battery …