- Willem's Fizzy Logic

Why You Need a Sandbox Separate from Your ML Pipeline to Create Successful Experiments

Created 2/27/2023 7:08:58 PM
Filed in Machine Learning

Machine learning operations (MLOps) is a rapidly evolving field that is transforming the way businesses approach data science. One of the most critical aspects of MLOps is the ability to create and test experiments quickly and efficiently. While it may be tempting to use your production ML pipeline as a testing ground, it's essential to have a separate sandbox environment for several reasons.

GitHub Copilot: What you need to know to get started with AI-assisted programming

Created 2/16/2023 8:21:39 AM
Filed in Machine Learning

We’re starting a pilot to test GitHub Copilot at Info Support. This is awesome news for us because we can now learn how much Copilot is going to help us with our code. In the past few months, I’ve been testing GitHub Copilot on my own projects and learned a ton about things like use cases and plugins for various IDEs.

Why a Data Fabric is Not Enough: The Importance of Human Expertise in Data Governance

Created 2/2/2023 3:21:01 PM
Filed in Data

Are you considering implementing a Data Fabric for your organization's big data management needs? Before making a decision, it's important to understand the limitations and challenges of this approach. In this article, we dive into why a Data Fabric is not enough to ensure the quality and accuracy of your data.

Did you know DVC doesn't handle large datasets? Neither did we and here's how we fixed it

Created 1/13/2023 9:37:46 AM
Filed in Machine Learning

We hit a nice roadblock yesterday while working on a dataset with 5 million samples. It took us 8 hours to upload the dataset and 11 hours to download and unpack it. We learned some valuable lessons there that I’ll discuss in this post so you don’t have to run into the same issues.

Building a custom Kubernetes operator in C#

Created 1/7/2023 9:47:12 AM
Filed in .NET

Setting up environments in Kubernetes can be done in quite a few ways. You can use Helm, plain manifests, Kustomize, or custom-built CLI tools. But what are you going to do after you’ve configured the environment? Read this post to find out how I used C# to write a custom operator to manage my MLOps environment.

How to configure memory limits in WSL2

Created 1/5/2023 8:01:11 AM
Filed in Machine Learning

One of the downsides to building machine learning models is that you need quite a bit of power in your machine. I've just upgraded the home desktop that I use for side projects and experiments to 64GB of memory because it was clearly no longer up to the task. However, when I booted up my WSL Linux distro, it couldn't get more than 32GB of memory. Time to figure out how to fix that! In this quick post I'm showing you how you can limit or increase the memory available to your WSL2 distributions. Let's go!

How to build scalable workflows with Prefect and Ray

Created 12/23/2022 7:38:36 AM
Filed in Machine Learning

Over the past few weeks we've been working on a new computer vision project at Aigency. As part of the project we have to process images using OpenCV image filters. This is a rather tedious process that takes a lot of time, so we made a nice workflow for it. In this blog post I'll show you how we used Prefect and Ray to build a scalable image processing workflow.

How-to use Orchard Core as a decoupled CMS for your website

Created 12/4/2022 9:03:40 AM
Filed in Web development

We've been using WordPress as our CMS for the Aigency website for a while. However, it has so many problems that we decided to take a look at Orchard Core as a possible alternative. In this post we'll explore how to set up Orchard Core.

How to set up CUDA and PyTorch in Ubuntu 22.04 on WSL2

Created 11/2/2022 6:24:44 PM
Filed in Machine Learning

Working with machine-learning code in Windows can be quite a hassle. Often I run into problems with my tools not working properly on Windows. Recently, I even decided to convert my laptop to Ubuntu 22.04 for the time being. That is really nice for data sciency stuff, but not great for my PowerPoint and Excel work since LibreOffice sucks big time. Needless to say, I needed another solution, especially for my deep learning setup. In this quick post I'll show you how I configured my Ubuntu 22.04 installation on Windows 11 to run CUDA and PyTorch.

Training machine-learning models with DVC

Created 10/17/2022 8:58:56 AM
Filed in Machine Learning

I've shown how to use DVC to manage data for your machine-learning project on my blog before. Today it's time to take things to the next level. Let's build a pipeline to train a machine-learning model with DVC.

Managing machine-learning datasets with DVC

Created 10/14/2022 9:59:21 AM
Filed in Machine Learning

DVC is a data versioning tool designed to make it easier to work with data in machine-learning projects. It integrates with Git so that you can keep track of your data files and how they change over time. It also lets you share data files between different projects and collaborators. In this post I'll show you how you can start using DVC in your ML project.

How-to clean up your Docker Desktop and reclaim that diskspace

Created 9/19/2022 8:24:31 AM
Filed in Docker

My laptop ran out of space this morning. I noticed some slowness last week, but it stopped working altogether this morning. One of the culprits was a huge VHDX file for Docker Desktop. Here's how to reclaim some space so you can continue enjoying your Windows machine.