Tech Articles

Zapier automation for LinkedIn posts

Automation 'for the little guy' will have a big impact

26 Jun 2022

I recently configured a bit of personal automation for myself, and if you arrived here via my share on LinkedIn, you've already seen it in action. I wanted to share how easy it was to configure with Zapier and why I am excited that access to software automation is becoming democratized.

What does automation get you?

As someone with more than a few absent-minded-professor tendencies, I knew that manually posting to LinkedIn to announce each new post would quickly fall apart. While things like avoiding manual errors and not having to remember each time are welcome benefits, I always like to emphasize my biggest reason for automating:

The less something costs to create, the more it will be produced.

While few may disagree with the statement in theory, the temptation to forget it in practice is very real. I admit I kick the can too often myself (the current loose ends in this site's repo provide a case in point), but when each productive iteration requires an arduous process that is left unchecked, a feeling of dread gets coupled to the act of creating. It can become so internalized that it is shrugged off as a "given" cost of producing whatever could come next.

Without process improvements this can quickly give rise to a lethargic creative-depression: new ideas increasingly fall under the shadow of 'too much of a bother'. Ideas carrying the potency to be big breakthroughs are ironically rejected for their potential to 'open a Pandora's box'. I have seen firsthand how much this can hollow out a software development team's productivity, but much has already been written about the impact of technical debt in that setting and the standstills and hidden inefficiencies it can usher in if not taken seriously. I do not dismiss that debate so much as want to stay focused on something much closer to my heart of late---enabling automation for the individual.

How automation usually goes

So I started the process as many others with experience in the software automation world would:

  1. Review the API documentation to ensure the data I require for my automation is available via public endpoint(s) in the first place.
  2. Search for an existing API-client that wraps things like authentication and endpoint calls, ideally in one of my preferred programming languages.
  3. Import and use the existing API-client if it is available OR write a minimal implementation of one myself.

Step 1 was easy enough. LinkedIn does indeed provide an API path enabling automatic sharing, including my need to share a url.

Step 2 is where things got messy. First I found a python-linkedin lib that looked promising, until I noticed the last update was in 2015, a year before Microsoft acquired LinkedIn. No thanks.

Next I found the more current linkedin-api. It initially seemed promising until I looked closer for a way to create a share and could not find one. This library leans on an alternative service called Voyager for its endpoints, and it looks like the share endpoint has not been exposed in the Python lib yet. I initially tried to add the feature myself, but after going well down that path (forking the repo, configuring a dev build, and setting up an account in the .env file so the tests would pass, which make requests to the live service with a timer delay to prevent throttling, resulting in minutes-long test runs), I started to question the approach. After looking at the code, I had doubts about how much time I would really save by spending so much effort adding a fairly simple feature to a Python project screaming for a refactor. So I went to sleep that night thinking I would be writing a minimal Python client-wrapper in a utils directory of my site's repository, to be called from a new post-step in my Jenkinsfile after a successful build-deploy from the 'main' branch...
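For the curious, the minimal wrapper I had in mind would have looked something like the sketch below. I never finished or tested this route, and the endpoint and payload shape are based on my reading of LinkedIn's v2 share documentation, so treat every name here as illustrative rather than authoritative:

import requests

UGC_POSTS_URL = 'https://api.linkedin.com/v2/ugcPosts'

def share_article(author_urn, article_url, text, access_token):
    """Create a LinkedIn share pointing at article_url (untested sketch)."""
    payload = {
        'author': author_urn,  # e.g. 'urn:li:person:<member-id>'
        'lifecycleState': 'PUBLISHED',
        'specificContent': {
            'com.linkedin.ugc.ShareContent': {
                'shareCommentary': {'text': text},
                'shareMediaCategory': 'ARTICLE',
                'media': [{'status': 'READY', 'originalUrl': article_url}],
            }
        },
        'visibility': {'com.linkedin.ugc.MemberNetworkVisibility': 'PUBLIC'},
    }
    headers = {
        'Authorization': f'Bearer {access_token}',
        'X-Restli-Protocol-Version': '2.0.0',
    }
    response = requests.post(UGC_POSTS_URL, json=payload, headers=headers)
    response.raise_for_status()
    return response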

So I think it is safe to say my experience demonstrates that even when you know what you are doing, building and configuring automation gets complicated quickly. That fact is a key to understanding a problem I have been increasingly concerned with, but that a platform I just discovered named Zapier is already fixing.

Discovering Zapier

It felt fitting that I should taste some of the discouragement that keeps automation prohibitive for most. Perhaps the frustration was the necessary motivation for getting myself out of the professional custom-software-automator box enough to find Zapier, which I think is easy enough for someone with no programming background to use themselves.

Zapier allows you to create what they call "Zaps" that you configure for each unit of automation. I can also share my Zap for others to refer to and copy themselves if they have the same use-case.

My Zap is about as simple as you can make one. The trigger for activating it is something that will routinely poll my site's RSS feed. If it finds that new content has been created recently, it will run the second 'LinkedIn' step, which posts a share with a link back to my new post.

Punching this all in via a GUI-based configuration means you don't need to know how to read or write code to set one up yourself. Additionally, Zapier made it pretty easy to test my LinkedIn step while my Zap was in draft status by pulling the latest record from my RSS feed as test data. This saved me from the awkwardness of having to finalize my configuration and generate new content just to confirm the non-trigger portion is working.

While Zapier does have paid tiers as needs scale, most individuals will be fine starting under the free tier, and its ease of use means Zapier's automation-as-a-service has value to individuals and small operations that typically cannot afford to contract a coder for traditional custom automation.

Automation Democratization? It's about time...

...and not just about saving people more of it. It's about time software automation tools became widely available. I'll repeat the axiom I opened with here: the less something costs to create, the more it will be produced.

If you share my interest in economics, you may have already guessed at the inspiration for this axiom: "if you want more of something, subsidize it" and the inverse, "if you want less of something, tax it". You may be familiar with a similar price-control phenomenon, where effective price-floors tend towards surpluses, and shortages tend to follow price-caps. Here the inversion is no less true: the more something costs to create, the less it will be produced.

Running most simple automations is usually not costly; the true cost comes from the complexity of coding, configuring, and maintaining the code and infrastructure that keep them running. Yet corporations with software needs still value automation enough to hire specialized engineers to build it, because they have correctly identified its worth.

But how well do those outside the tech world understand the value of automation?

When I moved to Austin I began meeting other independent artists and talked to a few owners of small businesses in the creative industry around town. Contrasting their use of automation with my years building it in the corporate tech industry, it felt like tech corporations had sailed into a new era on some automation-laden super-yacht while the rest of the world was left behind with only oars and paddles. It helped me realize how much of an opportunity there is for more widespread automation, and I regretted not being aware of a platform like Zapier sooner. What is the cost of not having this automation? What have we been missing out on?

In talking to independent artists, I realized how many are spending hours each week on at least partially-automatable tasks. For example (a rough sketch for one of these follows the list):

  • posting and scheduling the same content across multiple platforms
  • downscaling and watermarking images and video
  • image-searching their artwork to protect against theft and misuse
  • turning away spam accounts and requests for free art
  • providing a streamlined commission-request system for potential clients
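
To make "partially-automatable" concrete, here is a rough sketch of what the downscale-and-watermark chore could look like with the Pillow library. The filenames, sizes, and corner placement are hypothetical; the point is how little code the repetitive part of the task actually needs:

from pathlib import Path
from PIL import Image  # Pillow

def downscale_and_watermark(src, watermark_png, dest_dir, max_size=(1280, 1280)):
    """Shrink an image to fit max_size and stamp a watermark in the lower-right corner."""
    img = Image.open(src).convert('RGBA')
    img.thumbnail(max_size)  # resizes in place, keeping the aspect ratio
    mark = Image.open(watermark_png).convert('RGBA')
    corner = (img.width - mark.width - 10, img.height - mark.height - 10)
    img.alpha_composite(mark, corner)  # overlay the (smaller) watermark image
    out_path = (Path(dest_dir) / Path(src).name).with_suffix('.jpg')
    img.convert('RGB').save(out_path, 'JPEG', quality=90)
    return out_path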

That's potentially hundreds of hours a year that each artist is spending on not producing art. And this does not just apply to independent artists, but to any single-person or small business that relies on an online operation. I went through the initial trouble of automating my site's build, deployment, and LinkedIn notification steps because I don't want some exciting idea I have for a post to be accompanied by repetitive barriers to its production, just as I don't want an artist or music producer whose content I enjoy to face similar headwinds.

So I hope I have encouraged you to check out Zapier and other free-tier automation platforms, or at least to help spread the word to those who can benefit. Access to automation extends a power that can go beyond simply saving time. I am excited to see what new things will be created, now that a long-neglected hurdle has been lowered so nearly anyone can leap over it.

How I run my Jenkins cloud instance

Jenkins + helm template + kubectl apply -k

20 Jun 2022

I have a few automation projects nearing the top of my to-do list (there is a lot of repetitive work to do when you post and sell art online... yikes!), so I recently dusted off my repo to turn my personal Jenkins instance back on. In updating it, I was reminded that it was a bit of a process settling on a fully-declarative solution I liked, so I wanted to make a post about what I learned in case anyone finds themselves walking down the same path.

Updating to the cloud

My Jenkins experience goes back to the non-containerized era, when I was serving on a team that managed a company-wide "old-school" Jenkins instance. When our machines neared capacity, our team would need to request a new VM, set it up manually, and ensure the proper dependencies were on the machine. Moreover, the monolithic Jenkins would routinely be in "plugin hell": so many plugins were installed on the same Jenkins instance that we needed lengthy changelog review and testing processes before we could risk updating plugins, never mind updating the version of Jenkins itself, which would perpetually spew warnings like "deprecated" and "not secure".

Knowing containers provided a code-managed way out of some of the traditional maintenance headaches, I used an opportunity I had while supporting a smaller independent team to roll out a Google Cloud-friendly instance running on Kubernetes, which quickly had me settling on Helm.

I'll gloss over some of the additional details because this YouTube video on 'The Jenkins Journey' will probably give you a better idea of how a Jenkins setup grows into needing a solution like Helm as its usage and needs expand.

Helm is nice, but...

Helm, being a tool that writes and deploys Kubernetes configuration for you, did make it easier to get Jenkins off the ground without having to know much Kubernetes, but customization quickly became a new bottleneck.

My existing experience configuring Jenkins started to make using Helm feel like putting the training wheels back on after already learning to ride a bike. I could usually get it to do what I wanted eventually, but it was arduous to keep solving the "how do I do this through the Helm config?" puzzle each time I needed to configure something new in Jenkins.

Moreover, what is Helm actually doing with the Kubernetes config? Since the k8s code Helm ultimately applies doesn't exist in the repository, there is no meaningful review process for Jenkins configuration changes unless another team member already knows what Helm will spit out. And wasn't a big reason I started down this path in the first place to get all configuration into my versioned codebase? While the configuration is indirectly expressed in Helm's values.yaml file, it really starts to feel more like a pseudo-declarative project.

Kustomize is better

Thankfully, a friend pointed me to a newer tool, kustomize. I will again refer you to YouTube for more details, but the TL;DR is that kustomize lets me organize my project like this:

jenkins
├── helm-base ╍╍╍╍╷
└── overlays      ╎
    └── gke-tom ╍╍╵

In this case, my helm-base folder holds a yaml config of what Helm would deploy if I was still using it as a package/deploy manager. The line on the right side connecting that folder to overlays/gke-tom shows that gke-tom inherits from helm-base and overlays the gke-tom code on top of it, essentially inserting fields that helm-base doesn't specify and replacing fields where the overlay's values differ.

So in less-technical terms, helm-base becomes "the rest of the world's best-practice default yaml config" of Jenkins, and overlays can be thought of as "our deviation(s) from it". Since it is a yaml-to-yaml operation, you can keep stacking overlays on top of one another, like if I wanted to build a helm-base > test > live inheritance relationship, helm-base > in-house-base > *multiple-in-house-jenkins-instances, etc.

A basic example

So how does it work? Using the helm template command, I can build the Jenkins helm-base folder after installing Helm and pulling the Jenkins chart:

helm repo add jenkins https://charts.jenkins.io
helm repo update
helm template example jenkins/jenkins -n helm-base > helm-base/jenkins.yaml

Applying this configuration will get me the default Jenkins, a fully configured instance I could run as-is, but in order to demonstrate some of the kustomization I'll add a simple overlay:

# helm-base/kustomization.yaml:

resources:
  - jenkins.yaml

# overlays/example/kustomization.yaml:

namePrefix: demo-
namespace: temp
resources:
  - ../../helm-base

patches:
- target:
    kind: ConfigMap
    name: example-jenkins
  patch: |-
    - op: replace
      path: /data/plugins.txt
      value: |-
        github:latest

Here I am kustomizing the list of plugins by updating the file defined in helm-base. With the file in this ConfigMap swapped out, this Jenkins instance should come with the latest github plugin pre-installed. Finally, I apply it with the familiar kubectl apply, but use the -k flag, which will merge the base and overlays before sending the result to Kubernetes:

kubectl apply -k overlays/example

After giving the pod a chance to start, I can sign on and verify the GitHub plugin is there after the first startup:

Github plugin installed

Closer to the applied config

Kustomize puts the code in better peer review territory than the k8s-behind-the-curtains helm install approach. With a yaml-to-yaml solution that mirrors the applied Kubernetes config, the code that changes more frequently will be in that deviation-from-the-default overlays folder and it is much easier to see the impact on the applied configuration, which you can also print out with the kubectl apply -k --dry-run=server command.

I can also update the base as often or as little as I want, like when I turn my Jenkins back on after some time away. A helm-base update is as easy as re-running the helm repo update and template commands on top of a git branch and using the diff of the new base to adjust anything that changed out from under the overlays.

Migrating my site to Pelican

A Site Restructuring Adventure

12 Jun 2022

I recently came across a nice theme I liked in Hugo called Dimension, based on the responsive html5up theme of the same name. The thing I liked most about the theme is how nicely it fixed a problem with my old site...

The old layout placed far too much emphasis on my tech skills, while barely mentioning my passion for art and music. I love the tech skills I have acquired over the years, and while coding can be a creative outlet and mode of expression in some ways, I often feel more integrated and free when I am spending time at the piano or drawing.

So I really liked the idea of what you see on my homepage now, but I quickly ran into problems trying to apply it to my site with Hugo. I recently spent a few weeks working on my web chops, so I had some fresh HTML, CSS, and JS skills I was eager to apply. And as it often ironically goes, Hugo, the tool that got me off the ground quickly with a site, was now getting in the way. In order to add serious customization to Hugo, I would have to learn the Go templating syntax and potentially some Go too. I thought that might stretch my "learning new things-applying those things" loop way too wide, so I went with another option instead...

Pelican to the rescue

I took a look at the current options for static site generation that Python has to offer and settled on Pelican. With this I would only have the jinja2 templating syntax to grapple with, something I had much more familiarity with. And of course if I required any further customization (and I quickly found out I would), Pelican provides "sky's the limit" customization via their plugin interface, so with 11 years of off-and-on Python experience, I was feeling pretty good about undergoing a bigger switch and migrating over.

The real work was the customization

Migrating the content didn't really require anything special, since Hugo and Pelican can both read markdown. Some of my shortcodes didn't carry over from Hugo, but it was pretty easy to replace them with the jinja2content plugin instead.

The first wrinkle was the theme and styling. Since Dimension is a single-page theme that assumes a fairly flat blog-style layout, it didn't have a good out-of-the-box submenu option for listing article-style pages separately, so I needed to add an #article-menu div for that. I added supporting CSS that hooked into the theme's existing stretch-on-page-load script and had to wrestle with some alignment and padding issues:

#article-menu {
        display: -moz-flex;
        display: -webkit-flex;
        display: -ms-flex;
        display: flex;
        -moz-flex-direction: column;
        -webkit-flex-direction: column;
        -ms-flex-direction: column;
etc...
}

The second big problem was with Pelican itself. Without a custom plugin, Pelican will flatten all the folders you identify as articles and pages. While there is only one big article list, Pelican does include categories, tags, and pagination. With these settings enabled, the site-builder will generate extra pages in order to navigate a larger article list sensibly.

While you can tweak Pelican's save-as settings to get articles and pages to match their content-folder depth, in my case I needed Pelican to find and sort only the articles and pages belonging to the index.html it was building. I had organized everything in my content folder before writing my plugin (see screenshot below), and I didn't want to reorganize how I think about my site just to fit Pelican's existing model. So I realized I had to disable the existing article and page generators in Pelican for starters:

import logging
import types

from pelican import signals
from pelican.generators import ArticlesGenerator, PagesGenerator

log = logging.getLogger(__name__)

def disable_page_writing(generators):
    """
    Disable normal article and page generation.
    The html5up Dimension theme fits better as index pages.
    """
    def generate_output_override(self, _):
        if isinstance(self, ArticlesGenerator):
            log.debug('Skipping normal article generation...')
        if isinstance(self, PagesGenerator):
            log.debug('Skipping normal pages generation...')

    for generator in generators:
        if isinstance(generator, (ArticlesGenerator, PagesGenerator)):
            generator.generate_output = types.MethodType(generate_output_override, generator)

def register():
    signals.all_generators_finalized.connect(disable_page_writing)

Sticking with a layout that made sense to me, my content folder looks something like this:

Content Directory

I am currently only using one layer of depth, but my plugin will recognize an index file at any depth if I want to split it deeper in the future. If a sub-interest starts to occupy a lot of my time, maybe I'll add a sub-sub-page, who knows. Under the hood my plugin will use a recursive glob to find all the index files, in my case set to index.md. So my /index.md will generate /index.html, /tech/index.md will generate /tech/index.html, etc.
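
As a sketch of that discovery step, the idea is just a recursive glob with pathlib (the real plugin pulls the index filename from a setting, but this captures the mapping):

from pathlib import Path

content = Path('content')
for index_md in sorted(content.rglob('index.md')):
    relative = index_md.relative_to(content)               # e.g. tech/index.md
    print(relative, '->', relative.with_suffix('.html'))   # e.g. tech/index.html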

This fit pretty nicely with Dimension's single index.html design. Once I had some code to gather the necessary articles for each index, the plugin could build everything it needed by looping over the index file list.

So for right now I really only need four big index.html files: one at the root url in addition to the ones at /art, /music, and /tech. Any pages in the same directory as that index file show as buttons, and any articles at the same depth are listed in the article-menu. The article and page paths are still specified by the standard Pelican settings.
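
Gathering the right content for each index boils down to bucketing articles and pages by the folder their source file lives in. Simplified from what the plugin actually does, the grouping looks roughly like this:

from collections import defaultdict
from pathlib import Path

def group_by_directory(content_objects):
    """Bucket Pelican articles or pages by their source folder so each
    index page only lists the content at its own depth."""
    grouped = defaultdict(list)
    for obj in content_objects:
        grouped[Path(obj.source_path).parent].append(obj)
    return grouped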

The resulting code was a new generator that still leveraged Pelican's stock ArticlesGenerator and PagesGenerator for reading the article and page markdown content. My plugin just prevents them from generating output as separate html files, since everything is coupled pretty tightly to the Dimension page design.

To illustrate this, pelican-migration.md, the post you are reading right now, lives in a path I have specified as containing articles: tech/blog. So as the plugin builds tech/index.html, it includes that post in the tech article menu, and it also dumps in the post's content as a #link, hidden by Dimension's JS and CSS until the user clicks it.

So it seemed fitting to name the class that does this IndexGenerator:

from pelican import signals
from pelican.generators import Generator

class IndexGenerator(Generator):
    def generate_context(self):
        """
        Find all index.{ext} files in content folder
        Add them to self.context['generated_content']
        """
        ...
    def generate_output(self, writer):
        """
        For each index page, generate index.html with 
        articles and pages at the same depth.
        """
        ...

def get_generators(pelican):
    return IndexGenerator

def register():
    signals.get_generators.connect(get_generators)

I'm glossing over the code here because there are quite a few lines, but you can look at the full source on GitHub. The two important hooks in Pelican generators are the generate_context and generate_output methods. Pelican will not generate any output until all Generator class objects have finished generate_context. This is how I am able to hijack the generate_context code from the Articles and Pages generators without their generate_output methods creating the standard (in my case extra) article and page output files.

While this felt like a bit of an adventure, I am much happier landing with Pelican. The site-builder fits my content folder, not the other way around, and I didn't have to reinvent any wheels to get there either. Pelican's plugin interface allowed me to pretty seamlessly use the pieces I needed and override what I did not.

Next steps

I'll need to clean up a bit of code and rethink the settings a bit before I package this for general consumption, but I've got shipping this work as separate projects on my to-do list so other Pelican users can use this plugin and theme too if they would like.

Pelican makes it pretty easy to ship plugins as pip packages, and you can load themes in a similar way with 'pelican-themes --install'. I will need to look into the Poetry package manager first, and also tweak a few minor responsiveness issues I caused when I tweaked the Dimension theme (you may have noticed some). I'll have more details and a new post when I get there. If you are interested in following for more, I will announce future posts to my LinkedIn.

The Advantages of Pathlib

Using Python's Pathlib library

01 Sep 2021

Python 3's standard library includes a module with classes for filepaths. Are you using it yet?

If you have some experience using Python, you probably already know it has some good tools for ironing out differences between Windows and Unix paths, provided you don't build paths like this:

path = basepath + '/never' + '/do' + '/this'

The traditional answer has been to use libraries like os and os.path:

from os.path import join
path = join(basepath, 'a', 'better', 'path')

But in terms of ease of use, these libraries are starting to show their age.

Python 3 includes pathlib, a more convenient class-based library for interacting with paths.

Pathlib Classes

By importing pathlib and using a Path class, we'll get a concrete class based on the underlying filesystem. In my case, I'm using Windows. Testing in a Python console will return a WindowsPath.

>>> from pathlib import Path
>>> pathlib_path = Path.cwd()
>>> type(pathlib_path)
<class 'pathlib.WindowsPath'>
>>> print(pathlib_path)
C:\Users\Tom\AppData\Local\Programs\Python\Python39

pathlib also has a PosixPath concrete class that you'll get from calling Path() on an Ubuntu machine, for instance. Each concrete class inherits from a PurePath parent, and the PurePath classes allow path operations as long as they don't touch the filesystem; anything that does will raise an error.

>>> from pathlib import PurePosixPath
>>> pure_linux_path = PurePosixPath('/usr/local/bin/python3')
>>> pure_linux_path.parent
PurePosixPath('/usr/local/bin')
>>> pure_linux_path.rmdir()
Traceback (most recent call last):
File "<pyshell#63>", line 1, in <module>
pure_linux_path.rmdir()
AttributeError: 'PurePosixPath' object has no attribute 'rmdir'

This can be an elegant way to do some filepath manipulation for the opposite platform; a necessary evil that I've sometimes run into on cross-platform CI projects.
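
For example, a Linux box can build Windows-style paths with PureWindowsPath (the paths below are made up, but the behavior is what you would see in a console):

>>> from pathlib import PureWindowsPath
>>> artifact = PureWindowsPath('C:/builds') / 'nightly' / 'app.exe'
>>> artifact
PureWindowsPath('C:/builds/nightly/app.exe')
>>> str(artifact)
'C:\\builds\\nightly\\app.exe'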

Console comparisons with os

One pet peeve of mine, especially when revisiting Python after some time away, is that os.path contains functions instead of methods. It is easy to forget that, as my little whoops below demonstrates. The OOP consistency from pathlib avoids this.

>>> import os
>>> os_path = os.getcwd()
>>> os_path.exists()
Traceback (most recent call last):
File "<pyshell#11>", line 1, 
in <module> os_path.exists()
AttributeError: 'str' object has no attribute 'exists'
>>> os.path.exists(os_path)
True
>>> from pathlib import Path
>>> pathlib_path = Path.cwd()
>>> pathlib_path.exists()
True

Another source of errors is the inconsistent interface. Directories have to do with paths, so the function for listing them must be in os.path, right?

>>> os.path.listdir(os_path)
Traceback (most recent call last):
File "<pyshell#19>", line 1, in <module>
os.path.listdir(os_path)
AttributeError: module 'ntpath' has no attribute 'listdir'
>>> os.listdir(os_path)
['DLLs', 'Doc', ...]
>>> list(pathlib_path.iterdir())
[WindowsPath('C:/Users/Tom/AppData/Local/Programs/Python/Python39/DLLs'), 
 WindowsPath('C:/Users/Tom/AppData/Local/Programs/Python/Python39/Doc'),
 ...]

pathlib also provides some convenient attributes, so retrieving values related to the path is as simple as it should be. Some of the terminology pathlib uses for paths can be quickly understood by looking at the pathlib cheatsheet.

>>> os_home = os.path.expanduser('~Tom')
>>> os.path.basename(os_home)
'Tom'
>>> os.path.dirname(os_home)
'C:\\Users'
>>> os.path.splitext(os.path.join(os_home, 'test.txt'))[1]
'.txt'
>>> pathlib_home = Path('~Tom').expanduser()
>>> pathlib_home.name
'Tom'
>>> pathlib_home.parent
WindowsPath('C:/Users')
>>> Path(pathlib_home, 'test.txt').suffix
'.txt'

Final examples

So while it is still important to avoid code like this:

path = basepath + '/never' + '/do' + '/this'

It is easy to understand the temptation, which brings me to my conclusion: pathlib enables me to think about paths the way I already do, without the hurdles of a dispersed interface. Below I've simplified a scenario I've encountered working on a CI project, again comparing os with the same logic refactored for pathlib.

# Using os / os.path
import os
info_folder = os.path.join(os.environ.get('WORKSPACE', '.'), 'build', 'info')
os.makedirs(info_folder, exist_ok=True)
project.build()  # placeholder for the project's actual build step
with open(os.path.join(info_folder, 'results.xml')) as f:
    xml_results = f.read()
with open(os.path.join(info_folder, 'build_log.txt')) as f:
    build_log = f.read()

# The same logic, refactored for pathlib
from pathlib import Path
workspace = Path(os.environ.get('WORKSPACE', '.'))
(workspace/'build'/'info').mkdir(parents=True, exist_ok=True)
project.build()  # placeholder for the project's actual build step
xml_results = (workspace/'build'/'info'/'results.xml').read_text()
build_log = (workspace/'build'/'info'/'build_log.txt').read_text()

With os, I'm almost forced to over-name things to keep the verbosity down, hence the info_folder variable. There really isn't a need for such a variable when using pathlib. I can use forward slashes on either platform and pathlib will manage the differences behind the scenes. This matches Java/Groovy behavior I've used in Jenkins Pipeline before too, so switching languages feels smoother. I can get back to how I actually think about the path and see the portion I care about, under one library of consistently named methods. Which of the above would you rather read?

If that isn't enough to convince you to change your code to pathlib, there is also the flexibility of adopting it gradually without having to adjust every call site. Thanks to PEP 519 and the PathLike base class, pathlib paths can be passed to built-in and os functions as if they were the path strings those functions have always taken.

>>> with open(os.path.join(os.path.expanduser('~Tom'), 'test.txt')) as f:
...     x = f.read()
...
>>> with open(Path.home()/'test.txt') as f:
...     y = f.read()
...
>>> with (Path.home()/'test.txt').open() as f:
...     z = f.read()
...
>>> min(x == y, y == z)
True
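
Under the hood, built-ins call os.fspath() on anything implementing __fspath__, which pathlib paths do. You can see it directly (assuming the same Windows home directory as my earlier examples):

>>> import os
>>> os.fspath(Path.home()/'test.txt')
'C:\\Users\\Tom\\test.txt'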

So while using classes instead of strings comes with a bit more resource overhead, the reduction in errors and the readability you gain from pathlib are usually well worth it. I would love to see more code use pathlib, so if you aren't already using it, I hope this post has swayed you.

How I Host this Site

Hosting static content with Google Kubernetes Engine

28 May 2021

In my last post, I explained how I am building this site with Hugo, but stopped short on how I host it. In this post, I'll take you through a bit of how Kubernetes works and explain how I'm using it.

Kubernetes Background

Kubernetes is an orchestration system for running containers and automating a range of container-oriented tasks; think things like deployment, scaling, and self-healing. While Kubernetes is well-suited for a variety of applications, its traditional application is running twelve-factor web apps. There is a lot to Kubernetes, and an exhaustive tour is beyond the scope of this post, but I'll try to provide a brief background and explain each piece as I go.

In the beginning, web apps were run on physical servers. When an organization approached their server capacity, they were forced to buy more. This created obvious planning and resource-utilization headaches, the most ironic being servers unable to handle a sudden influx in traffic due to a web app going viral. Broadly, the industry would settle on a more sharing-economy approach to computing power, democratizing access to scalable resources with pay-as-you-go pricing. For SaaS, container technology, combined with the ability to automatically create additional VMs via a cloud provider, means the tools are available for even small organizations to automate this problem away.

And scalability is just one of many concerns that running a web app raises.

What if a container goes down? How do you recognize that and recover? If you are running your app across multiple containers, how do you balance traffic between them?

There are many ways to solve these, but organizations with extensive experience with these types of problems tend to settle on certain approaches over others. In one particular case, Google came up with a packaged set of tools written in Go based on its own needs. It open-sourced the project and provided it as a free-to-use tool that integrates with cloud APIs.

They named that tool Kubernetes.

Building an Image

In order to host my site, first I'll need an image to run, so I'll build it from my Dockerfile and push it up to Google Container Registry. My Dockerfile is about as simple as it gets:

FROM nginx
EXPOSE 80
COPY public /usr/share/nginx/html

My base image is nginx, a lightweight webserver. The Dockerfile exposes the standard http port and copies the contents of my public folder to a folder that nginx will look to for serving static content. In my last post the public folder was created when I ran hugo to generate the html and other files needed for seeing my content outside of Hugo's built-in local server.

Now I'll build the image.

tom@ubuntu:~/git/thomasflanigan$ docker build -t $TOMS_SITE_IMG .
Sending build context to Docker daemon  7.965MB
Step 1/3 : FROM nginx
latest: Pulling from library/nginx
69692152171a: Pull complete
30afc0b18f67: Pull complete
596b1d696923: Pull complete
febe5bd23e98: Pull complete
8283eee92e2f: Pull complete
351ad75a6cfa: Pull complete
Digest: sha256:6d75c99af15565a301e48297fa2d121e15d80ad526f8369c526324f0f7ccb750
Status: Downloaded newer image for nginx:latest
---> d1a364dc548d
Step 2/3 : EXPOSE 80
---> Running in 9da7a9dd02cc
Removing intermediate container 9da7a9dd02cc
---> 62f1ae442966
Step 3/3 : COPY public /usr/share/nginx/html
---> e1670ad1b9a4
Successfully built e1670ad1b9a4
Successfully tagged gcr.io/[GCP_PROJECT_ID]/thomasflanigan:latest

I've set an environment variable TOMS_SITE_IMG in the format gcr.io/[GCP_PROJECT_ID]/[IMAGE_NAME]:[IMAGE_TAG]. Before pushing it out to the registry, I'll run docker run -p 8080:80 $TOMS_SITE_IMG and navigate to http://localhost:8080 to see that the site is being hosted in the container correctly.

This might look like a bit of magic. By poking around interactively within the container, I can get a better picture of how this uses the default nginx configuration coming from my image's base layer.

tom@ubuntu:~/git/thomasflanigan$ docker run -it $TOMS_SITE_IMG /bin/bash
root@6113c3fcd402:/# cat /etc/nginx/nginx.conf

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
worker_connections  1024;
}


http {
include       /etc/nginx/mime.types;
default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

The /etc/nginx/nginx.conf file is where nginx looks for its configuration. I could copy my own version of this file into the image and overwrite this one if I needed something more custom. At the bottom of the configuration I can see that everything with a .conf in the /etc/nginx/conf.d/ directory will be included in the configuration. While still in the container:

root@6113c3fcd402:/# ls /etc/nginx/conf.d/
default.conf
root@6113c3fcd402:/# cat /etc/nginx/conf.d/default.conf
server {
listen       80;
server_name  localhost;

    #charset koi8-r;
    #access_log  /var/log/nginx/host.access.log  main;

    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }

    #error_page  404              /404.html;

    # redirect server error pages to the static page /50x.html
    #
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }

    # proxy the PHP scripts to Apache listening on 127.0.0.1:80
    #
    #location ~ \.php$ {
    #    proxy_pass   http://127.0.0.1;
    #}

    # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
    #
    #location ~ \.php$ {
    #    root           html;
    #    fastcgi_pass   127.0.0.1:9000;
    #    fastcgi_index  index.php;
    #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
    #    include        fastcgi_params;
    #}

    # deny access to .htaccess files, if Apache's document root
    # concurs with nginx's one
    #
    #location ~ /\.ht {
    #    deny  all;
    #}
}

From the server directive we can see the same port 80 that was exposed in the Dockerfile. From the location section within that we can see that nginx is using /usr/share/nginx/html to look for content. Nginx will serve any index.html files for paths under that location, the same one the Dockerfile copies my content to. So the "magic" is really just relying on the default nginx configuration coming from the base image and copying the site's content into the default folder.

Finally, I'll push the image up to the registry.

tom@ubuntu:~/git/thomasflanigan$ docker push $TOMS_SITE_IMG
Using default tag: latest
The push refers to repository [gcr.io/GCP_PROJECT_ID/thomasflanigan:latest]
35e0dc2a6cbb: Pushed
075508cf8f04: Layer already exists
5c865c78bc96: Layer already exists
134e19b2fac5: Layer already exists
83634f76e732: Layer already exists
766fe2c3fc08: Layer already exists
02c055ef67f5: Layer already exists
latest: digest: sha256:2a22fffa87737085ec8b9a1f13fff11f9b78d5d7a3d9e53d973d2199eae0dbdc size: 1781

Kubernetes Controllers

To get my image running in Kubernetes, I'll define a number of Kubernetes objects using a yaml file. A single instance of Kubernetes is called a cluster and configuration can be defined declaratively. This means I can describe my desired state via versioned files, and Kubernetes will use controllers corresponding to each object I define until the observed cluster state matches the one I defined.

In contrast, setting up a web app the traditional way involved a set of instructions run imperatively, analogous to giving someone directions from point A to point B. Today, with an online map service, we could instead specify only the end destination B and let the service worry about the rest, without having to worry about point A at all. Kubernetes lets you take a similar approach, in my case, for running my site's image and defining the infrastructure for it.

Namespace

First I'll start by defining a namespace. Namespaces are a good way to keep things isolated in the cluster. If I left this out, Kubernetes would place all my resources in the default namespace, so I'll explicitly create one specifically for my site.

apiVersion: v1
kind: Namespace
metadata:
  name: thomasflanigan

Deployment

Next I'll need to run a container for my nginx image. I can use a Deployment to do so.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: site
  namespace: thomasflanigan
spec:
  replicas: 1
  selector:
    matchLabels:
      app: site
  template:
    metadata:
      labels:
        app: site
    spec:
      containers:
      - name: nginx
#       Meant to be run with envsubst
        image: $TOMS_SITE_IMG
        imagePullPolicy: Always
        ports:
        - containerPort: 80

A Deployment is a wrapper for running Kubernetes Pods, which in turn can run one or more containers. With replicas: 1 and the containers: list, I am telling my cluster I want a single container of my site exposed on port 80. I also specify a selector so that I can target this container with a Service resource.

Service

Services allow you to manage connectivity to Kubernetes Pods. Here I define a NodePort type Service, with a selector that will match any Pods with an app: site matchLabel. In my case, this is a single Pod with my one container in it.

apiVersion: v1
kind: Service
metadata:
  name: site-svc
  namespace: thomasflanigan
spec:
  selector:
    app: site
  type: NodePort
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 80

Testing it out so far

So far I have just been talking about yaml files, but I haven't told the cluster to make it so yet. I'll need to hop onto my cloud shell environment where I have already created a GKE instance. Here I'll use the kubectl apply command to have my cluster start running everything I have defined so far. In order to swap in my environment variable for my site image, I'll use envsubst and pipe the result to the kubectl command.

tom@cloudshell:~/git/thomasflanigan $ cat k8s-config.yaml | envsubst | kubectl apply -f -
namespace/thomasflanigan configured
deployment.apps/site configured
service/site-svc configured

I'll make sure the Pod and Service started okay, and then use kubectl port-forward to preview the site.

tom@cloudshell:~/git/thomasflanigan $ kubectl get pods -n thomasflanigan
NAME                    READY   STATUS    RESTARTS   AGE
site-594ccf99c8-wwn28   1/1     Running   0          12m
tom@cloudshell:~/git/thomasflanigan $ kubectl get svc -n thomasflanigan
NAME       TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
site-svc   NodePort   10.60.14.87   <none>        8080:31208/TCP   12m
tom@cloudshell:~/git/thomasflanigan $ kubectl port-forward service/site-svc -n thomasflanigan 8080:8080 >> /dev/null

With the Service port forwarded, I can use the web preview feature in Google Cloud Shell to preview the site.

Web Preview

This will launch the site at a temporary url. It works!

Site Preview

Ingress

With things working inside the cluster, the final step is to expose it to the web more permanently. I have reserved a static external IP named 'thomasflanigan' in my GCP project to reach the outside web. I can use an Ingress to connect the static IP to my Service.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.global-static-ip-name: thomasflanigan
    networking.gke.io/managed-certificates: site-cert
  name: site-ingress
  namespace: thomasflanigan
spec:
  rules:
    - http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: site-svc
                port:
                  number: 8080

Under spec.rules[0].http.paths, I can list one or more paths. Here I am saying everything going to my static IP from the root path down will be sent to my site-svc.

I am also turning off http access to run my site under https only. I get SSL termination by referencing a managed certificate.

Managed Certificate

If you take a look back at the above yaml code, you may notice it has been necessary to specify an apiVersion. Up until this point, I have been using yaml that defines resources built-in to Kubernetes. Kubernetes also allows for custom resources powered by custom controllers. Notice the apiVersion below specifies GKE, a Google-specific cloud service, since I am defining a custom resource tied to a Google-managed SSL certificate. Resources like this will differ for each cloud provider.

apiVersion: networking.gke.io/v1beta1
kind: ManagedCertificate
metadata:
  name: site-cert
  namespace: thomasflanigan
spec:
  domains:
    - thomasflanigan.com

In order for the cert to work, I'll need to point my thomasflanigan.com domain to the static IP address I created. Using a managed cert is convenient, since I don't ever have to worry about renewing the certificate manually (managed certs auto-renew). I will need to wait a few minutes for the certificate to provision, so I'll apply it now.

tom@cloudshell:~/git/thomasflanigan $ cat k8s-config.yaml | envsubst | kubectl apply -f -
namespace/thomasflanigan unchanged
deployment.apps/site unchanged
service/site-svc unchanged
ingress.networking.k8s.io/site-ingress configured
managedcertificate.networking.gke.io/site-cert configured

After some time, I can check on the cert and see it provisioned, and my site will be available.

tom@cloudshell:~/git/thomasflanigan $ kubectl get managedcertificate site-cert -n thomasflanigan -o jsonpath='{.status.certificateStatus}'
Active

Conclusion

I have really only scratched the surface with Kubernetes here. There is a lot more I could do to improve the backend, but this will suffice for a static site for now. In fact, Kubernetes is so powerful that using it for only a static site is complete overkill, but I plan to add to this cluster over time.

To see the complete yaml configuration, take a look at it on my github.

How I Built this Site

Building static content with hugo

17 May 2021

Update: While this post may still interest you if you're curious about Hugo, I no longer build my site using it. I am now using Pelican.

I have been using more cloud tools these days and had an itch to start a blog about it. I have a DevOps background without much front-end web experience, so I was happy to find out about tools like Jekyll and Hugo: a good topic for my first post.

Jekyll vs Hugo

Jekyll and Hugo are both static site generators. Hugo struck me as a better fit due to its speed when compared with Jekyll. With the hugo serve command, I can update files and see the rendered HTML on a live local server. It is a nice feature for maintaining a tight feedback loop and staying in a flow state.

Create a Hugo Project

First I'll need to install Hugo and create a repository. I've named mine hugo-demo. Note that since I have already created the repository on GitHub, I'll need to use --force when creating the site.

tom@ubuntu:~/git$ git clone git@github.com:exvertus/hugo-demo.git
Cloning into 'hugo-demo'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (3/3), done.
tom@ubuntu:~/git$ hugo new site hugo-demo --force
Congratulations! Your new Hugo site is created in /home/tom/git/hugo-demo.

Just a few more steps and you're ready to go:

1. Download a theme into the same-named folder.
   Choose a theme from https://themes.gohugo.io/ or
   create your own with the "hugo new theme <THEMENAME>" command.
2. Perhaps you want to add some content. You can add single files
   with "hugo new <SECTIONNAME>/<FILENAME>.<FORMAT>".
3. Start the built-in live server via "hugo server".

Visit https://gohugo.io/ for quickstart guide and full documentation.

This will add some files and folders to the root of the repository.

tom@ubuntu:~/git$ cd hugo-demo/
tom@ubuntu:~/git/hugo-demo$ ls
archetypes  config.toml  content  data  layouts  README.md  static  themes

Hugo will look to config.toml for the project's global config (although this may be configured in other ways). It should look something like this:

baseURL = "http://example.org/"
languageCode = "en-us"
title = "My New Hugo Site"

Which I'll update to be less generic.

baseURL = "http://thomasflanigan.com/"
languageCode = "en-us"
title = "Tom's New Hugo Site"

Adding a Theme

Before I can serve the site I'll need to choose a theme. Hugo has hundreds of pre-built themes, but you may also create your own. I am using the coder theme for my site. I'll add it to my config.toml:

baseURL = "http://thomasflanigan.com/"
languageCode = "en-us"
title = "Tom's New Hugo Site"
theme = "hugo-coder"

Hugo will look in /themes for the theme I have defined in config.toml. This is typically added as a git submodule:

tom@ubuntu:~/git/hugo-demo$ git submodule add https://github.com/luizdepra/hugo-coder.git themes/hugo-coder
Cloning into '/home/tom/git/hugo-demo/themes/hugo-coder'...
remote: Enumerating objects: 2383, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 2383 (delta 0), reused 1 (delta 0), pack-reused 2382
Receiving objects: 100% (2383/2383), 2.38 MiB | 9.68 MiB/s, done.
Resolving deltas: 100% (1228/1228), done.

Now I can check out the site with hugo serve.

tom@ubuntu:~/git/hugo-demo$ hugo serve -D
Start building sites 

                   | EN  
-------------------+-----
Pages            |  7  
Paginator pages  |  0  
Non-page files   |  0  
Static files     |  5  
Processed images |  0  
Aliases          |  0  
Sitemaps         |  1  
Cleaned          |  0

Built in 53 ms
Watching for changes in /home/tom/git/hugo-demo/{archetypes,content,data,layouts,static,themes}
Watching for config changes in /home/tom/git/hugo-demo/config.toml, /home/tom/git/hugo-demo/themes/hugo-coder/config.toml
Environment: "development"
Serving pages from memory
Running in Fast Render Mode. For full rebuilds on change: hugo server --disableFastRender
Web Server is available at http://localhost:1313/ (bind address 127.0.0.1)
Press Ctrl+C to stop

But this doesn't give me much of a site quite yet:

Blank Site

So I'll add params to my config.toml.

config.toml

baseURL = "http://thomasflanigan.com/"
languageCode = "en-us"
title = "Tom's New Hugo Site"
theme = "hugo-coder"
pygmentsStyle = "bw"

[params]
author = "Tom Flanigan"
description = "Tom Flanigan's personal website"
keywords = "blog,developer,personal,resume"
info = "Developer and Dev Ops Specialist"
avatarurl = "https://raw.githubusercontent.com/luizdepra/hugo-coder/master/exampleSite/static/images/avatar.jpg"

[[params.social]]
name = "Github"
icon = "fa fa-github"
weight = 1
url = "https://github.com/exvertus/"
[[params.social]]
name = "LinkedIn"
icon = "fa fa-linkedin"
weight = 2
url = "https://www.linkedin.com/in/thomas-flanigan/"

[[languages.en.menu.main]]
name = "About"
weight = 1
url = "about/"

[[languages.en.menu.main]]
name = "Blog"
weight = 2
url = "posts/"

That's looking a bit better now:

Site

You can see a full list of parameters in the theme's stackbit.yaml file with additional information in the example config.

Adding Content

My site will need more than a single page, so I'll need to add content so the 'About' and 'Blog' menu links don't 404. Adding an about.md file to the root of the content folder will cause hugo to serve an /about/index.html page:

tom@ubuntu:~/git/hugo-demo$ hugo new about.md
/home/tom/git/hugo-demo/content/about.md created

Hugo will automatically add some "front matter" to the top of the file that serves as metadata for the page. I'll add some basic content for the page as well:

---
title: "About"
date: 2021-05-20T18:00:47-05:00
draft: true
---

Hi I'm Tom. I like music, art, and technology.

I'll create a first blog post too. Note I'm creating this in a subfolder named posts:

tom@tom-UX303UA:~/git/hugo-demo$ hugo new posts/first-post.md
/home/tom/git/hugo-demo/content/posts/first-post.md created
+++
draft = true
date = 2021-05-20T19:29:38-05:00
title = "My First Post"
description = "Demo Blog Post"
slug = ""
authors = []
tags = []
categories = []
externalLink = ""
series = []
+++

Hey! Check out my first post

An example code snippet

class Something(object): pass


Navigating to localhost:1313/about/ brings me directly to the about page, but going to /posts/ takes me to a list page showing each file in the content/posts folder:

About

Posts

Taking a look at my blog post, I'm not really happy with how the code snippet coloring looks against the white background:

BwPost

I can change the 'bw' pygments coloring by editing the 'pygmentsStyle' parameter in my config.toml file. I'm able to preview some other choices from this page. I'll change pygmentsStyle to use 'monokai' instead:

# config.toml
pygmentsStyle = "monokai"

MonokaiPost

Wrapping Up

I'm happy with the result and ready to generate the static content for my site. First I'll set each content file's front matter so that they are no longer drafts.

draft: false

Then I can run the hugo command from the root of the project to generate the content. Hugo will drop the output into a folder named 'public' by default.

tom@ubuntu:~/git/hugo-demo$ hugo
Start building sites 

                   | EN  
-------------------+-----
Pages            |  8  
Paginator pages  |  0  
Non-page files   |  0  
Static files     |  5  
Processed images |  0  
Aliases          |  0  
Sitemaps         |  1  
Cleaned          |  0

Total in 79 ms
tom@ubuntu:~/git/hugo-demo$ ls public
404.html  categories  fonts       index.xml  sitemap.xml
about     css         index.html  js         tags

Since the public folder contains the hugo build artifacts, I don't want anything in that folder to pollute the repository. I'll add it, and the resources folder (hugo uses that directory as a cache), to my .gitignore

# .gitignore
public
resources

Conclusion

That's it! Note I've used a demo repository for the purposes of this post. If you'd like to see the live code for this site (including the code for the page you're reading now), you can view my github project.

Experience

Tom Flanigan

Software Developer, MEDITECH July 2012-Jan 2022

Systems Development Group, Jan 2020-Jan 2022
  • Designed and led a new software-delivery solution for twenty-developer team
  • Created Continuous Integration roadmap and provided bi-weekly updates to developers and management
  • Rolled out and maintained fully-declarative CI-server on our cloud platform
  • Configured all jobs including main build-test jobs that run in parallel on Windows and Linux
  • Maintained internal documentation and provided education and guidance to developers using new tools
  • Mentored a new hire, training them on Software Delivery best practices, Python, Jenkins, and Google Cloud Platform
  • Served as scrum master for five-developer sub-team
ALM-Tools Group, 2016-2019
  • Administered company-wide Jenkins instance for dozens of bi-weekly deployments
  • Used the unittest.mock library to run all unit tests in under a second, enabling developers to skip latency-sensitive tests while developing on their clients
  • Coded a CLI-wrapper library for using SVN with Python, enabling us to move away from PySVN and the Jenkins/JVM SVNKit to avoid performance and dependency issues
  • Updated Python deploy jobs to use Jenkinsfiles and added coverage, version, and deployed location to a lightweight database, providing a Confluence report to display the latest data for each class of job
  • Adjusted code for Python 3 compatibility to avoid using Python 2 past EOL date
  • Automated record-only merging for abandoning Jira issues triggered from a Jira transition
  • Created Jenkinsfiles, SVN hooks, Jira transition scripts, and rolled out an improved “change number” system for new Ship pipelines
  • Refactored monolithic application build logic to an object-oriented design, modified configurations to build in two environments against one source with parallel pipeline jobs
  • Wrote build code and managed pipelines for Core Project
Core Group, 2012-2016 - M-AT, FS (MEDITECH proprietary languages)
  • Created an efficiency-testing program for measuring code snippets
  • Designed and coded internal mail-server API, enabling Core developers to add automated email functionality
  • Developed GUI search feature for Core Cases (an internally developed ticketing system)
  • Created a code library to prevent the team from repeating itself with “webComponents”
  • Improved table-filtering and other GUI desktop-component features

Non-tech Experience

  • Sales Associate, Hess Corporation Aug 2008-Sep 2011
  • Office Clerk and Order Writer for Standard Products, Litecontrol Jan 2007-Sep 2007

Education

  • Ithaca College, Bachelor of Music 2007
  • 3.7 Cumulative GPA, Magna cum Laude

Projects

Current

  • thomasflanigan.com - The code for this website.
  • Tom's Services - Kubernetes configuration for my internal services domain behind Google IAP that I use to access my Jenkins and a Discord bot shared among friends.
  • Jenkins - YAML config for my Jenkins instance.
  • Darnbot (private repository) - Bot for a Discord Server I share with close friends using Node.js.

Next up

Skills

Tech skills

* Indicates I am actively working to improve this skill.

Python
Jenkins
Git
Subversion
Jira & Atlassian Suite
Google Cloud Platform
Intellij
PyCharm
Sublime Text
Agile Development
DevOps/12 Factor Methodology
Docker & Containers
Kubernetes
Algorithms
Java/Groovy
JS
HTML
VSCode (favored IDE currently)
Eclipse
*Mongo/NoSQL
*CSS
SQL

Soft skills and work-style

I take a 'people and process first' approach when integrating with a team and actively promote an environment where everyone feels respect, trust, and patience toward one another. I believe firmly that room for harmonious conflict is critical: team members feel comfortable asking tough questions, expressing important concerns early and often, and offering graceful criticism without wounding.

While I am a poet and occasionally write metaphorically (and will sprinkle some gifs and emojis in my chats), I am mostly precise, polite, and concise with my written communication on technical projects.

I am just as strong a verbal communicator. As an empathetic and comprehensive listener, I practice the Dale Carnegie approach when speaking with others.

A well-functioning team does not require everyone to be best friends (although I do value the strong "outside work" relationships I've gained through technical collaboration), nor does everyone need to routinely share the same physical space. A team does require strong, ongoing mutual respect, and I believe deeper conversation with potential for back-and-forth works best over regular video communication. I am happy to make myself available for video calls within clearly defined time windows; I have learned that designated office hours balance best with distraction-free time for my "communicate with the computer" work.

This awareness is also important to how I manage my time well independently. It does come with a few caveats that make me a better fit for some teams over others. I have a neurological disability that is a blessing if I am given the room to follow the strict time-management rules I have developed for myself.

I use a 25/5 Pomodoro strategy for shallow work and a hyper-focus mode for deep work, where I can quickly acquire new skills and apply them to complete larger challenging technical projects in bursts of heightened productivity. These bursts are momentum-based and can involve over an hour warmup period to get in the zone, but I can become locked in for an extended period of elevated flow. More can be understood about the benefits and limitations of my hyperfocusing here.

For complex or poorly defined problems that take multiple days to solve, it is common for novel, elegant and complete solutions to come to me subconsciously after investing time hyper-focused, and I take pride in the results when I am granted the flexibility to plunge into deep work.

My productivity can otherwise become cursed in constant-interruption or high-micromanagement environments. I am flexible about coming into an office every so often, but my productivity is severely impaired by an ongoing in-office requirement.