Notes From The Management

Making an effort to keep rambling about the site isolated from the actually interesting bits...

More About That Blogging

2018-06-16 23:41:00 -0400

So much for low-friction - while it is the case that I did reduce blogging to "write some markdown, run two commands", I then proceeded to not actually do any of that writing for four years, even with weekly reminders. I've even got some topics collected... but I only got back here because I noticed a dead link on the bloggery page and got annoyed at it :-}

To be fair, I have had Things To Do - three years of a new startup, a bunch of gadgets, several conferences, various other life events - so all I have to do is actually sit down and write...

About That Blogging

2014-03-29 02:52:54 -0400

The entire purpose of this rewrite (and fundamentally, of the website itself) is to give me a reasonably low-friction path for publishing that still lets me tweak the things I want to tweak. In the past two weeks, the snow has melted, the wildlife has started to wake up, and the only posts I've put out are hidden here in the meta-blog about the site itself. "That's not right..."

This Friday I was reminded that I ought to post to my gadget blog (I literally have a calendar reminder on my phone telling me to blog there, once a week...) especially since the most recent post was something I wrote in January as much as a tracer bullet for the software as for actual information. Still, it did only appear on the net two weeks ago, so arguably I'm not that far behind; even so, I have a significant backlog of gadgets to write about, and more arriving every day (the Verve USB sensor box showed up this morning) so if I don't start getting things written, I'm never going to catch up :-)

(Why do I even need to "catch up"? Isn't it many writers' dream to have a vast field of things to write about, and to never have to fear writer's block? Sure, but many of these gadget reports are a lot less interesting if they aren't timely - a Kickstarter that's now real and maybe you can buy it directly, a new bit of tech that hasn't seen many other reviews - and while I'm not unwilling to write about useful everyday tools, it doesn't make sense to let them get in the way of writing about the shiny new gadgets.)

It'll take a while for actually writing to become more of a habit than tweaking the python code that generates the site, so let's at least see if I can manage to do this two weeks in a row. At least this post only took about 10 minutes to get typed and pushed...

Spring Is Sprung

2014-03-12 02:18:13 -0400

There was one "final" delay to make the new infrastructure support the little "single serving" web sites I've collected over the years (rather than moving the main www pointer and then moving the rest of them back to a different pointer; there weren't that many files involved anyway, nor, with one exception, much history beyond "created site in 2008", so there was little to worry about preserving.) After I got that working, I flipped the switch on the morning of March 11, 2014. Not yet officially spring, but a local high of 60F anyway, which is just as good :-)

Of course, I immediately found half a dozen failures, mostly by watching the access and error logs; my old nagaina monitoring system pointed out that I'd mishandled some links to old photo galleries, and I'm still sorting through some of that (mostly with the intent of constructing some new photo galleries) so I'm not as concerned with precise preservation there (much of it was ACL'd off previously anyhow.)

Overall, nothing severe enough to even consider rolling back. Other than finally releasing months of new bloggery (like this entire section), things shouldn't look that different; I haven't added any decorative photography or even much color, but the structure is in place to try some new things. At the very least, The Rants Will Flow!

Burning Down

2014-03-05 02:41:43 -0500

Spring is now two weeks away (still gloomy, still below 30F, yard still covered in snow, no crocuses... but Spring is still around the corner.)

A few days back I actually hit "feature complete" in the sense that as far as I could tell everything I needed to do in order to Flip The Switch was implemented, and that while there is still a fair list of things that Would Be Interesting and Would Be Good Improvements, none of them truly block Being As Good As The Existing One combined with Actually Letting Five Months Of Backlog Out The Door... but I still had a lingering Fear Of Screwing It Up. Does this really all work? Will it look ok?

Well, it's never going to look good if I'm the only one working on the visual appearance, but at a glance it's at least cleaned up a bit. The real concern is having things that could be called Broken or otherwise not as I intended. What's the standard software engineering way of increasing confidence? Well, from the outside, you'd be forgiven for thinking "delay actually shipping anything" was the Best Practice :-) but what I'm getting at is Testing. Pick a few things that I'm worried about, and implement tests for them. (And because This Is Me, we're talking fully automated tests...)

The first step was dependency tracking. Not because I'm trying (yet) to do this as an incremental build - even on the crufty slow machines with buckets of spinning rust that I use as servers, a full rebuild only takes 15 seconds, a full build to an empty directory takes 20 - but because it let me figure out what things in the output directory were spurious leftovers from a previous build (and should trigger a clean build) and what things in the source tree were getting ignored (usually by not being properly attributed in the dependency graph, but it did expose some actual bugs.)
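A minimal sketch of that kind of audit, assuming the dependency graph can be flattened to a dict mapping each output path to the source paths it is built from. That shape, and the names `walk_files` and `audit`, are hypothetical; the real generator's graph may look quite different.

```python
import os


def walk_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def audit(deps, source_root, output_root):
    """Compare a dependency map against what is actually on disk.

    deps maps each output path to the list of source paths it is built
    from (an assumed shape, not the site generator's actual structure).
    Returns (leftovers, ignored): outputs nothing claims to have built,
    and sources that no output depends on.
    """
    expected_outputs = set(deps)
    used_sources = {src for srcs in deps.values() for src in srcs}
    leftovers = walk_files(output_root) - expected_outputs
    ignored = walk_files(source_root) - used_sources
    return sorted(leftovers), sorted(ignored)
```

A non-empty `leftovers` list is the signal that a clean build is due; a non-empty `ignored` list points at either missing edges in the graph or source files that really are being dropped.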

The second step was building the link graph. I did a codes-well-with-others pass on the easily available link checkers: linkchecker is nicely packaged in Debian and under active development (yet already quite feature-rich), but the default (fixed in the 9.0 release that went out this week) was to fetch and check external links too. That's a good thing to have in general, except that

- that's really a content-quality operation, not a rendering implementation one
- a bunch of my first-level links are to large AVI files on S3 (though to be fair I could configure explicit exclusions for those)
- there wasn't an easy way to say never leave the site
- fundamentally, a link checker doesn't have access to the underlying storage and can't check if there is "orphaned" content.

linkchecker does have some nice features like the ability to report output as a directly usable sitemap, which I will probably revisit when 9.0 comes out.

It only took an hour to do a trivial walk from the top level index.html of the output tree using lxml.html and record what paths it saw, filtering out HTML "anchors" and offsite links, and normalizing the rest to in-tree pathnames. It took very little longer to match that up against the output side of the dependency checker, and then (by hand) to check some of the "missing" files in Google... leading me to conclude that a bunch of stuff is accessible due to being included in RSS feeds, even though it's not actually linked anywhere. Enough things were reachable to convince me that the test worked and that the site was basically OK, and that more significant linking is actually a content project, not a deployment one...
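The walk described above can be sketched roughly as follows. To stay dependency-free this version swaps in the standard library's `html.parser` in place of lxml.html, and the names `LinkCollector`, `resolve` and `reachable` are made up for illustration; it also ignores details like percent-encoded paths.

```python
import os
import posixpath
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlsplit


class LinkCollector(HTMLParser):
    """Collect every href/src attribute value seen in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def resolve(link, page):
    """Turn a link found on page into an in-tree path, or None to skip it."""
    target, _fragment = urldefrag(link)  # drop any "#anchor"
    parts = urlsplit(target)
    if parts.scheme or parts.netloc or not parts.path:
        return None  # offsite link, or a pure in-page anchor
    if parts.path.startswith("/"):
        raw = parts.path.lstrip("/")
    else:
        raw = posixpath.join(posixpath.dirname(page), parts.path)
    if raw == "" or raw.endswith("/"):
        raw += "index.html"  # assume directory links serve index.html
    resolved = posixpath.normpath(raw)
    if resolved == ".." or resolved.startswith("../"):
        return None  # would escape the output tree
    return resolved


def reachable(output_root, start="index.html"):
    """Return the set of in-tree paths seen by following links from start."""
    seen = set()
    queue = [start]
    while queue:
        path = queue.pop()
        if path in seen:
            continue
        seen.add(path)
        full = os.path.join(output_root, *path.split("/"))
        if not path.endswith(".html") or not os.path.isfile(full):
            continue  # only parse HTML pages that actually exist
        parser = LinkCollector()
        with open(full, encoding="utf-8", errors="replace") as handle:
            parser.feed(handle.read())
        for link in parser.links:
            target = resolve(link, path)
            if target is not None:
                queue.append(target)
    return seen
```

Subtracting the result from the set of files the build claims to have written gives the unlinked ("orphaned") outputs; subtracting the other way gives links that point at nothing.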

So these confidence-building steps have gone in, they've built confidence appropriately, and the only reason I haven't switched DNS over is that I hang out with enough operations people that I Accept As Truth that I shouldn't do this right before bedtime :-)

Flipping the switch tomorrow...

Plausible Deployment

2014-02-16 02:04:34 -0500

This week, I've gotten a plausible burndown list together, which is always as useful for the things not on it as for the things that are ("yes I know that part would be interesting to hack on and we know clever things about doing it, but it's not in the way of the deployment so back off of it for now.") One or two things may turn out to be dependencies (getting RSS generation right might require actually implementing the README.md parser so I have the classic material to start with, although it's also possible that I can treat those as legacy components for now just to get this out the door, since I have generic blogging working.)

Since this is a relatively small static site on a tiny network, the constraints on web server choice aren't weighted the same, and in particular "sane configuration" takes precedence over performance (because none of them are that bad at just throwing files over the wire) once the basic feature checklist is satisfied. This led me to satisfy my decade of frustration with Apache configuration by tossing it out the window and using nginx instead. The configuration is still arcane, of course; it's just not syntactically horrifying, and it does start with more modern assumptions about what things should be easy to express. It's also nearly as pathologically undebuggable as Apache configuration is, and could really use a higher level configuration language that generates the one it includes - or at least a macro language. I could easily take the hundred lines of config I have now and drop it down to fewer than ten descriptive lines just using cpp, but it's not really worth doing that here and now.

I also need to start using a simple problem-tracker to remind me of things like that, but a few text files are working out well enough for now, and I really shouldn't let that get in the way of deployment :-)

A codes-well-with-others shoutout to moreutils; a pattern of roughly

chronic flock $command 2>&1 | ifne mail -s "failed" $me

makes a decent git post-update hook for triggering a deployment command on a push to a particular repo.
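For readers without moreutils handy, here is a rough Python analogue of what that pipeline does: take a lock so deployments don't overlap (flock), stay silent when the command succeeds (chronic), and only produce something to mail when there is output from a failure (ifne). The function name and the lock file path are hypothetical, and `fcntl` makes this Unix-only.

```python
import fcntl
import subprocess


def run_locked_quietly(command, lockfile):
    """Run command under an exclusive lock; return its output only on failure.

    command is an argument list such as ["make", "deploy"]; lockfile is
    any writable path (both are placeholders for illustration).
    """
    with open(lockfile, "w") as lock:
        # Block until any deployment already in progress has finished.
        fcntl.flock(lock, fcntl.LOCK_EX)
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr in, like 2>&1
            text=True,
        )
    # Leaving the with-block closes the file, which releases the lock.
    if result.returncode == 0:
        return ""  # success: discard the output, as chronic would
    return result.stdout
```

A hook built on this would mail the returned text only when it is non-empty, which matches the shell version: a successful push stays quiet, a failed one sends the full log.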

Winter is... err... almost over

2014-02-10 02:02:13 -0500

While "ThokTober" sounded good when I started, we're well into February and Spring is only 40 days off. Not that that makes the project unsuccessful per se - I understand the problem much more deeply, which was certainly a goal, and there was never an external deadline, just a self-inflicted one.

The deepest bit of understanding was that attempting to shoehorn the "weight of history" (that is, the legacy thok.org content) into an existing system was a misguided effort - the legacy content was some 2300 files, the new content (that could easily be adapted to whatever system I chose) was only 45 files - "And we decided that one big pile is better than two little piles, and rather than bring that one up we decided to throw ours down." - and that the optimal deployment tool for the legacy content was simply rsync. This reduced the complexity of the rest of the problem a great deal, as now it was simply a matter of recognizing files that needed markdown processing, and recognizing files of more interesting type ("blog post", "blog", "photoessay"...) and doing something with them.

Once I got past the problem of stuffing the large pile into a third party blogging system, what about the small pile? All of the static-blog tools I looked at were a little too opinionated (which is great when starting from scratch) and the static-site-generator tools were not opinionated enough. I finally concluded that in order to figure out what features I actually wanted, I'd need to sit down and implement them from scratch, and discover what aspects of "blog" features were artifacts of "what is easy to write" and what things actually mattered. (This also, admittedly, let me side-step the conclusion that ikiwiki, not nikola, was closest in behaviour to what I was looking for, even though it was a pile of perl with character set issues...)

Having concluded that the big opinion was that "text should be in markdown", pulling little things together was a lot simpler, and I could make visible progress a feature at a time. Other useful conclusions include:

- source goes in git
  - therefore a post date can easily be the git-author-date
  - if the destination is in git, that's an unrelated deployment issue; mkdir and rsync are entirely sufficient for 0.1
- python-markdown has an "arbitrary metadata" extension; for 0.1 I need to reparse the input files to extract it, but with 45 files that just isn't important, since I can clearly put in time later to be as clever as I need to be. For the first round, having Tags metadata that gets pulled out to a single tag-index.html demonstrates the concept and is a good enough implementation of article keywords; adding other metadata like "twitter summary" can follow in time, but I don't need to start with it.
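As an illustration of the Tags idea, here is a small sketch of building the tag index from a pile of posts. It does not use python-markdown itself: `read_meta` is a simplified, hypothetical stand-in for the metadata extension that only understands single-line "Key: value" headers at the top of a file, and `tag_index` is likewise an invented name.

```python
import re
from collections import defaultdict

# One "Key: value" header line; the header ends at the first non-matching line.
META_LINE = re.compile(r"^([A-Za-z0-9_-]+):\s*(.*)$")


def read_meta(text):
    """Parse leading "Key: value" lines into a dict with lowercased keys."""
    meta = {}
    for line in text.splitlines():
        match = META_LINE.match(line)
        if not match:
            break  # blank line or body text: the header is over
        meta[match.group(1).lower()] = match.group(2).strip()
    return meta


def tag_index(posts):
    """Map each tag to the sorted list of post names that carry it.

    posts maps a post name to its markdown source text.
    """
    index = defaultdict(list)
    for name, text in sorted(posts.items()):
        tags = read_meta(text).get("tags", "")
        for tag in tags.split(","):
            tag = tag.strip()
            if tag:
                index[tag].append(name)
    return dict(index)
```

Rendering the resulting dict as a list of links is all that a single tag-index.html needs; reparsing every file on every build is wasteful, but at 45 files that cost is negligible.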

As a nod to codes-well-with-others, I've built most of the recent pieces atop two new third party libraries:

- for things like "blogs" (rather than individual "posts") that are entirely metadata (title, description, content pointer), Python configobj turns out to support a variety of nicely python-esque syntax options, reasonably structured data, and isn't YAML.
- lxml.html is quite nice for mucking about with generated html; since HTML is horribly fragile but no tools report legitimate diagnostics for it, sticking to carefully constructed operations on the element tree seems like the only sane way to perform operations like "add stylesheet" or "promote the first H1 to a head title element" in arbitrary contexts.

All that said, it really does look like I could do a first deployment this week, though perhaps the start of Spring is ultimately more realistic...

... but they grind exceedingly fine

2014-01-11 16:09:47 -0500

The "ThokTober" effort is showing glorious levels of scope creep and schedule slip. This post is still only going to the "pre-production" version of the site, so it doesn't count towards the live publication milestone, but in the meantime I've

- gathered up all of the non-versioned files, the "slightly-versioned" files (given foo and foo~ and their datestamps, that counts as just about the least plausible amount of history to attempt to preserve) and the individually versioned files (ad-hoc RCS use on files that were served up directly, rather than through cgiweb) and fed them all through cvs2git producing an epic "blob" that serves as the starting point for the new git-based world...
- figured out how to use git rebase to glue that 18-year "blob" on to the short-term ikiwiki prototype such that they look like a continuous stream of "history"...
- hunted down the remaining microsites and planned to isolate them and not include them in the broader project...
- populated two prototype "House Wiki" sites with real information (in the process, learning how to manage these sites together.)

All in all I've made a lot of progress, it's just that the direction of that progress hasn't been towards the original goal of "publishing my writing again." Still, much was learned, and this post should contribute to confirming that the machinery still works after the above bits of git churn...

Transitions

2013-12-08 03:37:54 -0500

THOK.ORG finally ground to a halt - the last post here that wasn't either part of the clawback of my blogspot blog or meta-bloggery about blog tool making was in 2010 (a rant that coined the term career_limiting_memcpy which sadly failed to catch on.) It's not that I haven't had time to write, or things to write (I churned out two other tumblr blogs in between, for a while, solely because I could write in markdown and post from my phone both of which were far less friction than the state I'd gotten stuck in with my own code.)

All of this finally came to a head in October 2013, which I christened "ThokTober" with the intent of spending a month picking up some existing blogging tool and running with it. As integration projects are wont to do, this dragged on to early December, at which point I settled on ikiwiki (and I do mean "settled" - I really thought I'd end up with nikola what with python vs. perl, vastly more plugins and features, and bigger dev community... and I may yet go back to it, but ikiwiki ended up with far less friction in terms of actually getting sites up and "good enough".)

There will be some loose ends for a while (part of the transition involves semi-automatically converting my ad-hoc markup to proper Markdown, and I still don't really know what I'm going to do with the existing RSS) but if I get one new non-meta post up before Christmas I'll finally declare victory :)