Finebits and I are implementing a table of contents
This post was sparked by an update to Finebits, the Ghost theme I use here. One line in the changelog caught my eye:
Added support for showing a Table of Contents in posts ✨
I’m actually a big fan of well-structured documents. You know those people who don’t use headings, but instead just underline and bold their text? At work, it always annoys me when I can’t navigate directly to headings and have to scroll through endlessly long texts.
Until now, I haven’t been that strict on the blog. There are so-called headers, which are really just thick bars: a modern trend that I experimented with and have since discarded. I prefer simple headings because they save space. But many of my posts had no headings at all. That changes now. From now on, I will structure my posts more thoughtfully.
Reading is out
Regarding my last blog post, my girlfriend said: “Wow, that’s long. I don’t feel like reading it.” The idea quickly came to me that I could offer my posts not only as text but also as audio. However, anyone who has heard me in a voice chat, on a stream, or elsewhere will know that my pronunciation isn’t exactly the best. So, the option of me recording my own posts is out.
In recent weeks, I’ve been experimenting a lot with text-to-speech, originally in a different context. If you don’t want to spend any money, you inevitably stumble upon Microsoft’s Azure voices, which are available for free through the Python package edge-tts. A single command is enough to convert text into speech.
My posts, now listenable
For my blog posts, after a bit of trial and error, I’m using the following command:
edge-tts --rate=+15% --voice de-DE-KatjaNeural --file input.txt --write-media output.mp3
Even though the technology has advanced massively, it’s not yet perfect. It gets tricky, for example, when individual English words appear in a sentence. Still, dear Katja sounds much better than I would. And to be honest, the effort is also significantly less. The conversion takes just a few seconds and requires no post-processing.
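For anyone who would rather call this from a script than type it each time, here is a minimal sketch that wraps the same command in Python. The function names `build_edge_tts_command` and `convert` are made up for this example, and it assumes that the `edge-tts` command is installed and on the PATH.

```python
import subprocess


def build_edge_tts_command(input_path, output_path,
                           voice="de-DE-KatjaNeural", rate="+15%"):
    """Build the argument list for the edge-tts command shown above."""
    return [
        "edge-tts",
        f"--rate={rate}",          # speaking rate relative to the default
        "--voice", voice,          # voice name as used in the post
        "--file", input_path,      # plain-text file to read aloud
        "--write-media", output_path,  # resulting audio file
    ]


def convert(input_path, output_path):
    """Run edge-tts; raises CalledProcessError if the command fails."""
    subprocess.run(build_edge_tts_command(input_path, output_path), check=True)
```

Calling `convert("input.txt", "output.mp3")` should then produce the same file as the command line, provided there is an internet connection, since the voices are generated online.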
In my first test for the last post, I noticed even more advantages. First, I end up listening to my entire post once more, which helps me catch spelling and grammar mistakes I would otherwise have overlooked. Second, the structure of the post becomes clearer to me. One option I’ll use in extreme cases is to adjust the text for the speech conversion. “Gamer” just isn’t read out well, so I replaced it with “player” in the input text for the speech conversion.
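Such word swaps could also be automated before the text goes to the speech conversion. The following is only a sketch of the idea, not what I actually run: the function `prepare_for_speech` and the replacement table are hypothetical, and it only replaces whole words.

```python
import re


def prepare_for_speech(text, replacements):
    """Replace whole words that the voice pronounces badly."""
    for word, spoken in replacements.items():
        # \b limits the match to whole words; the lambda keeps any
        # backslashes in the replacement from being read as escapes.
        pattern = rf"\b{re.escape(word)}\b"
        text = re.sub(pattern, lambda _match, s=spoken: s, text)
    return text


# Hypothetical replacement table, based on the one example from this post.
SPOKEN_FORMS = {"Gamer": "player"}
```

The result of `prepare_for_speech(post_text, SPOKEN_FORMS)` would be saved as the input file for the conversion, while the published post keeps the original wording.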
From now on, you will find an audio player at the beginning of each post. As an alternative, I’ll also upload the posts to YouTube, albeit without any real video. For the first one, I just had the post scroll by. I probably won’t add audio to old posts retroactively, with the possible exception of very popular ones.