The pros and cons of markdown (podcast, part 2)
In episode 98 of The Content Strategy Experts podcast, Sarah O’Keefe and Dr. Carlos Evia of Virginia Tech continue their discussion about the pros and cons of markdown.
“If you want to make a website and you need to write the text in a fast way that does not involve adding a lot of the brackets that are in HTML syntax, I think that’s the main use for markdown.”
–Dr. Carlos Evia
Related links:
- The pros and cons of markdown (podcast, part 1)
- Does markdown fit into your content strategy?
- Lightweight DITA podcast: part 1 with guests Carlos Evia and Michael Priestley
- Lightweight DITA podcast: part 2 with guests Carlos Evia and Michael Priestley
Transcript:
Sarah O’Keefe: Welcome to The Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize and distribute content in an efficient way. My name is Sarah O’Keefe and I’m your host today. In this episode, we continue our discussion about the pros and cons of markdown with Dr. Carlos Evia. This is part two of a two-part podcast. So when we talk about markdown, I think that most of the people listening to this podcast are probably more familiar with DITA. What is the use case for markdown, the clearest possible place where you say, “oh, this is a case where you definitely want to use markdown”? What are those factors?
Dr. Carlos Evia: You need to make content that is going to be published mainly to the web. I say mainly because, in the beginning, processing markdown syntax meant “let’s process with this tiny, tiny tool that will only convert to HTML or XHTML.” Now there are many other tools that can process and transform markdown to other things, to create the multichannel publishing that we also do in DITA.
CE: So, I think the main use case is if you need to have something that is going to be published or presented in a website, and you do not want to write HTML. It’s a shorthand. Back when we were in junior high, the cool thing was that you would take a course in typewriting so you could be working with computers. This is back in the Fred Flintstone days.
CE: But before you could touch the keyboard, you had to take a course on shorthand, and shorthand had several syntaxes. I think there are two amazing texts for shorthand. You had to learn how to do it with a pencil, actually a special pencil. And there were different notations that you would use. And then, once you had mastered those things and could take dictation super fast, you could go and transcribe it using the keyboard.
CE: So I think that’s kind of the equivalent of markdown. If you want to make a website and you need to write the text for your website in a fast way that does not involve you adding a lot of the brackets that are in HTML syntax, I think that’s the main use for markdown. Write it following this very simple text-based syntax.
CE: And then there will be a tool that will transform it, mainly to a website. But like I said, there are other options, and these are things that I use all the time. I use Pandoc, for example, and Pandoc is a tool that can transform a markdown file to a ton of things, including XML. So I can send my markdown file or files to HTML, to EPUB, to DocBook. There’s no native transform for DITA. I considered building one a few years ago, but Pandoc is developed in Haskell and I don’t understand Haskell as a programming language. So I gave up. But that’s the main use. The main use case is you want quick HTML. So go ahead and use markdown.
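As a rough illustration of the Pandoc conversions mentioned here, the sketch below assembles the command lines for turning one markdown file into HTML, EPUB, and DocBook. The file name `guide.md` and the helper `pandoc_command` are hypothetical and not from the conversation; `-f`, `-t`, and `-o` are Pandoc's options for input format, output format, and output file. The code only builds the commands and does not run them, so it assumes nothing about Pandoc being installed.

```python
from pathlib import Path

# Output formats mentioned in the conversation, mapped to a file extension.
# The keys are Pandoc writer names; the extensions are only a convention.
TARGETS = {"html": ".html", "epub": ".epub", "docbook": ".xml"}


def pandoc_command(source: str, target: str) -> list[str]:
    """Build (but do not run) a pandoc command for one output format."""
    if target not in TARGETS:
        raise ValueError(f"unsupported target: {target}")
    output = Path(source).with_suffix(TARGETS[target])
    return ["pandoc", source, "-f", "markdown", "-t", target, "-o", str(output)]


if __name__ == "__main__":
    # guide.md is a hypothetical file name used only for illustration.
    for fmt in TARGETS:
        print(" ".join(pandoc_command("guide.md", fmt)))
```

Each resulting list could be passed to `subprocess.run` on a machine where Pandoc is available.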
SO: What are some of the limitations that you found in markdown, the factors that caused you to look at it and say, this is not going to be a good fit for what I’m doing?
CE: The biggest problem is that it’s not really structured text. The structure provided by markdown is mainly at the block level. You can have a heading, you can have a subheading, you can have paragraphs, and then you can have a couple of inlines. But there is no real structure like we have in XML, and particularly in DITA, that allows you to put attributes on a sentence or even a word to tell it how to behave, how to look, and whether to be filtered out or in when you create documentation aimed at specific users. So that’s one of the biggest challenges when you’re working with markdown. If you only have one version of the content to publish, and there are no filters involved, and you don’t have to serve the needs of different audiences or different platforms, okay, use markdown.
CE: And people here are going to be like, “wait a minute! Well, there’s this flavor of markdown where you add a YAML header and you put in a bunch of little pairings of variables, and then you start spicing it up with more of those squiggly brackets and semi-colons.” Yes, I agree. But that is not markdown. That becomes something new, a new spaghetti kind of thing. John Gruber, one of the guys who invented markdown, actually doesn’t like it when you start adding things to your markdown, because it breaks with that idea of making it simple.
CE: So that’s the main limitation that I see from my perspective. When I started talking about it in my classes, when I started using it for my specific publication needs, the structure happens at the block level.
CE: Beyond that, it kind of becomes the worst enemy of the content specialist, which is that blob, the blob of text that we all fear that we see when we’re working in a word processor and there’s no real structure behind it. That can easily happen in markdown, that you’re just writing paragraphs and there’s really no structure to it. And again, not every document, not every website in the world needs serious intense structure. But if you’re writing for an audience of human beings and a potential audience of machines that are going to be taking your content to do some machine learning or artificial intelligence, or sending it to voice assistants and to the dashboard of your fancy Tesla, markdown is going to face those limitations: your text is mainly a blob with a header and a subheader, and it’s not really structured in a way that enables behind-the-scenes computation. And again, you can claim, “oh, but my flavor, if you add these twenty-five hundred other characters,” yeah, but that is not simple markdown.
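To make the point about attributes and filtering concrete, here is a minimal sketch of what element-level metadata allows. It assumes a DITA-like fragment in which steps carry an `audience` attribute with a single value; the sample content and the helper `filter_for` are invented for illustration. Real DITA filtering is normally driven by a DITAVAL file and handles multiple attribute values, so this is a simplification and not how a DITA processor works internally.

```python
import xml.etree.ElementTree as ET

# A DITA-like fragment: the audience attribute marks who each step is for.
# The content is made up for this example.
TOPIC = """
<steps>
  <step>Open the settings page.</step>
  <step audience="admin">Reset every user's password.</step>
  <step audience="novice">Ask your administrator for help.</step>
</steps>
"""


def filter_for(root: ET.Element, audience: str) -> ET.Element:
    """Drop elements tagged for a different audience; untagged ones stay."""
    for parent in list(root.iter()):
        for child in list(parent):
            tagged = child.get("audience")
            if tagged is not None and tagged != audience:
                parent.remove(child)
    return root


if __name__ == "__main__":
    filtered = filter_for(ET.fromstring(TOPIC), "admin")
    for step in filtered:
        print(step.text)
```

Plain markdown has no comparable place to hang the `audience` value, which is the limitation described in this answer.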
SO: So I know that you’ve been involved with Lightweight DITA and are leading that effort along with a couple of other people. Does that have potential to kind of unify markdown and DITA, unify these two use cases in some way?
CE: Yep. That’s precisely one of the ideas behind Lightweight DITA. My colleague Michael Priestley came up with the idea of Lightweight DITA, after Don Day and Michael Priestley came up with the idea of DITA, by the way. They came up with the idea of Lightweight DITA in 2015, and they wanted to have a simple way to represent the most frequently used elements of DITA in a smaller set of XML tags. As they started working on it, they realized, wait a minute, what if we also create a way to do this subset of elements in HTML? And as markdown became more popular as a shorthand approach for creating HTML, it was a logical next step: why don’t we have a Lightweight DITA flavor that is in markdown, where you write in simple text and it becomes those HTML elements?
CE: A few years ago, somebody sent me a paper, an article, that said DITA is a universal publishing solution. And I want to say Lightweight DITA is not universal. Lightweight DITA is ‘pluriversal’ because it allows different languages to be merged into DITA-like workflows. And at the end, when you publish, when you create a document and you give it to your users, they will not know what came from markdown, what came from HTML and what came from XML. When they get their document or their website or whatever it is that you’re going to transform to, it all looks like it came from the same source. And that is one of the biggest principles behind the design and development of Lightweight DITA. We want to take topics that follow some basic rules, and you can create them in XML, in HTML, or in markdown.
CE: And they all live together in a map and you can transform them. Then you develop documents that, when you give them to your users, they won’t know what came from where. So it’s a ‘pluriversal’ approach instead of the universal “use this one way.” No, we want to be open to possibilities. So that’s what we do with markdown, and by design, we have really tried to avoid creating our own complicated flavor of markdown in Lightweight DITA that has a ton of bizarre characters. We don’t want to over-spice our markdown with squigglies and brackets and stuff.
CE: So one of the principles is, okay, do you want to bring a markdown file to a Lightweight DITA party? Bring the most basic one, a couple of hashtags for a Heading 1 and a Heading 2 and two paragraphs. Bring it on, put it in a DITA map or a Lightweight DITA map, and it will work. It will publish. It will transform. And that’s something that you can do now. To be honest with you, it’s something that you probably could have been doing, and some people have been doing it, since 2016, when the DITA Open Toolkit started having a version of Lightweight DITA. So I know people that do that on a daily basis. They mix their DITA with markdown topics and nobody’s complaining.
SO: Yeah, we actually do have a couple of clients that are doing some version of that. And it’s been interesting trying to figure out how to bring those things together and unify them. I think in their case, it very often comes down to this: markdown is more convenient for the people who are creating it, and it’s very often stashed in something like Git or GitHub. Then they have this full-on narrative authoring environment, which is the DITA content, but they need to unify the two, as you said, for delivery purposes. So they slurp the markdown into DITA and then out through the DITA processing pipelines.
CE: And you know what, when I teach my students how to use DITA with markdown, everything lives in the same GitHub repository. And the most beautiful thing that we have seen is that you can have one GitHub repo that has some topics in DITA and some topics in markdown. And then inside that same repo, you have a DITA map, or an XDITA map in the case of Lightweight DITA, that brings them all together. From that, you can build and publish whatever you want. But in another repository that might be yours or might be your classmates’, you can have a submodule that borrows that repo. And that other repo can be a headless CMS source that builds a website with React or whatever it is. And it’s using the same markdown files that the first repository is using to build something in a Lightweight DITA workflow.
CE: So see, they can work together, and because we avoid over-spicing the markdown with squigglies and whatnot, the files still work in both systems. There you have it. Have a repo that builds something in a Lightweight DITA workflow and embed that as a submodule in another repository that is using React or a static site generator. You’re using the same markdown text, and you can have them both living together and nobody complains.
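The idea of dropping a plain markdown file into a map alongside DITA topics can be sketched as follows. This is a minimal, hedged example: the file names and the helper `build_map` are hypothetical, and it assumes markdown topics are flagged with `format="markdown"` on the topic reference, which is the value the DITA Open Toolkit documentation describes for Markdown input. The DOCTYPE declaration a real map would carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_map(title: str, topics: list[str]) -> str:
    """Return a small DITA map that references each topic file."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    for href in topics:
        ref = ET.SubElement(root, "topicref", href=href)
        # Assumption: markdown topics are marked with format="markdown".
        if href.endswith(".md"):
            ref.set("format", "markdown")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # intro.dita and setup.md are hypothetical file names.
    print(build_map("User guide", ["intro.dita", "setup.md"]))
```

The DITA topic and the markdown topic sit side by side in the same map, which is what lets the published output hide where each piece came from.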
SO: Well, I think I’ll leave it there because nobody complains. Seems like a good closing for this podcast. Carlos, thank you so much for coming in on this. I’m going to leave some resources, both on markdown, but also on Lightweight DITA in the show notes and I think your background information is in there and we’ll have a couple of other things. So with that, thanks again. And thank you for listening to The Content Strategy Experts Podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
John Mogilewsky
Fascinating conversation. I’ve been following lwDITA closely and it does seem to be evolving along these lines.
It’s worth noting that there are other LMLs (lightweight markup languages) in broad use. Some are more suited to production of formal publications. To use one example, AsciiDoc (“shorthand” DocBook) natively supports 1) transclusion (include directive), 2) partial transclusion (include directive to tagged regions), and 3) conditional content (conditional directives ifdef, ifndef, ifeval). It does so without requiring extensions or plugins, as you might have to do on a Markdown publishing platform.
The downside here, as you’ve both mentioned, is complexity.
I would posit that the complexity stems not only from the markup side but also from the business side. Single sourcing a procedure – to use one example – is only efficient if the variants share 75% of existing content. Once the variants deviate by large amounts, the complexity of maintaining conditional content rapidly overshadows the complexity of simply forking the procedure into different files or utilizing git methods (fork/branch/tag) to handle variance. The catch-22 is that technical writers often don’t know how similar a variant is before the writing starts, so the writer stands a good chance of going down the wrong path.
This is an example of a complexity problem from the business process perspective, as this problem is largely tool agnostic. The takeaway is that conditional content – regardless of tools/specification used – requires informed coordination with organizational change management / configuration management groups. The CM group can tell the writer in one word whether or not it makes sense to re-use a procedure for a new product variant.
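One hedged way to put a number on the 75% rule of thumb from this comment is to measure how many steps of an existing procedure survive, in order, in a variant. The sketch below uses Python's standard `difflib`; the threshold constant, the function names, and the sample steps are assumptions made for illustration, and the measure can only be applied once a draft of the variant exists, which is exactly the catch-22 the comment describes.

```python
from difflib import SequenceMatcher

# The 75% figure comes from the comment; treating it as a hard cutoff is an assumption.
REUSE_THRESHOLD = 0.75


def shared_fraction(base: list[str], variant: list[str]) -> float:
    """Fraction of the base procedure's steps that also appear, in order, in the variant."""
    if not base:
        return 0.0
    matcher = SequenceMatcher(None, base, variant, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(base)


def worth_single_sourcing(base: list[str], variant: list[str]) -> bool:
    """Apply the rule of thumb: reuse only if enough of the base is shared."""
    return shared_fraction(base, variant) >= REUSE_THRESHOLD


if __name__ == "__main__":
    # Hypothetical procedures for two product variants.
    base = ["Unplug the device", "Remove the cover", "Replace the filter", "Close the cover"]
    variant = ["Unplug the device", "Remove the cover", "Replace the cartridge", "Close the cover"]
    print(shared_fraction(base, variant), worth_single_sourcing(base, variant))
```

A result below the threshold would suggest forking the procedure rather than maintaining conditional content, though the decision still belongs with the configuration management group.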