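When exporting to Markdown with ox-hugo, I ran into a line-break problem similar to the one I had when publishing posts with org2blog: the line breaks in the org file are not removed automatically. The difference is that line breaks between Chinese characters are already removed, but line breaks between English words, or in mixed Chinese/English text, are not. In the final HTML page these get converted to <br />, so they have to be dealt with.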

As with org-html-paragraph, I added an advice to org-hugo-paragraph as well. The handling is exactly the same as for org-html-paragraph, so the code can be reused as is.
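For readers unfamiliar with `:filter-args` advice, here is a minimal sketch of the mechanism using plain `advice-add`. The names `my/demo-paragraph` and `my/demo-join-lines-a` are hypothetical stand-ins invented for this illustration, not part of org or ox-hugo; the toy function only mimics the three-argument shape of a paragraph transcoder.

```elisp
;; Toy stand-in for an exporter's paragraph transcoder (hypothetical name).
(defun my/demo-paragraph (_paragraph contents _info)
  "Return CONTENTS wrapped in <p> tags."
  (format "<p>%s</p>" contents))

;; A :filter-args advice receives the whole argument list as one value
;; and must return the (possibly modified) argument list.
(defun my/demo-join-lines-a (args)
  "Replace newlines in the second element of ARGS with spaces."
  (list (nth 0 args)
        (replace-regexp-in-string "\n" " " (nth 1 args))
        (nth 2 args)))

(advice-add 'my/demo-paragraph :filter-args #'my/demo-join-lines-a)

(my/demo-paragraph nil "foo\nbar" nil) ; => "<p>foo bar</p>"
```

Because the advice only rewrites the argument list, one filter function can be attached to several transcoders that share the same signature, which is what makes the reuse possible. Note that `defadvice!`, used in the full code, is a Doom Emacs macro; outside Doom the same effect should be achievable with one `advice-add` call per function.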

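I also upgraded the original code so that it handles line breaks in mixed Chinese/English text and between English words: a line break between two multibyte (e.g. Chinese) characters is removed outright, and any other line break is replaced with a single space.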

The complete updated code:

(defadvice! +chinese--org-html-paragraph-a (args)
  "Join consecutive Chinese lines into a single long line without
unwanted space when exporting org-mode to html."
  :filter-args #'org-html-paragraph
  (++chinese--org-paragraph-a args))

(defadvice! +chinese--org-hugo-paragraph-a (args)
  "Join consecutive Chinese lines into a single long line without
unwanted space when exporting org-mode to hugo markdown."
  :filter-args #'org-hugo-paragraph
  (++chinese--org-paragraph-a args))

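(defun ++chinese--org-paragraph-a (args)
  "Return ARGS with the line breaks in the paragraph contents joined.
ARGS is the argument list (PARAGRAPH CONTENTS INFO) of a paragraph
transcoder such as `org-html-paragraph' or `org-hugo-paragraph'."
  (cl-destructuring-bind (paragraph content info) args
    (let* (;; Strip any literal <br /> tags.
           (origin-contents
            (replace-regexp-in-string
             "<[Bb][Rr][\t ]*/>"
             ""
             content))
           ;; Between two multibyte (e.g. Chinese) characters, drop the
           ;; line break together with the whitespace around it.
           (origin-contents
            (replace-regexp-in-string
             "\\([[:multibyte:]]\\)[\t ]*\n[\t ]*\\([[:multibyte:]]\\)"
             "\\1\\2"
             origin-contents))
           ;; Every remaining line break, including one at a
           ;; Chinese/English boundary, becomes a single space.
           (fixed-contents
            (replace-regexp-in-string
             "\\([^\t ]\\)[\t ]*\n[\t ]*\\([^\t ]\\)"
             "\\1 \\2"
             origin-contents)))
      (list paragraph fixed-contents info))))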
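
To sanity-check the substitutions without running an export, the same three regexps can be applied to a plain string. This is only a sketch: `my/join-paragraph-lines` is a hypothetical helper written for this check, not part of the configuration above.

```elisp
(defun my/join-paragraph-lines (text)
  "Apply the three substitutions from the advice to TEXT, a plain string."
  (let* (;; 1. Strip literal <br /> tags.
         (s (replace-regexp-in-string "<[Bb][Rr][\t ]*/>" "" text))
         ;; 2. Multibyte on both sides: remove the line break.
         (s (replace-regexp-in-string
             "\\([[:multibyte:]]\\)[\t ]*\n[\t ]*\\([[:multibyte:]]\\)"
             "\\1\\2" s)))
    ;; 3. Any other line break: replace with one space.
    (replace-regexp-in-string
     "\\([^\t ]\\)[\t ]*\n[\t ]*\\([^\t ]\\)" "\\1 \\2" s)))

(my/join-paragraph-lines "你好\n世界")      ; => "你好世界"
(my/join-paragraph-lines "hello\nworld")    ; => "hello world"
(my/join-paragraph-lines "中文\nEnglish")   ; => "中文 English"
```

The last call shows that a line break at a Chinese/English boundary ends up as a space rather than being removed, because only one side of it is multibyte.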